Inferring multilingual domain-specific word embeddings from large document corpora