A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments

A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments.
Omer Levy, Anders Søgaard, and Yoav Goldberg. EACL 2017. [pdf]

This paper draws both empirical and theoretical parallels between the embedding and alignment literature, and suggests that adding sources of information beyond the traditional signal of bilingual sentence-aligned corpora may substantially improve cross-lingual word embeddings.

Code & Data

The code and data used to generate our embeddings and run the benchmarks are available here.
