Dependency-Based Word Embeddings.
Omer Levy and Yoav Goldberg. Short paper in ACL 2014. [pdf] [slides]
While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts.
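To make the contrast between linear and arbitrary contexts concrete, here is a minimal sketch of how (word, context) training pairs might be extracted in each setting. The function names, the hand-written parse, and the `word/label` context string format are illustrative assumptions and are not taken from the released code; the `-1` suffix marks the inverse relation, seen from the modifier's side, and the preposition is assumed to be collapsed into the arc label.

```python
def window_contexts(tokens, k):
    """Linear contexts: pair each word with the k tokens on either side of it."""
    pairs = []
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
        pairs.extend((word, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def dependency_contexts(tokens, arcs):
    """Dependency contexts from arcs given as (head_index, label, modifier_index).

    Each arc yields two pairs: the head sees "modifier/label", and the
    modifier sees "head/label-1" (the inverse relation).
    """
    pairs = []
    for head, label, mod in arcs:
        pairs.append((tokens[head], f"{tokens[mod]}/{label}"))
        pairs.append((tokens[mod], f"{tokens[head]}/{label}-1"))
    return pairs


tokens = ["australian", "scientist", "discovers", "star", "with", "telescope"]

# Hand-written parse (an assumption for illustration); "with" is collapsed
# into the prep_with label rather than appearing as its own node.
arcs = [
    (1, "amod", 0),       # scientist -> australian
    (2, "nsubj", 1),      # discovers -> scientist
    (2, "dobj", 3),       # discovers -> star
    (2, "prep_with", 5),  # discovers -> telescope
]

linear_pairs = window_contexts(tokens, k=2)
dep_pairs = dependency_contexts(tokens, arcs)
```

With a window of 2, "discovers" never sees "telescope", while the dependency contexts link the two directly through `prep_with`. Either list of pairs could then be fed to the same skip-gram training procedure, since the model only needs (word, context) pairs and does not care how they were produced.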
The code used in this paper is publicly available on BitBucket and for direct download.
Word and Context Vectors
The embeddings produced from English Wikipedia are also available for download:
- Dependency-Based [words] [contexts]
- Bag of Words (window size k = 2) [words] [contexts]
- Bag of Words (window size k = 5) [words] [contexts]
We also provide a demo for comparing the different types of embeddings and observing their top contexts.