While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts.
Word and Context Vectors
The embeddings produced from English Wikipedia are available for download:
- Dependency-Based [words] [contexts]
- Bag of Words (k = 2) [words] [contexts]
- Bag of Words (k = 5) [words] [contexts]
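To illustrate how the bag-of-words variants above might differ, here is a minimal sketch of linear-context extraction. It assumes that k denotes the number of tokens taken on each side of the target word, which the list does not state explicitly. The function name `window_contexts` and the example sentence are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def window_contexts(tokens: List[str], k: int) -> List[Tuple[str, str]]:
    """Return (word, context) pairs using up to k tokens on each side.

    Assumption: k is a symmetric window size. This is an illustrative
    helper, not the code used to produce the released embeddings.
    """
    pairs = []
    for i, word in enumerate(tokens):
        lo = max(0, i - k)                # clip the window at the sentence start
        hi = min(len(tokens), i + k + 1)  # clip the window at the sentence end
        for j in range(lo, hi):
            if j != i:                    # a word is not its own context
                pairs.append((word, tokens[j]))
    return pairs


sentence = "australian scientist discovers star with telescope".split()
pairs_k2 = window_contexts(sentence, k=2)
contexts_of_discovers = [c for w, c in pairs_k2 if w == "discovers"]
print(contexts_of_discovers)
```

Under this reading, a larger k simply yields more (word, context) pairs per target word. A dependency-based variant would presumably generate the pairs from a syntactic parse rather than from a fixed window, while the training step that consumes the pairs stays the same.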
We also provide a demo for comparing the different types of embeddings and observing their top contexts.