Neural Word Embeddings as Implicit Matrix Factorization. Omer Levy and Yoav Goldberg. NIPS 2014. [pdf] We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, […]

Proposition Knowledge Graphs. Gabriel Stanovsky, Omer Levy, and Ido Dagan. AHA! Workshop 2014. [pdf] This position paper proposes a novel representation for Information Discovery: Proposition Knowledge Graphs. These extend the Open IE paradigm by representing semantic inter-proposition relations in a traversable graph.

Focused Entailment Graphs for Open IE Propositions. Omer Levy, Ido Dagan, and Jacob Goldberger. CoNLL 2014. [pdf] [slides] Open IE methods extract structured propositions from text. However, these propositions are neither consolidated nor generalized, and querying them may lead to insufficient or redundant information. This work suggests an approach to organize open IE propositions using […]

Linguistic Regularities in Sparse and Explicit Word Representations. Omer Levy and Yoav Goldberg. CoNLL 2014. [pdf] [slides] This fascinating result raises a question: to what extent are the relational semantic properties a result of the embedding process? Experiments show that the RNN-based embeddings are superior to other dense representations, but how crucial is it for […]

Dependency-Based Word Embeddings. Omer Levy and Yoav Goldberg. Short paper in ACL 2014. [pdf] [slides] While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts. Code: The code used in […]

word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Word-Embedding Method. Yoav Goldberg and Omer Levy. arXiv 2014. [pdf] The word2vec software of Tomas Mikolov and colleagues has gained a lot of traction lately, and provides state-of-the-art word embeddings. The learning models behind the software are described in two research papers. We found the description of the […]

The Excitement Open Platform for Textual Inferences. Bernardo Magnini, Roberto Zanoli, Ido Dagan, Kathrin Eichler, Günter Neumann, Tae-Gil Noh, Sebastian Padó, Asher Stern, and Omer Levy. Demo paper in ACL 2014. [pdf] This paper presents the Excitement Open Platform (EOP), a generic architecture and a comprehensive implementation for textual inference in multiple languages. Code: The […]