
Efficient Estimation of Word Representations in Vector Space

Authors: Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean (2013)

arXiv: 1301.3781

TLDR

This paper, the basis of the word2vec tool, proposes two efficient architectures, continuous bag-of-words (CBOW) and Skip-gram, for learning dense word vectors from very large text corpora, so that words used in similar contexts end up close together in vector space. Simple arithmetic on the learned vectors captures some semantic and syntactic regularities: vector("King") - vector("Man") + vector("Woman") lies closest to vector("Queen"). Word embeddings themselves predate this work, but its training efficiency and these analogy results helped make pretrained embeddings, and later learned embedding layers, a standard component of language models.
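The analogy arithmetic can be illustrated with a minimal sketch. The vectors below are hand-made toy values, not trained embeddings, and the names `TOY_VECTORS`, `cosine`, and `analogy` are hypothetical helpers introduced only for this example. Following the usual evaluation convention, the three query words are excluded from the candidates.

```python
import math

# Hand-made toy vectors (NOT trained embeddings). The axes loosely stand for
# [royalty, gender, food] and were chosen only to make the arithmetic visible.
TOY_VECTORS = {
    "king":  [0.9,  0.8, 0.1],
    "queen": [0.9, -0.8, 0.1],
    "man":   [0.1,  0.9, 0.0],
    "woman": [0.1, -0.9, 0.0],
    "apple": [0.0,  0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the query words themselves from the candidate set.
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman"))  # queen
```

With real embeddings the nearest neighbour is found the same way, only over a vocabulary of many thousands of words and vectors with hundreds of dimensions, and the match is approximate rather than guaranteed.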

Appears in these articles