Word embeddings are vector-based meaning representations relying on the [[distributional hypothesis]].
In **sparse** embeddings, words (or [[tokenization|tokens]]) are represented as a function of the counts of the words they co-occur with, often at the document level. Vector length is determined by the size of the collection: the number of documents in a term-document matrix, or the vocabulary size in a term-term co-occurrence matrix.
A **term-document matrix** records the count of each term in each document. A term's vector is then its row of counts across the documents in the corpus.
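A minimal sketch of building a term-document matrix with scikit-learn's `CountVectorizer`; the toy documents are made up purely for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus: each string is one "document" (made up for illustration).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)        # sparse document-term matrix (docs x vocab)

# Transpose to the term-document view: one row of counts per term.
term_doc = X.T.toarray()
for term, row in zip(vectorizer.get_feature_names_out(), term_doc):
    print(f"{term:>6}: {row}")            # e.g. 'sat': [1 1 0]
```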
In **dense** representations, the dimensionality is reduced. Representations are trained from larger corpora using self-supervised learning based on co-occurrence statistics. Vector length is typically 50 to 1,000 dimensions, with 300 a common choice.
Word embeddings support (see the sketch after this list):
- [[similarity measure]]
- composition of meaning
- relational and analogical reasoning
- visualization and clustering methods
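A minimal numpy sketch of the similarity and analogy operations above; the 3-d vectors and their values are made up purely for illustration (real embeddings have hundreds of dimensions, and real analogy search excludes the query words from the candidates).

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: dot product of the normalized vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Tiny made-up 3-d vectors, just to show the operations.
vec = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.6, 0.6]),
    "man":   np.array([0.9, 0.1, 0.0]),
    "woman": np.array([0.8, 0.1, 0.5]),
}

# Similarity measure
print(cosine(vec["king"], vec["queen"]))

# Analogical reasoning via vector offsets: king - man + woman ~ queen
target = vec["king"] - vec["man"] + vec["woman"]
best = max(vec, key=lambda w: cosine(vec[w], target))
print(best)    # 'queen' with these toy values
```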
## latent semantic analysis
Latent semantic analysis (LSA) applies truncated [[singular value decomposition|SVD]] to a term-document matrix, keeping only the top singular dimensions as a low-rank "latent" representation.
It was originally developed for query/document similarity in information retrieval.
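A sketch of LSA-style dimensionality reduction using scikit-learn's `TruncatedSVD` on a TF-IDF document-term matrix; the corpus and the number of components are illustrative choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stock markets fell sharply today",
    "investors sold shares as markets dropped",
]

X = TfidfVectorizer().fit_transform(docs)    # documents x terms

# Truncated SVD keeps only the top k singular vectors ("latent" dimensions).
svd = TruncatedSVD(n_components=2, random_state=0)
doc_topics = svd.fit_transform(X)            # documents x k

print(doc_topics)    # low-dimensional document representations
```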
## word2vec
word2vec learns dense embeddings by training a shallow classifier on a self-supervised prediction task (skip-gram or CBOW), typically with negative sampling; the learned weights are the word vectors.
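A minimal training sketch with gensim's `Word2Vec`; the corpus is far too small for real training, and the parameters and gensim 4.x argument names are assumptions for illustration.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
    ["dogs", "and", "cats", "are", "pets"],
]

# sg=1 selects the skip-gram objective; sg=0 would use CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["cat"][:5])                  # first 5 dimensions of the vector
print(model.wv.most_similar("cat", topn=3))
```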
## GloVe
GloVe, short for Global Vectors, is a method for [[word embeddings]] based on ratios of word co-occurrence probabilities.
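Pretrained GloVe vectors can be queried through gensim's downloader; the model name `glove-wiki-gigaword-100` is assumed to be one of the bundled datasets (it downloads the vectors on first use).

```python
import gensim.downloader as api

# Load 100-d GloVe vectors trained on Wikipedia + Gigaword (downloaded on first use).
glove = api.load("glove-wiki-gigaword-100")   # returns a KeyedVectors object

print(glove.most_similar("frog", topn=5))
print(glove.similarity("ice", "steam"))       # GloVe's motivating example pair
```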
## FastText
FastText is a word embedding method that extends word2vec with subword information: each word vector is built from the vectors of its character n-grams, so embeddings can be produced even for unseen (out-of-vocabulary) words.
Implementations of FastText are available at [FastText](https://fasttext.cc/) and in the [Gensim](https://radimrehurek.com/gensim/auto_examples/tutorials/run_fasttext.html#sphx-glr-auto-examples-tutorials-run-fasttext-py) library.
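A sketch using gensim's `FastText` class to illustrate the subword idea: because vectors are composed from character n-grams, a word missing from the toy training data still gets a vector. The corpus and parameters are illustrative, and gensim 4.x argument names are assumed.

```python
from gensim.models import FastText

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
]

# min_n/max_n control the character n-gram sizes used to build word vectors.
model = FastText(sentences, vector_size=50, window=2, min_count=1,
                 min_n=3, max_n=5, epochs=20)

print("catlike" in model.wv.key_to_index)   # False: never seen in training
print(model.wv["catlike"][:5])              # still gets a vector from its n-grams
```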
See also, for contextual embeddings that assign each token a vector depending on its sentence context:
- [[ELMo]]
- [[BERT]]