Probabilistic FastText for Multi-Sense Word Embeddings
- Others:
- Gurevych, Iryna
- Miyao, Yusuke
Abstract
We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of n-grams. This representation allows the model to share statistical strength across sub-word structures (e.g. Latin roots), producing accurate representations of rare, misspelt, or even unseen words. Moreover, each component of the mixture can capture a different word sense. Probabilistic FastText outperforms both FastText, which has no probabilistic model, and dictionary-level probabilistic embeddings, which do not incorporate subword structures, on several word-similarity benchmarks, including English RareWord and foreign language datasets. We also achieve state-of-the-art performance on benchmarks that measure the ability to discern different meanings. Thus, the proposed model is the first to achieve multi-sense representations while having enriched semantics on rare words.
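The n-gram construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `NgramTable` class, the hashing scheme, the vector dimension, the n-gram length range, and the random (untrained) vectors are all assumptions made for illustration. It shows only how the mean of a single mixture component could be formed as a sum of character n-gram vectors, so that a misspelt or unseen word still receives a mean through the n-grams it shares with known words.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=6):
    """Return character n-grams of `word`, using '<' and '>' as boundary markers."""
    marked = f"<{word}>"
    return [marked[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(marked) - n + 1)]


class NgramTable:
    """Hypothetical hashed table of n-gram vectors (random, untrained)."""

    def __init__(self, dim=4, buckets=1000, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        self.vectors = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                        for _ in range(buckets)]

    def lookup(self, ngram):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return self.vectors[zlib.crc32(ngram.encode("utf-8")) % self.buckets]


def component_mean(word, table):
    """Mean of one mixture component: the sum of the word's n-gram vectors."""
    mean = [0.0] * table.dim
    for ngram in char_ngrams(word):
        vec = table.lookup(ngram)
        mean = [m + v for m, v in zip(mean, vec)]
    return mean


table = NgramTable()
mu = component_mean("beautiful", table)
# A misspelt word shares most of its n-grams with the correct spelling,
# so its mean is built largely from the same vectors.
mu_misspelt = component_mean("beautifull", table)
shared = set(char_ngrams("beautiful")) & set(char_ngrams("beautifull"))
```

In a full mixture model, each word would carry several such component means together with mixture weights and covariances; those parts are omitted here because the abstract does not specify them.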
Additional Information
© 2018 The Association for Computational Linguistics. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 License.
Attached Files
Published - P18-1001.pdf
Accepted Version - 1806.02901.pdf
Files
MD5 checksum | Size |
---|---|
md5:984c3d6d8ed790944d6ce79d235bee91 | 527.8 kB |
md5:9d4b6c6aec20c3decc2f8d68c29e756e | 438.0 kB |
Additional details
- Eprint ID
- 94177
- Resolver ID
- CaltechAUTHORS:20190327-085800530
- Created
- 2019-03-28 (from EPrint's datestamp field)
- Updated
- 2023-06-02 (from EPrint's last_modified field)