Word embeddings

Practice & homework

The practice for this week takes place in notebooks. Just open them and follow the instructions there.

  • Seminar: ./seminar.ipynb
  • Homework: ./homework.ipynb

Unless explicitly said otherwise, all subsequent weeks follow the same pattern (a notebook with instructions).

If you have any difficulties with the notebooks, just open them in Colab.
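For a quick sanity check of this week's ideas outside the notebooks, here is a minimal word2vec sketch. It assumes gensim >= 4.0 is installed (`pip install gensim`); the toy corpus and hyperparameters are purely illustrative and are not the ones used in the homework.

```python
# A minimal word2vec sanity check (not part of the homework; assumes
# gensim >= 4.0, where the embedding size argument is `vector_size`).
from gensim.models import Word2Vec

# Illustrative toy corpus: a list of tokenized sentences.
toy_corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["a", "cat", "sat", "on", "the", "mat"],
    ["a", "dog", "sat", "on", "the", "mat"],
]

model = Word2Vec(
    sentences=toy_corpus,
    vector_size=32,   # embedding dimensionality
    window=2,         # context window size
    min_count=1,      # keep every token in this tiny corpus
    sg=1,             # skip-gram (sg=0 would be CBOW)
    negative=5,       # negative sampling with 5 noise words
    epochs=50,
    seed=0,
)

# Nearest neighbours in the learned vector space.
print(model.wv.most_similar("king", topn=3))
```

With real text you would pass a much larger tokenized corpus; nearest neighbours only become meaningful once the model has seen enough co-occurrences.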

More materials (optional)

  • On hierarchical & sampled softmax estimation for word2vec - page (the sampled objective is sketched just after this list)
  • GloVe project page
  • FastText project repo
  • Semantic change over time, observed through word embeddings - arxiv
  • Another cool link that you could have shared but hesitated to. Or did you?
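On the sampled softmax link above: instead of normalizing a softmax over the full vocabulary, skip-gram with negative sampling (Mikolov et al., 2013) maximizes, for each center word $c$ with vector $v_c$ and each observed context word $o$ with vector $u_o$:

$$\log \sigma\left(u_o^{\top} v_c\right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[\log \sigma\left(-u_{w_i}^{\top} v_c\right)\right]$$

where $\sigma$ is the logistic sigmoid and the $k$ negative words $w_i$ are drawn from a noise distribution $P_n$ (in the paper, the unigram distribution raised to the power 3/4). Each update then touches only $k + 1$ output vectors instead of the whole vocabulary.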

Related articles

Starting Point

  • Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al., 2013 [arxiv]

  • Efficient Estimation of Word Representations in Vector Space, Mikolov et al., 2013 [arxiv]

  • Distributed Representations of Sentences and Documents, Le et al., 2014 [arxiv]

  • GloVe: Global Vectors for Word Representation, Pennington et al., 2014 [article]

Multilingual Embeddings. Unsupervised MT.

  • Enriching Word Vectors with Subword Information, Bojanowski et al., 2016 [arxiv]

  • Exploiting Similarities between Languages for Machine Translation, Mikolov et al., 2013 [arxiv]

  • Improving Vector Space Word Representations Using Multilingual Correlation, Faruqui and Dyer, EACL 2014 [pdf]

  • Learning Principled Bilingual Mappings of Word Embeddings While Preserving Monolingual Invariance, Artetxe et al., EMNLP 2016 [pdf]

  • Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax, Smith et al., ICLR 2017 [arxiv]

  • Word Translation Without Parallel Data, Conneau et al., 2018 [arxiv]
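A recurring idea in the multilingual papers above (e.g. Artetxe et al., Smith et al.) is aligning two monolingual embedding spaces with an orthogonal linear map learned from a small seed dictionary. The optimal map has a closed form (orthogonal Procrustes). Below is a minimal numpy sketch; the function name, array shapes, and toy data are illustrative assumptions, not code from any of the papers.

```python
import numpy as np

def orthogonal_procrustes_map(X, Y):
    """Find orthogonal W minimizing ||X @ W - Y||_F.

    X: (n, d) source-language embeddings of seed dictionary pairs.
    Y: (n, d) target-language embeddings of the same pairs.
    Closed-form solution: W = U @ Vt, where U, _, Vt = svd(X.T @ Y).
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: Y is X rotated by a random orthogonal matrix plus noise.
rng = np.random.default_rng(0)
d, n = 8, 100
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # ground-truth rotation
X = rng.normal(size=(n, d))
Y = X @ Q + 0.01 * rng.normal(size=(n, d))

W = orthogonal_procrustes_map(X, Y)
print(np.allclose(W, Q, atol=0.05))  # W should recover Q approximately
```

Constraining the map to be orthogonal preserves dot products, and hence the monolingual nearest-neighbour structure of each space, which is the invariance argument made in the Artetxe et al. and Smith et al. papers listed above.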