- Slides
- Our lecture, seminar
- The only relevant video-lecture we could find - video
- We will hopefully record our lecture in English soon!
- Self-critical sequence training original article
- An awesome post explaining attention and long-term memory models.
- BLEU and CIDEr articles.
- Image captioning
- Other articles on reinforcement learning for natural language:
- task-oriented conversation system
- generating dialogues
- sequential adversarial networks (a.k.a. SeqGAN)
- A broad overview of machine translation (touching on RL, including RL failures) - article
- How not to evaluate conversation models - article
- Overview of other non-games applications ("that article again") - https://arxiv.org/abs/1701.07274
As usual, go to practice_theano.ipynb or practice_tf.ipynb and follow instructions from there.
Other frameworks: your task remains the same as in the main track:
- Implement or borrow a seq2seq model for the same translation task
- Neat tensorflow repo
- Important: this repo uses a simplified phoneme dictionary - make sure you change the preprocessing phase so you can meaningfully compare results.
- Implement self-critical sequence training (= basic policy gradient with a special baseline; see the notebook)
- Beat the baseline (main notebook: step6)
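For intuition, the self-critical baseline boils down to weighting each sampled sequence's log-probability by how much its reward beats the greedy (test-time) rollout's reward. A minimal sketch in plain Python - the function names here are illustrative, not the notebook's actual interface:

```python
def scst_advantages(sample_rewards, greedy_rewards):
    """Self-critical advantage per sequence: r(sample) - r(greedy).

    Samples that beat the greedy rollout get positive weight
    (their log-probabilities are pushed up); worse ones get negative.
    """
    return [rs - rg for rs, rg in zip(sample_rewards, greedy_rewards)]


def scst_surrogate_loss(logp_samples, sample_rewards, greedy_rewards):
    """Surrogate objective: mean of -(r_sample - r_greedy) * log p(sample).

    Minimizing this with your framework's autodiff implements the
    policy gradient with the self-critical baseline.
    """
    n = len(logp_samples)
    return -sum((rs - rg) * lp
                for lp, rs, rg in zip(logp_samples,
                                      sample_rewards,
                                      greedy_rewards)) / n
```

Note that the greedy reward enters only as a constant baseline: no gradient flows through it, which is what makes the estimator unbiased.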
Even if you decide to use custom frameworks, it is highly recommended that you reuse evaluation code (e.g. min Levenshtein) from the main notebook to avoid confusion.
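If you do end up reimplementing the evaluation rather than reusing the notebook's code, the metric in question is plain edit distance. A minimal sketch (the notebook's own implementation may differ in details):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance: the minimum number of
    single-character insertions, deletions, and substitutions turning a into b.
    Uses a rolling row, so memory is O(len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]
```

Whatever implementation you use, make sure both tracks compute the distance over the same token units (e.g. phonemes vs. characters), or the comparison is meaningless.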