- The paper presents a domain-agnostic approach to conversational modelling based on the sequence-to-sequence (seq2seq) learning framework.
- [Link to the paper](https://arxiv.org/abs/1506.05869)
- Neural Conversational Model (NCM)
- A Recurrent Neural Network (RNN, an LSTM in the paper) reads the input sentence one token at a time and predicts the output sequence one token at a time.
- Learns by backpropagation.
- The model is trained to maximise the log-probability of the correct output sequence given its context, i.e., to minimise the cross-entropy loss over the next-token predictions.
- Greedy inference: the predicted output token is fed back as input to predict the next output token, as sketched below.
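A minimal sketch of the setup described above, written in PyTorch with illustrative hyperparameters. The class and function names (`Seq2Seq`, `training_step`) and the special-token ids are assumptions made for illustration, not the paper's code; the paper's models are larger LSTM networks. The sketch shows the encoder-decoder structure, the cross-entropy (log-likelihood) training objective, and greedy decoding.

```python
# Illustrative seq2seq encoder-decoder; sizes are placeholders, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

PAD, BOS, EOS = 0, 1, 2  # assumed special-token ids


class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=PAD)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt_in):
        # The encoder reads the input utterance one token at a time; its final
        # state conditions the decoder, which predicts the reply token by token.
        _, state = self.encoder(self.embed(src))
        dec_out, _ = self.decoder(self.embed(tgt_in), state)
        return self.out(dec_out)  # per-step logits over the vocabulary

    @torch.no_grad()
    def greedy_decode(self, src, max_len=30):
        # Greedy inference: the argmax token predicted at step t is fed back
        # as the input at step t+1 until <eos> is produced.
        _, state = self.encoder(self.embed(src))
        token = torch.full((src.size(0), 1), BOS, dtype=torch.long, device=src.device)
        output = []
        for _ in range(max_len):
            dec_out, state = self.decoder(self.embed(token), state)
            token = self.out(dec_out).argmax(dim=-1)
            output.append(token)
            if (token == EOS).all():
                break
        return torch.cat(output, dim=1)


def training_step(model, optimizer, src, tgt):
    # Teacher forcing: the decoder sees the gold previous token; the loss is
    # the cross-entropy of the correct next token at every position, i.e. the
    # negative log-likelihood of the correct sequence given its context.
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]
    logits = model(src, tgt_in)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1),
        ignore_index=PAD,
    )
    optimizer.zero_grad()
    loss.backward()  # gradients flow via backpropagation (through time)
    optimizer.step()
    return loss.item()


# Toy usage with random "utterances", purely to show the interfaces.
model = Seq2Seq()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
src = torch.randint(3, 10000, (2, 7))
tgt = torch.randint(3, 10000, (2, 9))
print(training_step(model, optimizer, src, tgt))
print(model.greedy_decode(src).shape)
```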
- IT Helpdesk dataset: a closed-domain dataset of conversations about computer-related troubleshooting issues.
- OpenSubtitles dataset: an open-domain dataset of conversations extracted from movie subtitles.
- The paper reports sample conversations from interactions between a human actor and the NCM.
- The NCM achieves lower perplexity than an n-gram baseline model (perplexity is derived from the per-token cross-entropy; see the sketch below).
- The NCM outperforms CleverBot in a subjective test in which human evaluators grade the responses of the two systems.
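For reference, perplexity is the exponential of the average per-token cross-entropy (negative log-likelihood), so a lower value means the model assigns higher probability to the held-out conversations. A tiny helper illustrating the relationship; the numbers below are hypothetical and not results from the paper:

```python
import math

def perplexity(total_neg_log_likelihood, num_tokens):
    # Exponentiated average per-token negative log-likelihood.
    return math.exp(total_neg_log_likelihood / num_tokens)

# Hypothetical values, purely to show the computation (exp(2.3) ~ 10).
print(perplexity(total_neg_log_likelihood=2.3e6, num_tokens=1.0e6))
```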
- Domain-agnostic.
- End-to-end training without handcrafted rules.
- The underlying architecture (the sequence-to-sequence framework) can also be leveraged for machine translation, question answering, etc.
- The generated responses are simple, short, and at times inconsistent (the model can give different answers to semantically equivalent questions).
- The objective function of the sequence-to-sequence framework optimises the likelihood of the next utterance and is not designed to capture the actual objective of a conversational model.