- Task of translating natural language queries into regular expressions without using domain-specific knowledge.
- Proposes a methodology for collecting a large corpus of (regular expression, natural language) pairs.
- Reports a performance gain of 19.6% over state-of-the-art models.
- Link to the paper
- LSTM-based sequence-to-sequence neural network (with attention); a minimal sketch follows this list.
  - Six layers:
    - One word-embedding layer
    - Two encoder layers
    - Two decoder layers
    - One dense output layer
  - Attention over the encoder layers.
  - Dropout with a probability of 0.25.
  - Trained for 20 epochs with a minibatch size of 32 and a learning rate of 1 (with a decay rate of 0.5).
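
A minimal PyTorch sketch of this architecture, not the authors' implementation: embedding and hidden sizes are placeholder values, the attention is simple dot-product attention, and a separate target-side embedding table is added for convenience (the paper counts a single word-embedding layer).

```python
import torch
import torch.nn as nn

class DeepRegexSketch(nn.Module):
    """Six-layer LSTM seq2seq with attention, as described in the notes above."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256, p_drop=0.25):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, emb_dim)      # word-embedding layer
        self.tgt_embed = nn.Embedding(tgt_vocab, emb_dim)      # (extra table, an assumption)
        self.encoder = nn.LSTM(emb_dim, hid_dim, num_layers=2,  # two encoder layers
                               dropout=p_drop, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, num_layers=2,  # two decoder layers
                               dropout=p_drop, batch_first=True)
        self.attn_combine = nn.Linear(2 * hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, tgt_vocab)                # dense output layer
        self.drop = nn.Dropout(p_drop)                          # dropout, p = 0.25

    def forward(self, src, tgt):
        enc_out, state = self.encoder(self.drop(self.src_embed(src)))
        dec_out, _ = self.decoder(self.drop(self.tgt_embed(tgt)), state)
        # Dot-product attention over the encoder outputs.
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))       # (B, T, S)
        context = torch.bmm(torch.softmax(scores, dim=-1), enc_out)
        combined = torch.tanh(self.attn_combine(torch.cat([dec_out, context], dim=-1)))
        return self.out(combined)                                  # (B, T, tgt_vocab)
```

Per the hyperparameters listed above, training could use `torch.optim.SGD(model.parameters(), lr=1.0)` with a scheduler that halves the rate (e.g. `torch.optim.lr_scheduler.StepLR` with `gamma=0.5`); the optimizer choice itself is an assumption.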
- Created a public dataset, NL-RX, with 10K (regular expression, natural language) pairs.
- Two-step generate-and-paraphrase approach:
  - Generate step: use a handcrafted grammar to translate regular expressions into (rigid) natural language descriptions; a toy example follows this list.
  - Paraphrase step: crowdsource the task of rewriting the rigid descriptions into more natural language.
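
A toy illustration of the generate step; the grammar rules below are invented stand-ins for the paper's handcrafted grammar, each production pairing a regex fragment with a rigid English fragment.

```python
import random

TERMINALS = [("[0-9]", "a number"), ("[a-z]", "a lower-case letter")]
COMBINATORS = [
    ("{a}.*",       "lines starting with {a}"),
    (".*{a}",       "lines ending with {a}"),
    ("({a})|({b})", "either {a} or {b}"),
]

def generate(depth=2):
    """Recursively sample a (regex, rigid description) pair from the grammar."""
    if depth == 0 or random.random() < 0.4:
        return random.choice(TERMINALS)
    rx_tpl, nl_tpl = random.choice(COMBINATORS)
    a, b = generate(depth - 1), generate(depth - 1)
    return rx_tpl.format(a=a[0], b=b[0]), nl_tpl.format(a=a[1], b=b[1])

regex, rigid = generate()
print(regex, "->", rigid)  # the rigid description is then paraphrased by crowdworkers
```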
- Evaluation Metric
  - Functional equality check (called DFA-Equal), since the same regular expression can be written in many syntactically different ways; a rough approximation is sketched below.
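
The paper's DFA-Equal checks true DFA equivalence; as a cheap, runnable stand-in, one can compare two regexes exhaustively over all strings up to a small length (the alphabet and length bound here are arbitrary choices, and Python's `re` is only regular for backreference-free patterns).

```python
import itertools
import re

def approx_dfa_equal(rx1, rx2, alphabet="ab01", max_len=4):
    """Approximate functional equality: accept the pair as equivalent if both
    regexes match exactly the same strings up to max_len over the alphabet."""
    p1, p2 = re.compile(rx1), re.compile(rx2)
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(p1.fullmatch(s)) != bool(p2.fullmatch(s)):
                return False
    return True

assert approx_dfa_equal("a+", "aa*")        # same language, different syntax
assert not approx_dfa_equal("ab*", "a*b")   # differ, e.g. on the string "b"
```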
- The proposed architecture outperforms both baselines: a Nearest Neighbor classifier using Bag of Words (BoW-NN) and Semantic-Unify. A sketch of the BoW-NN baseline follows.
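
A plausible scikit-learn sketch of the BoW-NN baseline; the vectorizer settings and cosine distance are assumptions, not details from the paper.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import NearestNeighbors

def bow_nn_predict(train_nl, train_rx, test_nl):
    """For each test description, return the regex paired with its nearest
    training description under a bag-of-words representation."""
    vec = CountVectorizer()
    X = vec.fit_transform(train_nl)
    nn = NearestNeighbors(n_neighbors=1, metric="cosine").fit(X)
    _, idx = nn.kneighbors(vec.transform(test_nl))
    return [train_rx[i[0]] for i in idx]

# Example with toy data:
preds = bow_nn_predict(
    ["lines starting with a number", "lines ending with a letter"],
    ["[0-9].*", ".*[a-z]"],
    ["lines that start with a number"],
)
print(preds)  # -> ['[0-9].*']
```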