Implementation of a standard LSTM in PyTorch and Torchtext, used as a baseline for text classification. Three word-embedding modes are supported:
- rand: All words are randomly initialized and then modified during training.
- static: A model with pre-trained vectors from word2vec. All words, including unknown ones that are initialized to zero, are kept static and only the other parameters of the model are learned.
- non-static: Same as above, but the pre-trained vectors are fine-tuned for each task (a minimal sketch of the embedding setup follows this list).
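The practical difference between the three modes is how the embedding weights are initialized and whether they receive gradient updates. A minimal sketch of that logic, assuming the pre-trained matrix already contains zero rows for unknown words (the function name and arguments are illustrative, not the repository's actual code):

```python
import torch
import torch.nn as nn

def build_embedding(vectors, mode="static"):
    """Build the word-embedding layer for the chosen mode.

    `vectors` is a (vocab_size, embed_dim) tensor of pre-trained word2vec
    weights in which unknown words are assumed to be zero rows.
    """
    if mode == "rand":
        # rand: random initialization, updated during training
        return nn.Embedding(vectors.size(0), vectors.size(1))
    # static: copy the pre-trained vectors and freeze them;
    # non-static: copy them and keep fine-tuning during training
    return nn.Embedding.from_pretrained(vectors, freeze=(mode == "static"))
```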
To run the model on the Reuters dataset in static mode, run the following from the Castor working directory:
python -m lstm_baseline --mode static
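Assuming the --mode flag also accepts the other two settings described above, the corresponding commands would be:

python -m lstm_baseline --mode rand
python -m lstm_baseline --mode non-static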
We evaluate the model on the following datasets:
- Reuters dataset (ModApte splits)
Adadelta is used for training.
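A minimal sketch of the optimizer setup, assuming PyTorch's default Adadelta hyperparameters; the stand-in module and its sizes are illustrative, not the actual baseline:

```python
import torch.nn as nn
import torch.optim as optim

# Stand-in for the LSTM baseline, only to keep the snippet self-contained.
model = nn.LSTM(input_size=300, hidden_size=256, batch_first=True)

# Adadelta with PyTorch defaults (lr=1.0, rho=0.9, eps=1e-6)
optimizer = optim.Adadelta(model.parameters())
```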
- Support ONNX export. Currently throws an "ONNX export failed (Couldn't export Python operator forward_flattened_wrapper)" exception (a sketch of the export call follows this list).
- Add dataset results with different hyperparameters
- Hyperparameter tuning
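For reference, a generic sketch of how the export would be invoked via torch.onnx.export once it is supported; the stand-in module and the dummy input shape are illustrative and do not reproduce the failing operator:

```python
import torch
import torch.nn as nn

# Stand-in module; in practice this would be the trained LSTM baseline.
model = nn.LSTM(input_size=300, hidden_size=256, batch_first=True)
dummy_input = torch.zeros(1, 50, 300)  # (batch, seq_len, embedding_dim), illustrative shape

# Trace-based ONNX export; this is the kind of call the TODO item refers to.
torch.onnx.export(model, dummy_input, "lstm_baseline.onnx")
```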