E-LSTM: Efficient Inference of Sparse LSTM on Embedded Heterogeneous System

Overview

E-LSTM provides an efficient implementation of LSTM inference on a RISC-V based embedded system. The project contains three parts:

Benchmark: Three fundamental LSTM benchmarks on the MNIST, PTB and WikiText datasets, respectively. Note that the trained models are sparse LSTMs, in which a portion of the weight values are zero.
Software tools: Software scripts, including a generator of the eSELL sparse representation for LSTM weight matrices, a functional simulator, and performance fine-tuning scripts.
Hardware tools: A fork of the official RISC-V toolchain, packaged with a cycle-level simulator for the E-LSTM heterogeneous system.

Benchmark

The three models are implemented in PyTorch, with dependencies listed in the requirements.
Obtain the dataset and pretrained models:

$ cd benchmark
$ ./data_download.sh

Evaluate the accuracy:

# MNIST
$ python MNIST/eval.py -model MNIST/models/nhid:128-nlayer:2-epoch:10.ckpt -ws 0.3 0.5 0.2 0.4 -ht 0.3 0.8 -b 4

# PTB
$ python LM/eval.py --model_path LM/models/PTB/model:LSTM-em:800-nhid:800-nlayers:2-bptt:35-epoch:40-lr:20-tied:False-l1:False-l1_lambda:1e-05-dropout:0.65.ckpt.retrain -m sparsity -ws 0.2 0.5 0.2 0.6 -ht 0.12 0.22 -b 4

# WikiText
$ python LM/eval.py --model_path LM/models/wikitext/model:LSTM-em:1500-nhid:1500-nlayers:2-bptt:35-epoch:20-lr:8.0-tied:False-l1:False-l1_lambda:1e-05-dropout:0.65.ckpt -m sparsity --data LM/data/wikitext --emsize 1500 --nhid 1500 -ws 0.4 0.5 0.3 0.4 -ht 0.28 0.43 -b 4

where the arguments are:
- -m: running mode; selects the variable that controls sparsity (options: sparsity/threshold)
- --data: dataset location
- --emsize: embedding size of the LSTM input vector
- --nhid: size of the hidden state
- -ws: weight sparsity (for the W and U matrices in LSTM layer 1 and LSTM layer 2, respectively)
- -ht: threshold for dynamic pruning of the hidden state (elements of vector h)
- -b: granularity of hidden-state pruning (-b 4 prunes 4 elements of h as a group)
- For other training and inference arguments, run *.py --help.
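To make the interaction between -ht and -b concrete, here is a minimal sketch of group-wise hidden-state pruning. It is not taken from the repository: the function names are hypothetical, and the pruning criterion (a group is zeroed only when every element's magnitude is below the threshold) is an assumption. The actual rule used by eval.py may differ.

```python
def prune_hidden_state(h, threshold, block=4):
    """Zero out whole groups of `block` consecutive elements of h.

    Assumed criterion (not confirmed by the repository): a group is
    pruned when every element in it has magnitude below `threshold`.
    """
    pruned = list(h)
    for start in range(0, len(h), block):
        group = h[start:start + block]
        if all(abs(v) < threshold for v in group):
            pruned[start:start + len(group)] = [0.0] * len(group)
    return pruned


def sparsity(values):
    """Fraction of elements that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# First group is entirely below 0.3 and is pruned; the second group
# contains 0.50, so it is kept as a whole.
h = [0.05, -0.10, 0.20, 0.01, 0.50, 0.02, -0.03, 0.04]
h_pruned = prune_hidden_state(h, threshold=0.3, block=4)
print(h_pruned, sparsity(h_pruned))
```

Pruning in groups rather than element by element trades some sparsity for regular zero patterns, which are generally easier for hardware to skip.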

Software Tools

eSELL sparse format construction and comparison (esell_format.py): eSELL is a sparse matrix representation derived from the SELL-C-sigma format. It is used in the E-LSTM system to reduce the on-chip buffer area cost.
Hardware golden reference (lstm_sim_esell.py): Packs the weights (half-precision floats) into 64-bit words for memory storage and CPU-accelerator interface communication, and runs the accelerator behavioral simulation. It also dumps the .h files for RISC-V cycle-level simulation.
Group size fine-tuning (group_size_tuning.py): Models the accelerator performance to select the best factor (N_grp) for the cell fusion optimization. The input arguments are listed by *.py --help.
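As an illustration of the weight packing mentioned for the hardware golden reference, the sketch below packs four half-precision values into each 64-bit word using only the Python standard library. The function names, the lane order (element 0 in the least-significant 16 bits), and the zero padding of the tail are assumptions for this example; lstm_sim_esell.py may use a different layout.

```python
import struct

FP16_PER_WORD = 4  # four 16-bit values fill one 64-bit word


def pack_fp16_words(values):
    """Pack floats as fp16 into 64-bit integers, zero-padding the tail.

    Assumption: element 0 occupies the least-significant 16 bits.
    """
    padded = list(values) + [0.0] * (-len(values) % FP16_PER_WORD)
    words = []
    for i in range(0, len(padded), FP16_PER_WORD):
        raw = struct.pack("<4e", *padded[i:i + FP16_PER_WORD])
        words.append(struct.unpack("<Q", raw)[0])
    return words


def unpack_fp16_words(words):
    """Inverse of pack_fp16_words (padding zeros are returned as well)."""
    values = []
    for w in words:
        values.extend(struct.unpack("<4e", struct.pack("<Q", w)))
    return values


words = pack_fp16_words([1.0, 0.5, -2.0, 0.0, 0.25])
print([f"0x{w:016x}" for w in words])
```

The values chosen here are exactly representable in fp16, so the round trip through unpack_fp16_words is lossless; arbitrary weights would be rounded to the nearest half-precision value.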

Hardware Tools

The RISC-V toolchain is forked from the original repo, with the E-LSTM accelerator cycle-level simulation model added. Please refer to the following summary to understand the framework and test the accelerator.

IMPORTANT: Prepare the submodules of rocket-chip recursively by running git submodule update --init --recursive --remote in the root folder.
riscv-tools: all software toolchains and simulators for RISC-V
- riscv-tools/riscv-isa-sim: Spike, a quasi cycle-level simulator of a RISC-V system.
  - riscv-tools/riscv-isa-sim/elstm_rocc: behavioral model of the E-LSTM accelerator, which is coupled with RISC-V via the RoCC interface.
- Please refer to the Tools' README for the other toolsets.
- For now, please run 'build.sh' to build all toolsets.
- riscv-tools/riscv-tests/benchmarks/rocc-acc: user-space test program for the RoCC-based accelerator.
  - elstm.c: host program that drives the E-LSTM accelerator simulation.
  - Before running the host program, please download the prepared header files that store the weights and input activations for each layer (dumped from the benchmark scripts), as follows:
```
wget https://storage.googleapis.com/rbsharing/elstm/weight_header.tar.gz
tar -xzvf weight_header.tar.gz
```
  - Change the included header file and the layer-specific variables in elstm.c for the particular layer.
  - Run make to compile the host program for E-LSTM simulation.
  - Run spike --extension=elstm_rocc elstm.riscv for simulation. The program iteratively processes the 20 input sequences and reports the cycle cost.
  - For the control instructions and workflow details in elstm.c, please refer to elstm-instruction.md.
