https://www.storylinestructure.com/
Storyline Structure Analysis is a project that visualizes how complex concepts progress over the course of a book. Using Natural Language Processing techniques, it identifies and plots the ebbs and flows of themes like fear, hope, and oppression in literary works.
The project uses large language models, particularly Facebook's BART and Microsoft's DeBERTa, to read and interpret books. Key functionalities include sentiment analysis, summarization, and chapter-based text processing.
- Python 3.x
- PyTorch
- Transformers library
- NumPy
- Pandas
- Clone the repository:
  git clone https://github.com/your-username/storyline-structure-analysis.git
  cd storyline-structure-analysis
- Install dependencies:
  conda create --name lit --file requirements.txt
To run the analysis, use the following command with appropriate arguments:
python ml/generate_vectors.py --input_dir gutenberg/data/tokens/ --output_dir public/data/ PG1399
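As a rough illustration of the command above, the following sketch shows one way such arguments could be parsed with `argparse`. It is not taken from `ml/generate_vectors.py`; the `build_parser` function, the `book_ids` name, and the assumption that more than one book ID may be passed are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the real script may define its arguments differently.
    parser = argparse.ArgumentParser(
        description="Generate motif vectors for one or more books."
    )
    parser.add_argument("--input_dir", required=True,
                        help="Directory containing tokenized book text.")
    parser.add_argument("--output_dir", required=True,
                        help="Directory where JSON results are written.")
    # Assumption: trailing positional values are book identifiers such as PG1399.
    parser.add_argument("book_ids", nargs="+",
                        help="One or more book identifiers.")
    return parser


# Parse the same arguments used in the command above.
args = build_parser().parse_args(
    ["--input_dir", "gutenberg/data/tokens/",
     "--output_dir", "public/data/", "PG1399"]
)
print(args.input_dir, args.output_dir, args.book_ids)
```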
To run the server locally, use:
node server.js
- Zero-Shot Classification: Uses the MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli model to classify paragraphs against a set of predefined motifs.
- Summarization: Uses facebook/bart-large-cnn to generate concise summaries of selected paragraphs.
- Custom Scripts: Includes scripts for processing books, tokenizing text, and organizing data for visualization.
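The zero-shot classification step can be sketched roughly as follows, using the Transformers `pipeline` API. This is a minimal illustration, not the project's actual code: `build_classifier` and `score_paragraphs` are hypothetical names, and scoring each motif independently with `multi_label=True` is an assumption.

```python
from typing import Callable, Dict, List

MODEL_NAME = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"


def build_classifier():
    """Load the zero-shot pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline("zero-shot-classification", model=MODEL_NAME)


def score_paragraphs(paragraphs: List[str], motifs: List[str],
                     classifier: Callable) -> List[Dict[str, float]]:
    """Return one {motif: score} dict per paragraph."""
    results = []
    for paragraph in paragraphs:
        # Assumption: motifs are scored independently rather than competing.
        output = classifier(paragraph, candidate_labels=motifs, multi_label=True)
        # The pipeline returns parallel "labels" and "scores" lists.
        results.append(dict(zip(output["labels"], output["scores"])))
    return results
```

Typical use would be `score_paragraphs(paragraphs, ["fear", "hope", "oppression"], build_classifier())`. Passing the classifier in as an argument keeps the scoring logic easy to test with a stub.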
The project reads books by paragraph, classifies them according to specified motifs, and generates summaries for a subset of paragraphs. Summarization and classification results are stored in JSON format for visualization.
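One possible way to summarize only a subset of paragraphs is sketched below. The source does not say how the subset is chosen, so picking the `k` highest-scoring paragraphs is purely an assumption. The helper names and the `max_length`/`min_length` values are also illustrative.

```python
from typing import Callable, Dict, List

SUMMARY_MODEL = "facebook/bart-large-cnn"


def build_summarizer():
    """Load the summarization pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline("summarization", model=SUMMARY_MODEL)


def top_paragraph_indices(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring paragraphs, in reading order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(ranked)


def summarize_subset(paragraphs: List[str], scores: List[float],
                     summarizer: Callable, k: int = 3) -> Dict[int, str]:
    """Summarize only the selected paragraphs, keyed by paragraph index."""
    summaries = {}
    for i in top_paragraph_indices(scores, k):
        # Length limits here are illustrative values, not the project's settings.
        result = summarizer(paragraphs[i], max_length=60, min_length=10,
                            do_sample=False)
        summaries[i] = result[0]["summary_text"]
    return summaries
```

Typical use would be `summarize_subset(paragraphs, fear_scores, build_summarizer())`, where `fear_scores` holds one score per paragraph.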
{
"x": [0, 1, 2, ...],
"y": [0.1, 0.5, 0.2, ...]
}
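A short sketch of how a list of per-paragraph scores could be turned into this shape and written to disk is shown below. It assumes that `x` holds paragraph indices and `y` holds the score of one motif at each index, which the example above suggests but does not state. `to_series` and `write_series` are hypothetical helpers.

```python
import json
from typing import Dict, List


def to_series(scores: List[float]) -> Dict[str, List]:
    """Pair each score with its paragraph index (assumed meaning of x and y)."""
    return {"x": list(range(len(scores))), "y": list(scores)}


def write_series(scores: List[float], path: str) -> None:
    """Write the series as JSON so a front end can load it."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(to_series(scores), f)


print(json.dumps(to_series([0.1, 0.5, 0.2])))
```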
Data visualization is implemented using D3.js, which plots the progression of concepts throughout a book. This is accompanied by an interactive web interface.
Contributions to the project are welcome. Please read the contributing guidelines before submitting your pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.
- Facebook AI Research for the BART model
- Moritz Laurer for the DeBERTa-v3-base-mnli-fever-anli model
- Project Gutenberg for providing a vast collection of books
Created by: Thomas Ryun Dougherty © MIT License