!! DISCLAIMER: This is a work-in-progress experiment in the early alpha stage; a lot of work remains before it becomes a useful tool !!
This project is designed to visualise and track the performance of various Large Language Models (LLMs) across different benchmarks. The visualisations aim to help in understanding trends, comparing models, and predicting future performance.
- Data Entry: Easily add new benchmark data for models.
- Visualisation: Interactive charts showing model performance over time.
- Predictive Analysis: Predict future performances based on historical data.
- Node.js (v22+)
- Clone the repository:
  git clone https://github.com/sammcj/closing-the-gap.git
  cd closing-the-gap
- Install dependencies:
  npm install
- Start the development server:
  npm start
- Access the application in your browser at http://localhost:3000.
The project is structured as follows:
- public/: Static index.html.
- src/: Source code for the application.
  - components/: Reusable UI components.
    - DataEntryForm.js: Form to add new benchmark data.
    - LLMBenchmarkVisualisation.js: Component to visualise benchmark data using ChartJS.
    - LLMBenchmarkDashboard.js: Dashboard to display benchmark data and predictions.
    - LeftPanel.js: Side panel to display model information.
  - config.js: Configuration settings for the application, including chart colors and titles.
  - App.js: Main application component that integrates all other components.
- server.js: Express server to serve static files and API endpoints.
- ingest/: Scripts to aid with data ingestion (not used by the app itself).
- package.json: Project metadata and scripts.
- llm_bechmarks.db: SQLite database to store benchmark data.
- GUI Data Entry: Use the DataEntryForm component to add new benchmark data for models. This includes entering dates, selecting models, benchmarks, scores, and whether the model is open or closed.
- CLI Data Entry: Add correctly formatted JSON benchmark results to ingest/import.json and run node ingest/ingest.js.
- Visualisation: The LLMBenchmarkVisualisation component provides interactive charts that show the performance of different models over time. Predictions are also provided based on historical data trends.
- Predictive Analysis: Historical data is used to predict future performances, helping in understanding model growth and potential improvements.
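To make the data entry and prediction steps above more concrete, here is a minimal sketch in plain JavaScript with no dependencies. It is not taken from the project's code. The field names (date, model, benchmark, score, openClosed) and the helpers validateEntry and fitLinearTrend are hypothetical, and the straight-line least-squares fit is only one plausible way to extrapolate a trend; the project may use a different record shape and a different prediction method.

```javascript
// Hypothetical record shape: these field names are assumptions, not the project's schema.
const REQUIRED_FIELDS = ["date", "model", "benchmark", "score", "openClosed"];
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a list of problems found in one benchmark entry (empty if it looks valid).
function validateEntry(entry) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    const value = entry[field];
    if (value === undefined || value === null || value === "") {
      errors.push(`missing field: ${field}`);
    }
  }
  if (typeof entry.date === "string" && entry.date !== "" && Number.isNaN(Date.parse(entry.date))) {
    errors.push("date is not parseable");
  }
  if (entry.score !== undefined && entry.score !== null && !Number.isFinite(entry.score)) {
    errors.push("score must be a finite number");
  }
  if (typeof entry.openClosed === "string" && entry.openClosed !== "" &&
      !["open", "closed"].includes(entry.openClosed)) {
    errors.push('openClosed must be "open" or "closed"');
  }
  return errors;
}

// Fits score = a + b * days by ordinary least squares and returns a predictor.
// A straight line is an assumption made for this sketch only.
function fitLinearTrend(points) {
  if (points.length < 2) {
    throw new Error("need at least two data points");
  }
  const xs = points.map((p) => Date.parse(p.date) / MS_PER_DAY);
  const ys = points.map((p) => p.score);
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;

  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    sxy += dx * (ys[i] - meanY);
    sxx += dx * dx;
  }
  if (sxx === 0) {
    throw new Error("all data points share the same date");
  }

  const slopePerDay = sxy / sxx;
  return {
    slopePerDay,
    // Predicts relative to the mean date to keep the arithmetic well conditioned.
    predict: (date) => meanY + slopePerDay * (Date.parse(date) / MS_PER_DAY - meanX),
  };
}

// Example usage with made-up numbers.
const history = [
  { date: "2024-01-01", model: "example-model", benchmark: "example-bench", score: 50, openClosed: "open" },
  { date: "2024-01-11", model: "example-model", benchmark: "example-bench", score: 60, openClosed: "open" },
];
const validHistory = history.filter((entry) => validateEntry(entry).length === 0);
const trend = fitLinearTrend(validHistory);
console.log(trend.predict("2024-01-21"));
```

Filtering entries through a check like validateEntry before fitting keeps malformed rows from skewing the trend. A linear fit will eventually predict scores beyond a benchmark's maximum, so any real use would need to cap or otherwise bound the extrapolation.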
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and test them thoroughly.
- Submit a pull request with a clear description of your changes.
Copyright 2024 Sam McLeod
This project is licensed under the MIT License - see the LICENSE file for details.