Stars
Entropy-Based Sampling and Parallel CoT Decoding
Code repo for paper "Quantifying Generalization Complexity for Large Language Models"
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Awesome LLM compression research papers and tools.
SGLang is a fast serving framework for large language models and vision language models.
A llama3 implementation, one matrix multiplication at a time
A framework for few-shot evaluation of language models.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Fast and memory-efficient exact attention
Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
DeepSeek-VL: Towards Real-World Vision-Language Understanding
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives
Python package for measuring memorization in LLMs.
Ongoing research on training transformer models at scale
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A repository of Language Model Vulnerabilities and Exposures (LVEs).
Representation Engineering: A Top-Down Approach to AI Transparency
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"