Character.ai
Palo Alto, CA (UTC -07:00)
https://www.linkedin.com/in/tian-xie-a13287128/
Stars
Magnificent app which corrects your previous console command.
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Library for fast text representation and classification.
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Running large language models on a single GPU for throughput-oriented scenarios.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Incredibly fast Whisper-large-v3
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Manipulate audio with a simple and easy high level interface
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
DeepSeek Coder: Let the Code Write Itself
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
State-of-the-Art Text Embeddings
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
Implementation of Nougat Neural Optical Understanding for Academic Documents
Making Reddit data accessible to researchers, moderators and everyone else. Interact with the data through large dumps, an API or web interface.
A high-throughput and memory-efficient inference and serving engine for LLMs
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.