Stars
Interpolate between embedding points with llm
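A minimal sketch of what interpolating between embedding points means, using plain NumPy. The vectors here are toy 3-dimensional stand-ins; a real run would use vectors produced by an embedding model:

```python
import numpy as np

def interpolate(a: np.ndarray, b: np.ndarray, steps: int) -> np.ndarray:
    """Return `steps` points linearly interpolated between embeddings a and b."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.array([(1 - t) * a + t * b for t in ts])

# Toy 3-d "embeddings" standing in for real model output.
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
points = interpolate(a, b, 5)
print(points.shape)  # (5, 3)
```

The midpoint (`t = 0.5`) is the element-wise average of the two embeddings; decoding or nearest-neighbor searching such intermediate points is what makes interpolation interesting.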
LlamaVoice is a llama-based large voice generation model, providing both inference and training capabilities.
Small, fast, modern HTTP server for Erlang/OTP.
⏩ Continue is the leading open-source AI code assistant. You can connect any model and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
Rust port of Spice, a low-overhead parallelization library
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Comparison of Language Model Inference Engines
A high-performance, zero-overhead, extensible Python compiler using LLVM
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
An extremely fast CSS parser, transformer, bundler, and minifier written in Rust.
A cross-platform inference engine for neural TTS models.
Things you can do with the token embeddings of an LLM
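One common thing to do with a token-embedding table is nearest-neighbor lookup: find which tokens' vectors are most similar to a query vector. A hedged sketch with a made-up 5×4 table (a real LLM's table is vocab_size × hidden_dim):

```python
import numpy as np

def nearest_tokens(table: np.ndarray, query: np.ndarray, k: int = 3) -> np.ndarray:
    """Indices of the k rows of `table` most cosine-similar to `query`."""
    norms = np.linalg.norm(table, axis=1) * np.linalg.norm(query)
    sims = table @ query / norms
    return np.argsort(-sims)[:k]

# Toy 5-token x 4-dim embedding table standing in for a real model's weights.
table = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0, 0.0],
])
idx = nearest_tokens(table, np.array([1.0, 0.0, 0.0, 0.0]), k=2)
print(idx)  # token 0 first, then its near-duplicate token 1
```

The same cosine-similarity machinery underlies embedding arithmetic and clustering over a model's vocabulary.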
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language Models (LLMs).
An OAI-compatible exllamav2 API that is both lightweight and fast
Ultra-Lightweight Durable Execution in Python
Command-line tool to download all files from a MyUni course
Scrape Canvas content, assignments, etc. Forked from a gist at https://gist.github.com/Koenvh1/6386f8703766c432eb4dfa19acdb0244
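Canvas LMS exposes course content over a REST API under `/api/v1/`, authenticated with a personal access token sent as a Bearer header. A minimal sketch of building such a request (the host name and token below are placeholders, not real values):

```python
def canvas_request(base_url: str, path: str, token: str):
    """Build the URL and auth header for a Canvas LMS REST API call.

    The caller would then issue e.g. requests.get(url, headers=headers)
    and page through results via the Link response headers.
    """
    url = f"{base_url.rstrip('/')}/api/v1/{path.lstrip('/')}"
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers

# Hypothetical host and token, for illustration only.
url, headers = canvas_request("https://canvas.example.edu", "courses", "MY_TOKEN")
```

Listing a course's files and downloading each one is a loop over the same pattern with a `courses/<id>/files` path.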