Interpolate between embedding points with llm

Python 29 Updated Jul 17, 2024
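The entry above describes interpolating between embedding points. A minimal sketch of what that can mean, assuming plain linear interpolation over numpy vectors (the function name and signature here are illustrative, not the repository's actual API):

```python
import numpy as np

def interpolate_embeddings(a, b, steps=5):
    """Return `steps` vectors linearly interpolated from a to b, inclusive."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # t runs from 0.0 (all a) to 1.0 (all b) in `steps` evenly spaced values
    return [(1 - t) * a + t * b for t in np.linspace(0.0, 1.0, steps)]

points = interpolate_embeddings([0.0, 0.0], [1.0, 2.0], steps=3)
# the middle point is the element-wise average of the two endpoints
print(points[1])
```

In practice embedding interpolation is often done with spherical interpolation (slerp) instead, since embedding similarity is usually measured by angle rather than Euclidean distance; the linear version above is just the simplest starting point.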

LlamaVoice is a llama-based large voice generation model, providing inference and training capabilities.

Python 212 11 Updated Aug 26, 2024

Small, fast, modern HTTP server for Erlang/OTP.

Erlang 7,270 1,164 Updated Jul 15, 2024

⏩ Continue is the leading open-source AI code assistant. You can connect any model and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.

TypeScript 16,462 1,284 Updated Sep 29, 2024

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 7,401 395 Updated Sep 27, 2024

High-performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild

Zig 1,396 47 Updated Sep 27, 2024

Distributed LLM and StableDiffusion inference for mobile, desktop and server.

Rust 2,499 131 Updated Aug 30, 2024

Rust port of Spice, a low-overhead parallelization library

Rust 512 7 Updated Sep 23, 2024

Rust Kubernetes client and controller runtime

Rust 2,911 304 Updated Sep 25, 2024

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,850 175 Updated Sep 11, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,300 388 Updated Sep 28, 2024

Comparison of Language Model Inference Engines

186 5 Updated Sep 2, 2024

A high-performance, zero-overhead, extensible Python compiler using LLVM

C++ 14,996 516 Updated Sep 12, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,049 117 Updated Sep 24, 2024

Python 5,742 428 Updated Sep 27, 2024

Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…

Python 3,620 311 Updated Sep 29, 2024

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 505 41 Updated Sep 28, 2024

An extremely fast CSS parser, transformer, bundler, and minifier written in Rust.

Rust 6,344 180 Updated Sep 11, 2024

Repair and secure untrusted HTML

Rust 508 42 Updated Sep 20, 2024

Self-hosted AI coding assistant

Rust 21,265 959 Updated Sep 29, 2024

A cross-platform inference engine for neural TTS models.

Rust 63 12 Updated Jul 27, 2024

A fast, local neural text to speech system

C++ 5,950 432 Updated Aug 7, 2024

Things you can do with the token embeddings of an LLM

Python 1,172 35 Updated Sep 25, 2024

Run Generative AI models directly on your hardware

Rust 16 Updated Aug 7, 2024

EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language Models (LLMs).

Jupyter Notebook 130 16 Updated Sep 11, 2024

An OAI-compatible exllamav2 API that's both lightweight and fast

Python 492 66 Updated Sep 25, 2024

Ultra-Lightweight Durable Execution in Python

Python 214 5 Updated Sep 27, 2024

Command-line tool to download all files from a MyUni course

Python 5 4 Updated Jun 14, 2021

Scrape Canvas content, assignments, etc. Forked from a gist at https://gist.github.com/Koenvh1/6386f8703766c432eb4dfa19acdb0244

Python 18 10 Updated Apr 26, 2024

Large-scale LLM inference engine

Python 991 111 Updated Sep 29, 2024