Filament is a real-time physically based rendering engine for Android, iOS, Windows, Linux, macOS, and WebGL2

C++ 17,672 1,869 Updated Sep 28, 2024

Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).

Cuda 179 15 Updated May 28, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 534 21 Updated Sep 21, 2024

Aidan Bench attempts to measure <big_model_smell> in LLMs.

Python 68 5 Updated Sep 28, 2024

High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild

Zig 1,397 47 Updated Sep 27, 2024

Markdown parser and HTML renderer for Go

Go 1,377 173 Updated Jul 30, 2024

Shiva library: Implementation in Rust of a parser and generator for documents of any type

Rust 167 10 Updated Sep 25, 2024

Efficient Triton Kernels for LLM Training

Python 3,088 158 Updated Sep 25, 2024

An easy-to-use LLM quantization and inference toolkit based on the GPTQ algorithm (weight-only quantization).

Python 92 19 Updated Sep 26, 2024

Control fault/locate indicators in disk slots in enclosures (SES devices)

Python 53 21 Updated Aug 27, 2024

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Python 2,045 144 Updated Aug 1, 2024

A framework for Privacy Preserving Machine Learning

Python 1,524 277 Updated Jul 18, 2024

Utilities intended for use with Llama models.

Python 4,212 749 Updated Sep 25, 2024

Fast Matrix Multiplications for Lookup Table-Quantized LLMs

Cuda 165 5 Updated Sep 15, 2024

The Memory layer for your AI apps

Python 22,018 2,012 Updated Sep 27, 2024

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

Python 285 47 Updated Sep 27, 2024

Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 348 27 Updated Aug 11, 2024

Multilingual Voice Understanding Model

Python 2,777 262 Updated Sep 25, 2024

Simple Python library/structure to ablate features in LLMs which are supported by TransformerLens

Python 296 38 Updated Jun 11, 2024

Implementation for MatMul-free LM.

Python 2,887 178 Updated Sep 19, 2024

Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"

Python 783 46 Updated Aug 21, 2024

NVIDIA Linux open GPU with P2P support

C 870 75 Updated Jun 7, 2024

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)

TypeScript 22,370 1,286 Updated Sep 28, 2024

QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.

Python 62 4 Updated Sep 12, 2024

Create images of a given character in different poses

Python 560 54 Updated Jun 5, 2024

Tensor library for machine learning

C++ 10,919 1,004 Updated Sep 27, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,773 390 Updated Aug 10, 2024

A massively parallel, high-level programming language

Rust 17,249 424 Updated Sep 27, 2024

Tools for merging pretrained large language models.

Python 4,561 406 Updated Sep 16, 2024