Chaotic Futurism; @SJTU-IPADS
- https://66ring.github.io/
- vx: _66RING_
Stars
Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".
VPTQ: a flexible and extreme low-bit quantization algorithm
The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
A throughput-oriented high-performance serving framework for LLMs
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
An FP8 flash attention implementation for the Ada architecture, built with the cutlass repository
mimalloc is a compact general purpose allocator with excellent performance.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
SquirrelFS: A crash-consistent Rust file system for persistent memory (OSDI 24)
Dynamic Memory Management for Serving LLMs without PagedAttention
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
A Fast and Extensible DRAM Simulator, with built-in support for modeling many different DRAM technologies including DDRx, LPDDRx, GDDRx, WIOx, HBMx, and various academic proposals. Described in the…
A lightweight library for portable low-level GPU computation using WebGPU.
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
[HotStorage'24] Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?
The repo for NSDI24 paper: SIEVE is Simpler than LRU: an Efficient Turn-Key Eviction Algorithm for Web Caches
The Operating System for JudgeDuck -- Stable and Accurate Judge System
Long-short token decoding: up to 4x decoding speedup for long-context LLMs, in about a hundred lines of core code. Open-sourced for learning.
An easy-to-use LLM quantization and inference toolkit based on GPTQ algorithm (weight-only quantization).
[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.
GLM-4 series: Open Multilingual Multimodal Chat LMs
A large-scale simulation framework for LLM inference