📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
A tool for bandwidth measurements on NVIDIA GPUs.
A collection of benchmarks to measure basic GPU capabilities
Dynamic Memory Management for Serving LLMs without PagedAttention
Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet)
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without extensive dependencies.
Instruct-tune LLaMA on consumer hardware
🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
Curated collection of papers in machine learning systems
NVIDIA Linux open GPU kernel module source
This repository contains the experimental PyTorch native float8 training UX
A list of ICs and IPs for AI, Machine Learning and Deep Learning.
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Examples demonstrating available options to program multiple GPUs in a single node or a cluster