

Stars

llm

8 repositories

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,579 363 Updated Jul 11, 2024

An experimental parallel training platform

47 11 Updated Mar 25, 2024

Speeds up LLM inference and improves LLMs' perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.

Python 4,467 249 Updated Aug 22, 2024

Fast inference from large language models via speculative decoding

Python 524 51 Updated Aug 22, 2024

[ICLR 2024] Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding

Python 139 15 Updated Mar 1, 2024

AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents

Python 13,378 1,771 Updated Sep 27, 2024

Papers for database systems powered by artificial intelligence (machine learning for database)

628 84 Updated Sep 23, 2024

This repository contains tutorials and examples for Triton Inference Server

Python 534 91 Updated Sep 26, 2024