Stars
Python bindings for the Transformer models implemented in C/C++ using the GGML library.
4 bits quantization of LLaMA using GPTQ
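GPTQ itself chooses quantized weights to minimize per-layer reconstruction error using second-order information; as a rough illustration of what 4-bit weight quantization means (not the GPTQ algorithm itself), here is a minimal round-to-nearest sketch with per-row scales:

```python
import numpy as np

def quantize_4bit(w: np.ndarray):
    """Round-to-nearest 4-bit quantization with a per-row scale.
    Illustrative only; GPTQ additionally corrects rounding error
    layer by layer using (approximate) Hessian information."""
    # Symmetric int4 range is [-8, 7]; pick the scale so the largest
    # magnitude in each row maps to 7.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 16).astype(np.float32)
q, s = quantize_4bit(w)
print("max abs error:", np.abs(dequantize(q, s) - w).max())
```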
The Vulkan API Specification and related tools
General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for…
A reproduction of llama3 from scratch, with notes taken while learning. Notes were added in the notebook, and a pure-code file was made as well.
A step-by-step guide to building the complete architecture of the Llama 3 model from scratch and performing training and inference on a custom dataset.
Clean inference code for LLaMA-3 with lots of comments explaining every step
Implements the llama3 inference pipeline from scratch using numpy and wraps it in a package, comparing performance on GPU and CPU as well as LoRA fine-tuning.
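For context, the LoRA fine-tuning mentioned above freezes the pretrained weight and learns a low-rank additive update; a minimal numpy sketch of that forward pass (all shapes and names here are illustrative, not the repo's code):

```python
import numpy as np

d_in, d_out, r, alpha = 64, 64, 8, 16
rng = np.random.default_rng(0)

W = rng.standard_normal((d_in, d_out))     # frozen pretrained weight
A = rng.standard_normal((d_in, r)) * 0.01  # trainable low-rank factor (Gaussian init)
B = np.zeros((r, d_out))                   # zero init, so the update starts as a no-op

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Base projection plus the scaled low-rank update (alpha / r scaling,
    # as in the LoRA paper); only A and B would receive gradients.
    return x @ W + (alpha / r) * (x @ A @ B)

x = rng.standard_normal((2, d_in))
print(lora_forward(x).shape)  # (2, 64)
```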
LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.
llama3 implementation one matrix multiplication at a time
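In that same spirit, a single-head causal attention step really is just a handful of matrix multiplications; a minimal numpy sketch (illustrative only, not the repo's code):

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

T, d = 5, 8                                 # sequence length, head dimension
rng = np.random.default_rng(0)
x = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

Q, K, V = x @ Wq, x @ Wk, x @ Wv            # three projections: three matmuls
scores = Q @ K.T / np.sqrt(d)               # (T, T) similarity matrix: one matmul
mask = np.triu(np.ones((T, T)), k=1).astype(bool)
scores[mask] = -np.inf                      # causal mask: no attending to the future
out = softmax(scores) @ V                   # weighted sum of values: one more matmul
print(out.shape)                            # (5, 8)
```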
An LLM cookbook for building your own from scratch, all the way from gathering data to training a model.
Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.
An LLM-powered advanced RAG pipeline built from scratch
Build a large language model from scratch with only basic Python; construct GLM4 / Llama3 / RWKV6 from scratch step by step, and gain a deep understanding of how large models work.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, B…
A modern model graph visualizer and debugger
MTEB: Massive Text Embedding Benchmark
Easy usage of Rockchip's NPUs found in RK3588 and similar chips
Pelochus / ezrknn-llm
Forked from airockchip/rknn-llm. Easier usage of LLMs in Rockchip's NPU on SBCs like Orange Pi 5 and Radxa Rock 5 series.
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
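The core trick behind that kind of tensor parallelism is splitting a weight matrix across devices so each computes only a partial result; a minimal single-process sketch that simulates two "devices" with numpy shards (illustrative only; the real system distributes the shards over a network):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 32))   # activations, replicated on every device
W = rng.standard_normal((32, 64))  # full weight, never held by any one device

W0, W1 = np.split(W, 2, axis=1)    # column-wise shards: each device stores half
y0 = x @ W0                        # partial output computed on "device 0"
y1 = x @ W1                        # partial output computed on "device 1"
y = np.concatenate([y0, y1], axis=1)  # gather step reassembles the full output

assert np.allclose(y, x @ W)       # identical to the single-device result
print(y.shape)                     # (4, 64)
```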
VLC media player - All pull requests are ignored, please use MRs on https://code.videolan.org/videolan/vlc