Several types of attention modules written in PyTorch for learning purposes (Python, updated Oct 1, 2024)
Image Captioning With MobileNet-LLaMA 3
(Unofficial) PyTorch reimplementation of Hugging Face SmolLM, a blazingly fast and remarkably powerful small language model, built around grouped-query attention (GQA); a minimal GQA sketch follows this list.
A single-file implementation of LLaMA 3, with support for jitting, KV caching and prompting
Decoder-only LLM trained on the Harry Potter books.
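Since grouped-query attention is the common thread across these repositories, here is a minimal, self-contained PyTorch sketch of the idea: several query heads share each key/value head, so the K/V projections are smaller than in standard multi-head attention. The class name and hyperparameters below are illustrative assumptions and are not taken from any of the listed projects.

```python
# Minimal grouped-query attention (GQA) sketch in PyTorch.
# Assumption: class/argument names are hypothetical, chosen for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupedQueryAttention(nn.Module):
    """Attention in which groups of query heads share a single key/value head."""

    def __init__(self, d_model: int, n_q_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0, "n_q_heads must be a multiple of n_kv_heads"
        self.n_q_heads = n_q_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = d_model // n_q_heads
        # K/V projections are smaller than the Q projection: that is the point of GQA.
        self.q_proj = nn.Linear(d_model, n_q_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.out_proj = nn.Linear(n_q_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        # Project and reshape to (batch, heads, seq, head_dim).
        q = self.q_proj(x).view(b, t, self.n_q_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Repeat each K/V head so every group of query heads attends to its shared head.
        group = self.n_q_heads // self.n_kv_heads
        k = k.repeat_interleave(group, dim=1)
        v = v.repeat_interleave(group, dim=1)
        # Standard scaled dot-product attention with a causal mask.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = out.transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(out)


# Example usage: 8 query heads sharing 2 key/value heads.
x = torch.randn(2, 16, 64)
attn = GroupedQueryAttention(d_model=64, n_q_heads=8, n_kv_heads=2)
print(attn(x).shape)  # torch.Size([2, 16, 64])
```

With 2 K/V heads serving 8 query heads, the key/value projections (and any KV cache built from them) are a quarter of the multi-head-attention size, which is why GQA pairs naturally with the KV-caching work mentioned above.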