This is a repo of my experiments and notes while learning about LLMs. I'm starting with a decent theoretical understanding of neural networks and hands-on experience training large models on distributed systems, and I'm very comfortable with data and ML engineering.
I've read many papers and a few books in the deep learning and LLM space, but I've never committed to learning things deeply or hands-on. I plan to change that.
Here are all the things I'd like to do:
- Andrej Karpathy's Neural Networks: Zero to Hero guide.
- Thoroughly read The Annotated Transformer and run its code side-by-side.
- Explore the Tensor2Tensor repo.
- Implement FlashAttention myself (in CUDA, maybe?)
- Implement FSDP myself (no idea how!?)
- Model efficiency experiments. Try out the following and benchmark the performance changes:
  - Speculative decoding
  - Knowledge distillation
  - Quantization
  - Pruning
  - Sparsity / low-rank compression
  - etc.
- Play around with LLaMA models locally
- Reread: FlashAttention 1, 2, and PagedAttention
  - FlashAttention 3
  - Depthwise Separable Convolutions for NMT
  - One Model To Learn Them All
  - Self-Attention with Relative Position Representations
  - Self-attention Does Not Need O(n²) Memory
  - Online softmax papers
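One of the efficiency experiments above is quantization. As a starting point, here's a minimal NumPy sketch of symmetric per-tensor int8 quantization (the function names and the NumPy-only setup are my own, not from any particular library):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = float(np.abs(x).max()) / 127.0  # assumes x is not all zeros
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    return q.astype(np.float32) * scale

x = np.array([0.1, -1.5, 0.7, 2.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize_int8(q, scale)
# round-to-nearest error is bounded by half a quantization step
max_err = float(np.abs(x - x_hat).max())
```

Real frameworks add per-channel scales, zero-points for asymmetric ranges, and calibration, but this captures the core scale/round/clip idea to benchmark against.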
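The online softmax trick from the papers above (the running-normalizer rescaling that FlashAttention builds on) fits in a few lines. This is a NumPy illustration of the recurrence, not a reference implementation:

```python
import numpy as np

def online_softmax(x):
    """Compute the softmax normalizer in a single streaming pass.

    Keeps a running max m and running sum d of exp(x_i - m); whenever a
    larger max arrives, the old sum is rescaled by exp(m_old - m_new).
    """
    m, d = -np.inf, 0.0
    for v in x:
        m_new = max(m, v)
        d = d * np.exp(m - m_new) + np.exp(v - m_new)
        m = m_new
    # second pass just to materialize the probabilities
    return np.exp(np.asarray(x, dtype=np.float64) - m) / d

probs = online_softmax([1.0, 2.0, 3.0, 1000.0])  # stable despite the huge logit
```

The point of the one-pass recurrence is that m and d can be accumulated tile by tile without ever seeing the full row, which is exactly what FlashAttention exploits.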