Welcome to my LLM lab!

This is a repo of my experiments and notes while learning about LLMs. I'm starting with a decent theoretical understanding of neural networks and hands-on experience training large models on distributed systems. I'm very comfortable with data and ML engineering.

What's done

I've read many papers and a few books in the deep learning and LLM space, but have never committed to learning things deeply or hands-on. I plan to change that.

What's up next

Here are all the things I'd like to do:

Guided tutorials and follow-alongs:

Implementations:

  • Implement FlashAttention myself (in CUDA maybe?); a plain-PyTorch reference to check a kernel against is sketched after this list
  • Implement FSDP myself (no idea how!?)
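
As a starting point for the FlashAttention item above, here's a minimal sketch, in plain PyTorch, of the tiling + online-softmax idea the paper is built on. The block size, tensor names, and toy shapes are my own placeholders, not a finished kernel; it's just a reference implementation to validate a future CUDA version against.

```python
# Minimal sketch of tiled attention with an online softmax (the core idea
# behind FlashAttention), in plain PyTorch. Not optimized; for correctness
# checks only.
import torch

def tiled_attention(q, k, v, block_size=64):
    """Compute softmax(q @ k^T / sqrt(d)) @ v one key/value block at a time,
    keeping a running max and running sum so the full n x n attention matrix
    is never materialized."""
    n, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((n, 1), float("-inf"))
    row_sum = torch.zeros(n, 1)

    for start in range(0, n, block_size):
        kb = k[start:start + block_size]           # (B, d) key block
        vb = v[start:start + block_size]           # (B, d) value block
        scores = (q @ kb.T) * scale                # (n, B) scores for this block

        new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
        # Rescale the previously accumulated numerator/denominator to the new max.
        correction = torch.exp(row_max - new_max)
        p = torch.exp(scores - new_max)            # unnormalized block probabilities

        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ vb
        row_max = new_max

    return out / row_sum

if __name__ == "__main__":
    torch.manual_seed(0)
    q, k, v = (torch.randn(256, 32) for _ in range(3))
    ref = torch.softmax((q @ k.T) / 32 ** 0.5, dim=-1) @ v
    print(torch.allclose(tiled_attention(q, k, v), ref, atol=1e-5))  # True
```
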

Experiments:

  • Model efficiency experiments. Try out the following and benchmark the performance changes (a toy quantization benchmark is sketched after this list):
    • Speculative decoding
    • Knowledge distillation
    • Quantization
    • Pruning
    • Sparsity / low-rank compression
    • etc
  • Play around with Llama models locally
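
For the quantization experiment above, this is the shape of the before/after measurement I have in mind: a minimal sketch, assuming PyTorch's built-in dynamic int8 quantization and a toy MLP standing in for a real model. Sizes and layer choices are placeholders.

```python
# Minimal before/after latency benchmark for dynamic int8 quantization.
# The toy MLP is a placeholder, not an LLM; CPU only.
import time
import torch
import torch.nn as nn

def time_forward(model, x, iters=50):
    """Average forward-pass latency in milliseconds."""
    with torch.no_grad():
        model(x)  # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        return (time.perf_counter() - start) / iters * 1e3

model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1024),
).eval()

# int8 dynamic quantization of the Linear layers: weights are quantized ahead
# of time, activations are quantized on the fly at inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(8, 1024)
print(f"fp32: {time_forward(model, x):.2f} ms")
print(f"int8: {time_forward(quantized, x):.2f} ms")
print(f"max abs diff: {(model(x) - quantized(x)).abs().max().item():.4f}")
```
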

Readings:

  • Reread: FlashAttention 1 and 2, and PagedAttention
  • FlashAttention 3
  • Depthwise Separable Convolutions for NMT
  • One Model To Learn Them All
  • Self-Attention with Relative Position Representations
  • Self-attention Does Not Need O(n²) Memory
  • Online softmax papers
  • Some synthetic data papers or concepts
