Small experiments in autoregressive (AR) image modeling with Mamba.
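A minimal sketch of what AR image modeling means here, assuming the image is flattened in raster-scan order and the model predicts each pixel token from the ones before it. The function names and the start token id (256) are hypothetical and not taken from main.py; the actual tokenization in this repo may differ.

```python
def raster_scan(image):
    """Flatten a 2D grid of pixel values into a 1D sequence, row by row."""
    return [pixel for row in image for pixel in row]


def next_token_pairs(sequence, bos):
    """Shift by one: at position t the model sees tokens before t and predicts token t."""
    inputs = [bos] + sequence[:-1]  # prepend a start token, drop the last pixel
    targets = list(sequence)        # the target at each position is the pixel itself
    return inputs, targets


# Toy 2x2 "image" with 8-bit pixel values.
image = [[12, 7], [3, 255]]
seq = raster_scan(image)

# Assumption: id 256 is a start-of-sequence token outside the 0..255 pixel range.
inputs, targets = next_token_pairs(seq, bos=256)
```

A sequence model such as Mamba would then be trained with cross-entropy between its output at position t and targets[t].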
Install with poetry:
poetry lock && poetry install
Or use conda and install the dependencies by hand.
They are mostly the standard torch stack plus wandb.
Current best:
python main.py --batch-size 64 --lr 1e-3 --n-layer 16 --d-model 512 --clip-grad-val 1
Results are logged in wandb.
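A sketch of what --clip-grad-val 1 presumably does, assuming it clips every gradient element into [-1, 1] (the behaviour of torch.nn.utils.clip_grad_value_). This is an assumption about main.py, not something confirmed by it; the helper below is a hypothetical pure-Python illustration.

```python
def clip_by_value(grads, clip_value):
    """Clamp each gradient element into [-clip_value, clip_value]."""
    return [max(-clip_value, min(clip_value, g)) for g in grads]


# Toy gradient vector; only the out-of-range entries are changed.
grads = [0.3, -2.5, 4.0, -0.9]
clipped = clip_by_value(grads, 1.0)
```

Unlike norm clipping, value clipping changes the direction of the gradient whenever any element is out of range, since each element is clamped independently.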