Jeremy Bernstein · Arash Vahdat · Yisong Yue · Ming-Yu Liu
To get started with Fromage in your PyTorch code, copy the file `fromage.py` into your project directory, then write:

```python
from fromage import Fromage
optimizer = Fromage(net.parameters(), lr=0.01)
```
We found an initial learning rate of 0.01 worked well in all experiments except model fine-tuning, where we used 0.001. You may want to experiment with learning rate decay schedules.
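For intuition, the per-layer update that Fromage applies can be sketched in plain Python. This is a minimal sketch of our understanding of the update rule, not the optimiser itself (use `fromage.py` in practice); it treats one layer's weights and gradient as flat lists, and the `eps` guard against a zero gradient norm is an illustrative addition:

```python
import math

def fromage_step(w, g, lr=0.01, eps=1e-12):
    """One Fromage-style update for a single layer, on flat lists.

    Rescales the gradient by the ratio of weight norm to gradient
    norm, takes a step of size lr, then divides by sqrt(1 + lr^2)
    to control the growth of the weight norm.
    """
    w_norm = math.sqrt(sum(x * x for x in w))
    g_norm = math.sqrt(sum(x * x for x in g))
    step = lr * w_norm / (g_norm + eps)      # per-layer effective step size
    prefactor = 1.0 / math.sqrt(1.0 + lr * lr)
    return [(x - step * gx) * prefactor for x, gx in zip(w, g)]

# Example: weights of norm 5, gradient along the first coordinate.
w_new = fromage_step([3.0, 4.0], [1.0, 0.0], lr=0.01)
```

Because the step is scaled by the weight norm, layers with larger weights take proportionally larger steps, which is the sense in which the update respects the relative distance between networks.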
We've written an academic paper that proposes an optimisation algorithm based on a new geometric characterisation of deep neural networks. The paper is called:
On the distance between two neural networks and the stability of learning.
We're putting this code here so that you can test out our optimisation algorithm in your own applications, and also so that you can attempt to reproduce the experiments in our paper.
If something isn't clear or isn't working, let us know in the Issues section or contact bernstein@caltech.edu.
Here is the structure of this repository.
```
.
├── classify-cifar/         # CIFAR-10 classification experiments.
├── classify-imagenet/      # ImageNet classification experiments. Coming soon!
├── classify-mnist/         # MNIST classification experiments.
├── finetune-transformer/   # Transformer fine-tuning experiments.
├── generate-cifar/         # CIFAR-10 class-conditional GAN experiments.
├── make-plots/             # Code to reproduce the figures in the paper.
├── LICENSE                 # The license on our algorithm.
├── README.md               # The very page you're reading now.
└── fromage.py              # PyTorch code for the Fromage optimiser.
```
Check back in a few days if the code you're after is missing. We're currently cleaning and posting it.
- This research was supported by Caltech and NVIDIA.
- Our code is written in PyTorch.
- Our GAN implementation is based on a codebase by Jiahui Yu.
- Our Transformer code is from 🤗 Transformers.
- Our CIFAR-10 classification code is originally by kuangliu.
- Our MNIST code was originally forked from the PyTorch example.
- See here and here for closely related work by Yang You, Igor Gitman and Boris Ginsburg.
If you adore le fromage as much as we do, feel free to cite the paper:
```bibtex
@misc{fromage2020,
  title={On the distance between two neural networks and the stability of learning},
  author={Jeremy Bernstein and Arash Vahdat and Yisong Yue and Ming-Yu Liu},
  year={2020},
  eprint={arXiv:2002.03432}
}
```
We are making our algorithm available under a CC BY-NC-SA 4.0 license. The other code we have used obeys other license restrictions as indicated in the subfolders.