Description
Paper
Implement Efficient Self-Supervised Vision Transformers (EsViT)
Link to the paper
TODOs
- develop the model in a scalable way (using `*Block` syntax and `nn` modules)
- draw the model on Figma
- write doc
- test the model
- get (if possible) the pretrained weights
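
As a rough illustration of the first TODO, here is a minimal sketch of what a `*Block`-style layout built from `nn` modules could look like in PyTorch. It assumes `*Block` means small classes whose names end in `Block` and that are composed into larger modules. The names `MLPBlock`, `TransformerBlock`, and `make_stage` are hypothetical, and the block uses plain global self-attention as a stand-in; it does not reproduce the EsViT architecture from the paper.

```python
import torch
from torch import nn


class MLPBlock(nn.Sequential):
    """Hypothetical feed-forward block: Linear -> GELU -> Linear."""

    def __init__(self, dim: int, expansion: int = 4):
        super().__init__(
            nn.Linear(dim, dim * expansion),
            nn.GELU(),
            nn.Linear(dim * expansion, dim),
        )


class TransformerBlock(nn.Module):
    """Hypothetical pre-norm block: attention and MLP, each with a residual."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Plain global attention, used here only as a placeholder (assumption).
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = MLPBlock(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, tokens, dim).
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        return x + self.mlp(self.norm2(x))


def make_stage(dim: int, depth: int, num_heads: int = 4) -> nn.Sequential:
    """Stack `depth` blocks; scaling the model means changing these arguments."""
    return nn.Sequential(*[TransformerBlock(dim, num_heads) for _ in range(depth)])


if __name__ == "__main__":
    stage = make_stage(dim=32, depth=2)
    tokens = torch.randn(2, 16, 32)  # (batch, tokens, dim)
    print(stage(tokens).shape)
```

Keeping each block as its own `nn.Module` means a block can be swapped (for example, replacing the attention layer) without touching the code that stacks them.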