(Here is my 🪐 Blog)
Type | Paper | Description | Code | Blog | Recommend Reeading |
---|---|---|---|---|---|
Computer Vision | |||||
Image Classification | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Vision Transformers (ViT) apply Transformer models to image recognition by processing images as patch sequences | ViT Jupyter Notebook | ⭐⭐⭐⭐⭐ | |
Image Segmentation | U-Net: Convolutional Networks for Biomedical Image Segmentation | U-Net is a convolutional neural network architecture designed for biomedical image segmentation, excelling in capturing fine-grained details. | U-Net Jupyter Notebook | ⭐⭐⭐⭐⭐ | |
Attention U-Net: Learning Where to Look for the Pancreas | Attention U-Net enhances U-Net by integrating attention mechanisms, allowing the model to focus on relevant regions for improved image segmentation. | Attention U-Net Jupyter Notebook | ⭐⭐⭐ | ||
Row 2 | Regular Value | Regular Value | |||
Natural Langue Processing | |||||
Reinforcement Learning | |||||
Multi Model | |||||
Learning Transferable Visual Models From Natural Language Supervision | CLIP (Contrastive Language–Image Pretraining) aligns images and text by training on diverse datasets, enabling versatile visual-text understanding | CLIP Jupyter Notebook | ⭐⭐⭐⭐⭐ |