A collection of open-source projects for CVPR 2020 papers. Issues that share additional CVPR 2020 open-source projects are welcome.
- CNN
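- Image Classification
- Object Detection
- 3D Object Detection
- Video Object Detection
- Object Tracking
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Video Object Segmentation
- Superpixel Segmentation
- NAS
- GAN
- Re-ID
- 3D Point Clouds (classification/segmentation/registration, etc.)
- Faces (recognition/detection/reconstruction, etc.)
- Human Pose Estimation (2D/3D)
- Human Parsing
- Scene Text Detection
- Scene Text Recognition
- Super-Resolution
- Model Compression/Pruning
- Video Understanding/Action Recognition
- Crowd Counting
- Depth Estimation
- 6D Object Pose Estimation
- Hand Pose Estimation
- Saliency Detection
- Denoising
- Deblurring
- Dehazing
- Feature Point Detection and Description
- Visual Question Answering (VQA)
- Video Question Answering (VideoQA)
- Vision-and-Language Navigation
- Video Compression
- Video Frame Interpolation
- Style Transfer
- Lane Detection
- Human-Object Interaction (HOI) Detection
- Trajectory Prediction
- Motion Prediction
- Virtual Try-On
- HDR
- Adversarial Examples
- Semantic Scene Completion
- Datasets
- Others
- Acceptance Unconfirmed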
Exploring Self-attention for Image Recognition
Improving Convolutional Networks with Self-Calibrated Convolutions
Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets
Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion
Spatially Attentive Output Layer for Image Classification
Dynamic Refinement Network for Oriented and Densely Packed Object Detection
Scale-Equalizing Pyramid Convolution for Object Detection
- Paper: https://arxiv.org/abs/2005.03101
- Code: https://github.com/jshilong/SEPC
Revisiting the Sibling Head in Object Detector
Detection in Crowded Scenes: One Proposal, Multiple Predictions
Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
BiDet: An Efficient Binarized Object Detector
Harmonizing Transferability and Discriminability for Adapting Object Detectors
CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection
Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
EfficientDet: Scalable and Efficient Object Detection
Train in Germany, Test in The USA: Making 3D Object Detectors Generalize
MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
3DSSD: Point-based 3D Single Stage Object Detector
- CVPR 2020 Oral
Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation
End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
DSGN: Deep Stereo Geometry Network for 3D Object Detection
LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud
Memory Enhanced Global-Local Aggregation for Video Object Detection
- Paper: https://arxiv.org/abs/2003.12063
- Code: https://github.com/Scalsol/mega.pytorch
D3S -- A Discriminative Single Shot Segmentation Tracker
ROAM: Recurrently Optimizing Tracking Model
Siam R-CNN: Visual Tracking by Re-Detection
- Homepage: https://www.vision.rwth-aachen.de/page/siamrcnn
- Paper: https://arxiv.org/abs/1911.12836
- Paper 2: https://www.vision.rwth-aachen.de/media/papers/192/siamrcnn.pdf
- Code: https://github.com/VisualComputingInstitute/SiamR-CNN
Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises
High-Performance Long-Term Tracking with Meta-Updater
AutoTrack: Towards High-Performance Visual Tracking for UAV with Automatic Spatio-Temporal Regularization
Probabilistic Regression for Visual Tracking
MAST: A Memory-Augmented Self-supervised Tracker
Siamese Box Adaptive Network for Visual Tracking
Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation
Single-Stage Semantic Segmentation from Image Labels
Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation
- Paper: https://arxiv.org/abs/2003.00867
- Code: https://github.com/MyeongJin-Kim/Learning-Texture-Invariant-Representation
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation
CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement
Unsupervised Intra-domain Adaptation for Semantic Segmentation through Self-Supervision
Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
Temporally Distributed Networks for Fast Video Segmentation
Context Prior for Scene Segmentation
Strip Pooling: Rethinking Spatial Pooling for Scene Parsing
Cars Can't Fly up in the Sky: Improving Urban-Scene Segmentation via Height-driven Attention Networks
Learning Dynamic Routing for Semantic Segmentation
PolarMask: Single Shot Instance Segmentation with Polar Representation
- Paper: https://arxiv.org/abs/1909.13226
- Code: https://github.com/xieenze/PolarMask
- Explainer: https://zhuanlan.zhihu.com/p/84890413
CenterMask: Real-Time Anchor-Free Instance Segmentation
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
Deep Snake for Real-Time Instance Segmentation
Mask Encoding for Single Shot Instance Segmentation
Pixel Consensus Voting for Panoptic Segmentation
- Paper: https://arxiv.org/abs/2004.01849
- Code: not yet released
BANet: Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation
- Paper: https://arxiv.org/abs/2003.14031
- Code: https://github.com/Mooonside/BANet
A Transductive Approach for Video Object Segmentation
State-Aware Tracker for Real-Time Video Object Segmentation
Learning Fast and Robust Target Models for Video Object Segmentation
Learning Video Object Segmentation from Unlabeled Videos
Superpixel Segmentation with Fully Convolutional Networks
AOWS: Adaptive and optimal network width search with latency constraints
Densely Connected Search Space for More Flexible Neural Architecture Search
MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning
FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions
Neural Architecture Search for Lightweight Non-Local Networks
Rethinking Performance Estimation in Neural Architecture Search
- Paper: https://arxiv.org/abs/2005.09917
- Code: https://github.com/zhengxiawu/rethinking_performance_estimation_in_NAS
- Explainer 1: https://www.zhihu.com/question/372070853/answer/1035234510
- Explainer 2: https://zhuanlan.zhihu.com/p/111167409
CARS: Continuous Evolution for Efficient Neural Architecture Search
Semantically Multi-modal Image Synthesis
- Homepage: http://seanseattle.github.io/SMIS
- Paper: https://arxiv.org/abs/2003.12697
- Code: https://github.com/Seanseattle/SMIS
Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping
- Paper: https://yiranran.github.io/files/CVPR2020_Unpaired%20Portrait%20Drawing%20Generation%20via%20Asymmetric%20Cycle%20Mapping.pdf
- Code: https://github.com/yiranran/Unpaired-Portrait-Drawing
Learning to Cartoonize Using White-box Cartoon Representations
- Paper: https://github.com/SystemErrorWang/White-box-Cartoonization/blob/master/paper/06791.pdf
- Homepage: https://systemerrorwang.github.io/White-box-Cartoonization/
- Code: https://github.com/SystemErrorWang/White-box-Cartoonization
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions
COCAS: A Large-Scale Clothes Changing Person Dataset for Re-identification
- Dataset: not yet available
Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking
Pose-guided Visible Part Matching for Occluded Person ReID
Weakly supervised discriminative feature learning with state information for person identification
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
Grid-GCN for Fast and Scalable Point Cloud Learning
FPConv: Learning Local Flattening for Point Convolution
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
Weakly Supervised Semantic Point Cloud Segmentation: Towards 10X Fewer Labels
PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation
Learning to Segment 3D Point Clouds in 2D Image Space
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
RPM-Net: Robust Point Matching using Learned Features
Cascaded Refinement Network for Point Cloud Completion
CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition
Learning Meta Face Recognition in Unseen Domains
- Paper: https://arxiv.org/abs/2003.07733
- Code: https://github.com/cleardusk/MFR
- Explainer: https://mp.weixin.qq.com/s/YZoEnjpnlvb90qSI3xdJqQ
Searching Central Difference Convolutional Networks for Face Anti-Spoofing
Suppressing Uncertainties for Large-Scale Facial Expression Recognition
Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images
AvatarMe: Realistically Renderable 3D Facial Reconstruction "in-the-wild"
FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
- Paper: https://arxiv.org/abs/1911.07524
- Code: https://github.com/HuangJunJie2017/UDP-Pose
- Explainer: https://zhuanlan.zhihu.com/p/92525039
Distribution-Aware Coordinate Representation for Human Pose Estimation
Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach
Bodies at Rest: 3D Human Pose and Shape Estimation from a Pressure Image using Synthetic Data
Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis
Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation
VIBE: Video Inference for Human Body Pose and Shape Estimation
Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose Estimation
Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS
- Paper: https://arxiv.org/abs/2003.03972
- Dataset: not yet available
Correlating Edge, Pose with Parsing
UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
- Paper: https://arxiv.org/abs/2002.10200
- Code (to be open-sourced soon): https://github.com/Yuliang-Liu/bezier_curve_text_spotting
- Code (to be open-sourced soon): https://github.com/aim-uofa/adet
Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection
SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition
UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
Structure-Preserving Super Resolution with Gradient Guidance
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy
- Paper: https://arxiv.org/abs/2004.00448
- Code: https://github.com/clovaai/cutblur
Space-Time-Aware Multi-Resolution Video Enhancement
- Homepage: https://alterzero.github.io/projects/STAR.html
- Paper: http://arxiv.org/abs/2003.13170
- Code: https://github.com/alterzero/STARnet
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
DMCP: Differentiable Markov Channel Pruning for Neural Networks
Forward and Backward Information Retention for Accurate Binary Neural Networks
Towards Efficient Model Compression via Learned Global Ranking
HRank: Filter Pruning using High-Rank Feature Map
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
Intra- and Inter-Action Understanding via Temporal Action Parsing
3DV: 3D Dynamic Voxel for Action Recognition in Depth Video
FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding
TEA: Temporal Excitation and Aggregation for Action Recognition
X3D: Expanding Architectures for Efficient Video Recognition
Temporal Pyramid Network for Action Recognition
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
Focus on defocus: bridging the synthetic to real domain gap for depth estimation
Bi3D: Stereo Depth Estimation via Binary Classifications
AANet: Adaptive Aggregation Network for Efficient Stereo Matching
Towards Better Generalization: Joint Depth-Pose Learning without PoseNet
On the uncertainty of self-supervised monocular depth estimation
3D Packing for Self-Supervised Monocular Depth Estimation
- Paper: https://arxiv.org/abs/1905.02693
- Code: https://github.com/TRI-ML/packnet-sfm
- Demo video: https://www.bilibili.com/video/av70562892/
Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation
MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion
EPOS: Estimating 6D Pose of Objects with Symmetries
- Homepage: http://cmp.felk.cvut.cz/epos
- Paper: https://arxiv.org/abs/2004.00605
G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation
Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data
JL-DCF: Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient Object Detection
UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders
A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising
CycleISP: Real Image Restoration via Improved Data Synthesis
Multi-Scale Progressive Fusion Network for Single Image Deraining
Cascaded Deep Video Deblurring Using Temporal Sharpness Prior
- Homepage: https://csbhr.github.io/projects/cdvd-tsp/index.html
- Paper: https://arxiv.org/abs/2004.02501
- Code: https://github.com/csbhr/CDVD-TSP
Multi-Scale Boosted Dehazing Network with Dense Feature Fusion
ASLFeat: Learning Local Features of Accurate Shape and Localization
VC R-CNN: Visual Commonsense R-CNN
Hierarchical Conditional Relation Networks for Video Question Answering
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement
Space-Time-Aware Multi-Resolution Video Enhancement
- Homepage: https://alterzero.github.io/projects/STAR.html
- Paper: http://arxiv.org/abs/2003.13170
- Code: https://github.com/alterzero/STARnet
Scene-Adaptive Video Frame Interpolation via Meta-Learning
Softmax Splatting for Video Frame Interpolation
- Homepage: http://sniklaus.com/papers/softsplat
- Paper: https://arxiv.org/abs/2003.05534
- Code: https://github.com/sniklaus/softmax-splatting
Diversified Arbitrary Style Transfer via Deep Feature Perturbation
Collaborative Distillation for Ultra-Resolution Universal Style Transfer
Inter-Region Affinity Distillation for Road Marking Segmentation
Detailed 2D-3D Joint Representation for Human-Object Interaction
Cascaded Human-Object Interaction Recognition
VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
Collaborative Motion Prediction via Neural Motion Message Passing
MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps
Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content
Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline
Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance
3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior
Intra- and Inter-Action Understanding via Temporal Action Parsing
Dynamic Refinement Network for Oriented and Densely Packed Object Detection
COCAS: A Large-Scale Clothes Changing Person Dataset for Re-identification
- Dataset: not yet available
KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation
AvatarMe: Realistically Renderable 3D Facial Reconstruction "in-the-wild"
Learning to Autofocus
- Paper: https://arxiv.org/abs/2004.12260
- Dataset: not yet available
FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
Bodies at Rest: 3D Human Pose and Shape Estimation from a Pressure Image using Synthetic Data
FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding
A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
Deep Homography Estimation for Dynamic Scenes
Assessing Image Quality Issues for Real-World Problems
UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World
PANDA: A Gigapixel-level Human-centric Video Dataset
IntrA: 3D Intracranial Aneurysm Dataset for Deep Learning
Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS
- Paper: https://arxiv.org/abs/2003.03972
- Dataset: not yet available
Equalization Loss for Long-Tailed Object Recognition
Instance-aware Image Colorization
- Homepage: https://ericsujw.github.io/InstColorization/
- Paper: https://arxiv.org/abs/2005.10825
- Code: https://github.com/ericsujw/InstColorization
Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching
Epipolar Transformers
Bringing Old Photos Back to Life
MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask
Self-Supervised Viewpoint Learning from Image Collections
- Paper: https://arxiv.org/abs/2004.01793
- Paper 2: https://research.nvidia.com/sites/default/files/pubs/2020-03_Self-Supervised-Viewpoint-Learning/SSV-CVPR2020.pdf
- Code: https://github.com/NVlabs/SSV
Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations
Towards Learning Structure via Consensus for Face Segmentation and Parsing
Plug-and-Play Algorithms for Large-scale Snapshot Compressive Imaging
Lightweight Photometric Stereo for Facial Details Recovery
Footprints and Free Space from a Single Color Image
Self-Supervised Monocular Scene Flow Estimation
Quasi-Newton Solver for Robust Non-Rigid Registration
A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
DeepFLASH: An Efficient Network for Learning-based Medical Image Registration
Self-Supervised Scene De-occlusion
- Homepage: https://xiaohangzhan.github.io/projects/deocclusion/
- Paper: https://arxiv.org/abs/2004.02788
- Code: https://github.com/XiaohangZhan/deocclusion
Polarized Reflection Removal with Perfect Alignment in the Wild
- Homepage: https://leichenyang.weebly.com/project-polarized.html
- Code: https://github.com/ChenyangLEI/CVPR2020-Polarized-Reflection-Removal-with-Perfect-Alignment
Background Matting: The World is Your Green Screen
What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective
Look-into-Object: Self-supervised Structure Modeling for Object Recognition
- Paper: not yet available
- Code: https://github.com/JDAI-CV/LIO
Video Object Grounding using Semantic Roles in Language Description
Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives
SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization
- Paper: http://www.cs.umd.edu/~yuejiang/papers/SDFDiff.pdf
- Code: https://github.com/YueJiang-nj/CVPR2020-SDFDiff
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
GhostNet: More Features from Cheap Operations
AdderNet: Do We Really Need Multiplications in Deep Learning?
Deep Image Harmonization via Domain Verification
Blurry Video Frame Interpolation
Extremely Dense Point Correspondences using a Learned Feature Descriptor
- Paper: https://arxiv.org/abs/2003.00619
- Code: https://github.com/lppllppl920/DenseDescriptorLearning-Pytorch
Filter Grafting for Deep Neural Networks
- Paper: https://arxiv.org/abs/2001.05868
- Code: https://github.com/fxmeng/filter-grafting
- Paper explainer: https://www.zhihu.com/question/372070853/answer/1041569335
Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
Detecting Attended Visual Targets in Video
Deep Image Spatial Transformation for Person Image Generation
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
https://github.com/charlesCXK/3D-SketchAware-SSC
https://github.com/Anonymous20192020/Anonymous_CVPR5767
https://github.com/avirambh/ScopeFlow
https://github.com/csbhr/CDVD-TSP
https://github.com/ymcidence/TBH
https://github.com/yaoyao-liu/mnemonics
https://github.com/meder411/Tangent-Images
https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch
https://github.com/sjmoran/deep_local_parametric_filters
https://github.com/bermanmaxim/AOWS
https://github.com/dc3ea9f/look-into-object
FADNet: A Fast and Accurate Network for Disparity Estimation
- Paper: not yet released
- Code: https://github.com/HKBU-HPML/FADNet
https://github.com/rFID-submit/RandomFID (acceptance unconfirmed)
https://github.com/JackSyu/AE-MSR (acceptance unconfirmed)
https://github.com/fastconvnets/cvpr2020 (acceptance unconfirmed)
https://github.com/aimagelab/meshed-memory-transformer (acceptance unconfirmed)
https://github.com/TWSFar/CRGNet (acceptance unconfirmed)
https://github.com/CVPR-2020/CDARTS (acceptance unconfirmed)
https://github.com/anucvml/ddn-cvprw2020 (acceptance unconfirmed)
https://github.com/dl-model-recommend/model-trust (acceptance unconfirmed)
https://github.com/apratimbhattacharyya18/CVPR-2020-Corr-Prior (acceptance unconfirmed)
https://github.com/onetcvpr/O-Net (acceptance unconfirmed)
https://github.com/502463708/Microcalcification_Detection (acceptance unconfirmed)
https://github.com/anonymous-for-review/cvpr-2020-deep-smoke-machine (acceptance unconfirmed)
https://github.com/anonymous-for-review/cvpr-2020-smoke-recognition-dataset (acceptance unconfirmed)
https://github.com/cvpr-nonrigid/dataset (acceptance unconfirmed)
https://github.com/theFool32/PPBA (acceptance unconfirmed)
https://github.com/Realtime-Action-Recognition/Realtime-Action-Recognition