Swin transformer: Hierarchical vision transformer using shifted windows Z Liu, Y Lin, Y Cao, H Hu, Y Wei, Z Zhang, S Lin, B Guo Proceedings of the IEEE/CVF international conference on computer vision …, 2021 | 12238 | 2021 |
Swin transformer v2: Scaling up capacity and resolution Z Liu, H Hu, Y Lin, Z Yao, Z Xie, Y Wei, J Ning, Y Cao, Z Zhang, L Dong, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022 | 850 | 2022 |
SimMIM: A simple framework for masked image modeling Z Xie, Z Zhang, Y Cao, Y Lin, J Bao, Z Yao, Q Dai, H Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 692 | 2022 |
Propagate yourself: Exploring pixel-level consistency for unsupervised visual representation learning Z Xie, Y Lin, Z Zhang, Y Cao, S Lin, H Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 354 | 2021 |
Negative margin matters: Understanding margin in few-shot classification B Liu, Y Cao, Y Lin, Q Li, Z Zhang, M Long, H Hu Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 282 | 2020 |
Swin transformer: Hierarchical vision transformer using shifted windows Z Liu, Y Lin, Y Cao, H Hu, Y Wei, Z Zhang, S Lin, B Guo arXiv preprint arXiv:2103.14030, 2021 | 176 | 2021 |
Self-supervised learning with swin transformers Z Xie, Y Lin, Z Yao, Z Zhang, Q Dai, Y Cao, H Hu arXiv preprint arXiv:2105.04553, 2021 | 134 | 2021 |
A simple baseline for zero-shot semantic segmentation with pre-trained vision-language model M Xu, Z Zhang, F Wei, Y Lin, Y Cao, H Hu, X Bai arXiv preprint arXiv:2112.14757, 2021 | 75 | 2021 |
A simple baseline for open-vocabulary semantic segmentation with pre-trained vision-language model M Xu, Z Zhang, F Wei, Y Lin, Y Cao, H Hu, X Bai European Conference on Computer Vision, 736-753, 2022 | 73 | 2022 |
Parametric instance classification for unsupervised visual feature learning Y Cao, Z Xie, B Liu, Y Lin, Z Zhang, H Hu Advances in neural information processing systems 33, 15614-15624, 2020 | 54 | 2020 |
Leveraging batch normalization for vision transformers Z Yao, Y Cao, Y Lin, Z Liu, Z Zhang, H Hu Proceedings of the IEEE/CVF International Conference on Computer Vision, 413-422, 2021 | 27 | 2021 |
On data scaling in masked image modeling Z Xie, Z Zhang, Y Cao, Y Lin, Y Wei, Q Dai, H Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 17 | 2023 |
Bootstrap your object detector via mixed training M Xu, Z Zhang, F Wei, Y Lin, Y Cao, S Lin, H Hu, X Bai Advances in Neural Information Processing Systems 34, 11315-11325, 2021 | 6 | 2021 |
Could Giant Pre-trained Image Models Extract Universal Representations? Y Lin, Z Liu, Z Zhang, H Hu, N Zheng, S Lin, Y Cao Advances in Neural Information Processing Systems 35, 8332-8346, 2022 | 5 | 2022 |
V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection Y Shen, Z Geng, Y Yuan, Y Lin, Z Liu, C Wang, H Hu, N Zheng, B Guo arXiv preprint arXiv:2308.04409, 2023 | 3 | 2023 |
DETR does not need multi-scale or locality design Y Lin, Y Yuan, Z Zhang, C Li, N Zheng, H Hu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 2 | 2023 |
A Simple Approach and Benchmark for 21,000-Category Object Detection Y Lin, C Li, Y Cao, Z Zhang, J Wang, L Wang, Z Liu, H Hu European Conference on Computer Vision, 1-18, 2022 | | 2022 |
Supplementary Materials for DETR Does Not Need Multi-Scale or Locality Design Y Lin, Y Yuan, Z Zhang, C Li, N Zheng, H Hu | | |
Supplementary Materials for SimMIM: A Simple Framework for Masked Image Modeling Z Xie, Z Zhang, Y Cao, Y Lin, J Bao, Z Yao, Q Dai, H Hu | | |