Licheng Yu 虞立成

Cited by

	All	Since 2019
Citations	8129	7718
h-index	25	24
i10-index	35	34

Citations per year
Year	Citations
2015	21
2016	51
2017	105
2018	193
2019	352
2020	619
2021	1169
2022	1824
2023	2344
2024	1406

Public access

11 articles available, 0 articles not available (based on funding mandates)

Co-authors

Tamara L Berg	Associate Professor, Computer Science, UNC Chapel Hill	Verified email at cs.unc.edu
Mohit Bansal	Parker Distinguished Professor, Computer Science, UNC Chapel Hill	Verified email at cs.unc.edu
Zhe Gan	Research Scientist, Apple	Verified email at apple.com
Yu Cheng	The Chinese University of Hong Kong	Verified email at cse.cuhk.edu.hk
Yen-Chun Chen	Researcher, Microsoft	Verified email at microsoft.com
Jie Lei 雷杰	Research Scientist, Meta AI	Verified email at fb.com
Alexander C Berg	Professor of Computer Science, University of California Irvine	Verified email at uci.edu
Shan Yang	Sr. Applied Scientist, Amazon A9 | XGoogler	Verified email at amazon.com
Hongteng Xu 许洪腾	Associate Professor, Renmin University of China	Verified email at ruc.edu.cn
Hao Tan	Adobe Research	Verified email at adobe.com
Xiaohui Shen	ByteDance Research	Verified email at bytedance.com

Licheng Yu 虞立成

Research Scientist and Manager, Facebook AI

Verified email at fb.com

Computer Vision, Natural Language Processing


Title	Authors	Venue	Cited by	Year
UNITER: UNiversal Image-TExt Representation Learning	YC Chen, L Li, L Yu*, A El Kholy, F Ahmed, Z Gan, Y Cheng, J Liu	ECCV, 2020	2401*	2020
Modeling context in referring expressions	L Yu, P Poirson, S Yang, AC Berg, TL Berg	Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016	1083	2016
MAttNet: Modular attention network for referring expression comprehension	L Yu, Z Lin, X Shen, J Yang, X Lu, M Bansal, TL Berg	Proceedings of the IEEE conference on computer vision and pattern …, 2018	825	2018
TVQA: Localized, compositional video question answering	J Lei, L Yu, M Bansal, TL Berg	arXiv preprint arXiv:1809.01696, 2018	638	2018
HERO: Hierarchical encoder for video+language omni-representation pre-training	L Li, YC Chen, Y Cheng, Z Gan, L Yu, J Liu	arXiv preprint arXiv:2005.00200, 2020	499	2020
Learning to navigate unseen environments: Back translation with environmental dropout	H Tan, L Yu, M Bansal	arXiv preprint arXiv:1904.04195, 2019	312	2019
Visual Madlibs: Fill in the blank description generation and question answering	L Yu, E Park, AC Berg, TL Berg	Proceedings of the IEEE international conference on computer vision, 2461-2469, 2015	304*	2015
A joint speaker-listener-reinforcer model for referring expressions	L Yu, H Tan, M Bansal, TL Berg	Proceedings of the IEEE conference on computer vision and pattern …, 2017	294	2017
TVR: A large-scale dataset for video-subtitle moment retrieval	J Lei, L Yu, TL Berg, M Bansal	Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020	247	2020
TVQA+: Spatio-temporal grounding for video question answering	J Lei, L Yu, TL Berg, M Bansal	arXiv preprint arXiv:1904.11574, 2019	231	2019
Physics-inspired garment recovery from a single-view image	S Yang, Z Pan, T Amert, K Wang, L Yu, T Berg, MC Lin	ACM Transactions on Graphics (TOG) 37 (5), 1-14, 2018	152*	2018
Vector sparse representation of color image using quaternion matrix analysis	Y Xu, L Yu, H Xu, H Zhang, T Nguyen	IEEE Transactions on image processing 24 (4), 1315-1329, 2015	152	2015
Behind the scene: Revealing the secrets of pre-trained vision-and-language models	J Cao, Z Gan, Y Cheng, L Yu, YC Chen, J Liu	Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020	143	2020
VALUE: A multi-task benchmark for video-and-language understanding evaluation	L Li, J Lei, Z Gan, L Yu, YC Chen, R Pillai, Y Cheng, L Zhou, XE Wang, ...	arXiv preprint arXiv:2106.04632, 2021	103	2021
Multi-target embodied question answering	L Yu, X Chen, G Gkioxari, M Bansal, TL Berg, D Batra	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019	100	2019
Hierarchically-attentive RNN for album summarization and storytelling	L Yu, M Bansal, TL Berg	arXiv preprint arXiv:1708.02977, 2017	80	2017
Violin: A large-scale dataset for video-and-language inference	J Liu, W Chen, Y Cheng, Z Gan, L Yu, Y Yang, J Liu	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020	70	2020
What is more likely to happen next? Video-and-language future event prediction	J Lei, L Yu, TL Berg, M Bansal	arXiv preprint arXiv:2010.07999, 2020	60	2020
BachGAN: High-resolution image synthesis from salient object layout	Y Li, Y Cheng, Z Gan, L Yu, L Wang, J Liu	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020	49	2020
FashionViL: Fashion-focused vision-and-language representation learning	X Han, L Yu, X Zhu, L Zhang, YZ Song, T Xiang	European conference on computer vision, 634-651, 2022	37	2022
