Deyao Zhu

Cited by

	All	Since 2019
Citations	1913	1913
h-index	9	9
i10-index	8	8

Citations per year: 2022: 10; 2023: 574; 2024: 1327

Co-authors

Mohamed Elhoseiny, Ph.D. - Associate Professor, KAUST (hiring postdocs & grad students) - Verified email at kaust.edu.sa
Jun Chen - KAUST - Verified email at kaust.edu.sa
Xiaoqian Shen - CS PhD @ KAUST - Verified email at kaust.edu.sa
Xiang Li - KAUST - Verified email at kaust.edu.sa
Li Erran Li - IEEE Fellow and ACM Fellow, AWS AI, Amazon - Verified email at cs.columbia.edu
Abduallah Mohamed - Applied Research Scientist, Meta Reality Labs - Verified email at fb.com
Mohamed Zahran - Udacity - Verified email at udacity.com


Deyao Zhu

Research Scientist, ByteDance

Verified email at bytedance.com

AGI, Vision Language Models, AI Agents


Title	Authors	Venue	Cited by	Year
MiniGPT-4: Enhancing vision-language understanding with advanced large language models	D Zhu, J Chen, X Shen, X Li, M Elhoseiny	International Conference on Learning Representations 2024, 2023	1409	2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning	J Chen, D Zhu, X Shen, X Li, Z Liu, P Zhang, R Krishnamoorthi, ...	2nd MMFM Workshop in CVPR2024, 2023	270	2023
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions	D Zhu, J Chen, K Haydarov, X Shen, W Zhang, M Elhoseiny	Transactions on Machine Learning Research (TMLR), 2023	73	2023
Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation	A Mohamed, D Zhu, W Vu, M Elhoseiny, C Claudel	European Conference on Computer Vision (ECCV) 2022, 2022	51	2022
Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions	J Chen, D Zhu, K Haydarov, X Li, M Elhoseiny	arXiv preprint arXiv:2304.04227, 2023	26	2023
Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only	J Chen, D Zhu, G Qian, B Ghanem, Z Yan, C Zhu, F Xiao, SC Culatana, ...	Proceedings of the IEEE/CVF International Conference on Computer Vision, 699-710, 2023	23*	2023
Motion forecasting with unlikelihood training in continuous space	D Zhu, M Zahran, LE Li, M Elhoseiny	Conference on Robot Learning, 1003-1012, 2022	15	2022
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition	J Chen, A Agarwal, S Abdelkarim, D Zhu, M Elhoseiny	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	15*	2022
MiniGPT4-Video: Advancing multimodal LLMs for video understanding with interleaved visual-textual tokens	K Ataallah, X Shen, E Abdelrahman, E Sleiman, D Zhu, J Ding, ...	2nd MMFM Workshop in CVPR2024, 2024	9	2024
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents	D Zhu, M Zahran, LE Li, M Elhoseiny	International Conference on Learning Representations, 2021, 2021	7	2021
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining	D Zhu, Y Wang, J Schmidhuber, M Elhoseiny	arXiv preprint arXiv:2301.12876, 2023	6	2023
Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning	D Zhu, LE Li, M Elhoseiny	International Conference on Learning Representations 2023, 2022	5	2022
Learning to disentangle latent physical factors for video prediction	D Zhu, M Munderloh, B Rosenhahn, J Stückler	Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Dortmund …, 2019	4	2019
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos	K Ataallah, X Shen, E Abdelrahman, E Sleiman, M Zhuge, J Ding, D Zhu, ...	European Conference on Computer Vision (ECCV) 2024, 2024		2024
MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis	A Alkhaldi, R Alnajim, L Alabdullatef, R Alyahya, J Chen, D Zhu, A Alsinan, ...	arXiv preprint arXiv:2407.04106, 2024		2024


Articles 1–15
