Look, listen, and attend: Co-attention network for self-supervised audio-visual representation learning Y Cheng, R Wang, Z Pan, R Feng, Y Zhang Proceedings of the 28th ACM International Conference on Multimedia, 3884-3892, 2020 | 122 | 2020 |
MM-Pyramid: Multimodal pyramid attentional network for audio-visual event localization and video parsing J Yu, Y Cheng, RW Zhao, R Feng, Y Zhang Proceedings of the 30th ACM International Conference on Multimedia, 6241-6249, 2022 | 54 | 2022 |
Modality-aware contrastive instance learning with self-distillation for weakly-supervised audio-visual violence detection J Yu, J Liu, Y Cheng, R Feng, Y Zhang Proceedings of the 30th ACM International Conference on Multimedia, 6278-6287, 2022 | 39 | 2022 |
MPN: Multimodal parallel network for audio-visual event localization J Yu, Y Cheng, R Feng 2021 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2021 | 24 | 2021 |
Keep it consistent: Topic-aware storytelling from an image stream via iterative multi-agent communication R Wang, Z Wei, Y Cheng, P Li, H Shan, J Zhang, Q Zhang, X Huang arXiv preprint arXiv:1911.04192, 2019 | 15 | 2019 |
IDEA: Increasing text diversity via online multi-label recognition for vision-language pre-training X Huang, Y Zhang, Y Cheng, W Tian, R Zhao, R Feng, Y Zhang, Y Li, ... Proceedings of the 30th ACM International Conference on Multimedia, 4573-4583, 2022 | 14 | 2022 |
Exploring logical reasoning for referring expression comprehension Y Cheng, R Wang, J Yu, RW Zhao, Y Zhang, R Feng Proceedings of the 29th ACM International Conference on Multimedia, 5047-5055, 2021 | 13 | 2021 |
Improving multimodal speech enhancement by incorporating self-supervised and curriculum learning Y Cheng, M He, J Yu, R Feng ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 6 | 2021 |
Learning Music-Dance Representations through Explicit-Implicit Rhythm Synchronization J Yu, J Pu, Y Cheng, R Feng, Y Shan IEEE Transactions on Multimedia, 2023 | 5 | 2023 |
Self-supervised learning of music-dance representation through explicit-implicit rhythm synchronization J Yu, J Pu, Y Cheng, R Feng, Y Shan arXiv preprint arXiv:2207.03190, 2022 | 5 | 2022 |
Self-Supervised Video Representation Learning with Motion-Contrastive Perception J Liu, Y Cheng, Y Zhang, RW Zhao, R Feng 2022 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2022 | 3 | 2022 |
FineMedLM-o1: Enhancing the Medical Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training H Yu, T Cheng, Y Cheng, R Feng arXiv preprint arXiv:2501.09213, 2025 | | 2025 |
CT2C-QA: Multimodal Question Answering over Chinese Text, Table and Chart B Zhao, T Cheng, Y Zhang, Y Cheng, R Feng, X Zhang Proceedings of the 32nd ACM International Conference on Multimedia, 3897-3906, 2024 | | 2024 |
DeepPointMap2: Accurate and Robust LiDAR-Visual SLAM with Neural Descriptors X Zhang, Z Ding, Q Jing, Y Cheng, W Ding, R Feng Proceedings of the 32nd ACM International Conference on Multimedia, 9475-9484, 2024 | | 2024 |
ADSNet: Cross-Domain LTV Prediction with an Adaptive Siamese Network in Advertising R Wang, H Xu, Y Cheng, Q He, X Zhou, R Feng, W Xu, L Huang, J Jiang Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024 | | 2024 |
Mixtures of Experts for Audio-Visual Learning Y Cheng, Y Li, J He, R Feng The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024 | | 2024 |