H. Li, Y. Zhang, F. Koto, Y. Yang, H. Zhao, Y. Gong, N. Duan, T. Baldwin. "CMMLU: Measuring Massive Multitask Language Understanding in Chinese." arXiv preprint arXiv:2306.09212, 2023.
Y. Zhang, H. Li. "Can Large Language Models Comprehend Ancient Chinese? A Preliminary Test on ACLUE." Ancient Language Processing Workshop, 2023.
L. Li, G. Chen, Y. Su, Z. Chen, Y. Zhang, E. Xing, K. Zhang. "Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models." arXiv preprint arXiv:2402.12563, 2024.
L. Lin, H. Mu, Z. Zhai, M. Wang, Y. Wang, R. Wang, J. Gao, Y. Zhang, W. Che, ... "Against The Achilles' Heel: A Survey on Red Teaming for Generative Models." arXiv preprint arXiv:2404.00629, 2024.
R. Wang, H. Li, X. Han, Y. Zhang, T. Baldwin. "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents." arXiv preprint arXiv:2402.11651, 2024.