Ying Sheng
Title
Cited by
Year
Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality
WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ...
See https://vicuna.lmsys.org (accessed 14 April 2023) 2 (3), 6, 2023
1082* 2023
Judging llm-as-a-judge with mt-bench and chatbot arena
L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ...
Advances in Neural Information Processing Systems 36, 2024
910* 2024
cvc5: A versatile and industrial-strength SMT solver
H Barbosa, C Barrett, M Brain, G Kremer, H Lachnitt, M Mann, ...
International Conference on Tools and Algorithms for the Construction and …, 2022
306 2022
Efficient memory management for large language model serving with pagedattention
W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023
249 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Re, ...
International Conference on Machine Learning, 2023
119 2023
H2o: Heavy-hitter oracle for efficient generative inference of large language models
Z Zhang, Y Sheng, T Zhou, T Chen, L Zheng, R Cai, Z Song, Y Tian, C Ré, ...
Advances in Neural Information Processing Systems 36, 2024
48 2024
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving
Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
48 2023
How Long Can Context Length of Open-Source LLMs truly Promise?
D Li, R Shao, A Xie, Y Sheng, L Zheng, J Gonzalez, I Stoica, X Ma, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
46* 2023
Subspace embedding and linear regression with orlicz norm
A Andoni, C Lin, Y Sheng, P Zhong, R Zhong
International Conference on Machine Learning, 224-233, 2018
36 2018
Distribution-free junta testing
X Chen, Z Liu, RA Servedio, Y Sheng, J Xie
STOC 2018, 2018
31* 2018
Lmsys-chat-1m: A large-scale real-world llm conversation dataset
L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ...
arXiv preprint arXiv:2309.11998, 2023
26 2023
S-lora: Serving thousands of concurrent lora adapters
Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ...
arXiv preprint arXiv:2311.03285, 2023
17 2023
Towards Optimal Caching and Model Selection for Large Model Inference
B Zhu, Y Sheng, L Zheng, C Barrett, M Jordan, J Jiao
Advances in Neural Information Processing Systems 36, 2024
9* 2024
Efficiently programming large language models using sglang
L Zheng, L Yin, Z Xie, J Huang, C Sun, CH Yu, S Cao, C Kozyrakis, ...
arXiv preprint arXiv:2312.07104, 2023
9 2023
Politeness for the theory of algebraic datatypes
Y Sheng, Y Zohar, C Ringeissen, J Lange, P Fontaine, C Barrett
International Joint Conference on Automated Reasoning, 238-255, 2020
9* 2020
On the approximation of Nash equilibria in sparse win-lose games
Z Liu, Y Sheng
Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018
9 2018
Politeness and stable infiniteness: Stronger together
Y Sheng, Y Zohar, C Ringeissen, A Reynolds, C Barrett, C Tinelli
Automated Deduction–CADE 28, 148, 2021
6* 2021
Clover: Closed-Loop Verifiable Code Generation
C Sun, Y Sheng, O Padon, C Barrett
arXiv preprint arXiv:2310.17807, 2023
5 2023
Reasoning about vectors using an SMT theory of sequences
Y Sheng, A Nötzli, A Reynolds, Y Zohar, D Dill, W Grieskamp, J Park, ...
International Joint Conference on Automated Reasoning, 125-143, 2022
5 2022
Fairness in serving large language models
Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo, JE Gonzalez, I Stoica
arXiv preprint arXiv:2401.00588, 2023
4 2023