Minchen Yu
The Chinese University of Hong Kong, Shenzhen
Verified email at cuhk.edu.cn
Title
Cited by
Year
MArk: Exploiting cloud services for Cost-Effective, SLO-Aware machine learning inference serving
C Zhang, M Yu, W Wang, F Yan
2019 USENIX Annual Technical Conference (USENIX ATC 19), 1049-1062, 2019
Cited by 315, 2019
Following the data, not the function: Rethinking function orchestration in serverless computing
M Yu, T Cao, W Wang, R Chen
20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023
Cited by 67*, 2023
Gillis: Serving large neural networks in serverless functions with automatic model partitioning
M Yu, Z Jiang, HC Ng, W Wang, R Chen, B Li
2021 IEEE 41st International Conference on Distributed Computing Systems …, 2021
Cited by 63, 2021
Continuum: A platform for cost-aware, low-latency continual learning
H Tian, M Yu, W Wang
Proceedings of the ACM Symposium on Cloud Computing, 26-40, 2018
Cited by 40, 2018
Enabling cost-effective, SLO-aware machine learning inference serving on public cloud
C Zhang, M Yu, W Wang, F Yan
IEEE Transactions on Cloud Computing 10 (3), 1765-1779, 2020
Cited by 31, 2020
FaaSwap: SLO-Aware, GPU-Efficient Serverless Inference via Model Swapping
M Yu, A Wang, D Chen, H Yu, X Luo, Z Li, W Wang, R Chen, D Nie, ...
arXiv preprint arXiv:2306.03622, 2023
Cited by 10, 2023
CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference
S Li, H Lu, T Wu, M Yu, Q Weng, X Chen, Y Shan, B Yuan, W Wang
arXiv preprint arXiv:2401.11240, 2024
Cited by 6, 2024
CrystalPerf: Learning to Characterize the Performance of Dataflow Computation through Code Analysis
H Tian, M Yu, W Wang
2021 USENIX Annual Technical Conference (USENIX ATC 21), 253-267, 2021
Cited by 5, 2021
RepBun: Load-balanced, shuffle-free cluster caching for structured data
M Yu, Y Yu, Y Zheng, B Yang, W Wang
IEEE INFOCOM 2020-IEEE Conference on Computer Communications, 954-963, 2020
Cited by 5, 2020
Pheromone: Restructuring Serverless Computing With Data-Centric Function Orchestration
M Yu, T Cao, W Wang, R Chen
IEEE/ACM Transactions on Networking, 2024
2024
FaaSTube: Optimizing GPU-oriented Data Transfer for Serverless Computing
H Wu, J Deng, M Yu, Y Yu, Y Liu, H Fan, S Wu, W Wang
arXiv preprint arXiv:2411.01830, 2024
2024
Towards Usable, Efficient Serverless Computing Systems
M Yu
PQDT-Global, 2023
2023
Articles 1–12