Shijie Cao

Cited by

	All	Since 2019
Citations	479	479
h-index	6	6
i10-index	6	6

Citations per year

Year	2019	2020	2021	2022	2023	2024
Citations	23	65	104	109	115	61

Public access

3 articles available
0 articles not available

Based on funding mandates

Coauthors

Lingxiao Ma, Senior Researcher, Microsoft Research. Verified email at pku.edu.cn
Chen Zhang, Shanghai Jiao Tong University. Verified email at sjtu.edu.cn
Wencong Xiao, Alibaba Group. Verified email at alibaba-inc.com
Lintao Zhang, Microsoft Research Asia. Verified email at microsoft.com
Zhuliang Yao, Tsinghua University. Verified email at mails.tsinghua.edu.cn
Fan Yang, Microsoft Research. Verified email at microsoft.com
Ranggi Hwang, KAIST. Verified email at kaist.ac.kr
Derek Chiou, Professor, ECE, UT Austin and Partner Architect, Microsoft Azure. Verified email at ece.utexas.edu
Xu Ningyi, Microsoft Research

Shijie Cao

Microsoft Research Asia

Verified email at microsoft.com

Efficient Deep Learning, Deep Learning System, Computer Architecture


Title	Cited by	Year
Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity. S Cao, C Zhang, Z Yao, W Xiao, L Nie, D Zhan, Y Liu, M Wu, L Zhang. Proceedings of the 2019 ACM/SIGDA International Symposium on Field …, 2019	196	2019
Balanced sparsity for efficient DNN inference on GPU. Z Yao, S Cao, W Xiao, C Zhang, L Nie. Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 5676-5683, 2019	120	2019
SeerNet: Predicting convolutional neural network feature-map sparsity through low-bit quantization. S Cao, L Ma, W Xiao, C Zhang, Y Liu, L Zhang, L Nie, Z Yang. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019	83	2019
Dense-to-sparse gate for mixture-of-experts. X Nie, S Cao, X Miao, L Ma, J Xue, Y Miao, Z Yang, Z Yang, CUI Bin	22	2021
EvoMoE: An evolutional mixture-of-experts training framework via dense-to-sparse gate. X Nie, X Miao, S Cao, L Ma, Q Liu, J Xue, Y Miao, Y Liu, Z Yang, B Cui. arXiv preprint arXiv:2112.14397, 2021	21	2021
Integer or floating point? New outlooks for low-bit quantization on large language models. Y Zhang, L Zhao, S Cao, W Wang, T Cao, F Yang, M Yang, S Zhang, N Xu. arXiv preprint arXiv:2305.12356, 2023	12	2023
BitDistiller: Unleashing the potential of sub-4-bit LLMs via self-distillation. D Du, Y Zhang, S Cao, J Guo, T Cao, X Chu, N Xu. arXiv preprint arXiv:2402.10631, 2024	5	2024
Efficient GPU kernels for N:M-sparse weights in deep learning. B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ... Proceedings of Machine Learning and Systems 5, 513-525, 2023	5	2023
Pre-gated MoE: An algorithm-system co-design for fast and scalable mixture-of-expert inference. R Hwang, J Wei, S Cao, C Hwang, X Tang, T Cao, M Yang, M Rhu. arXiv preprint arXiv:2308.12066, 2023	4	2023
AFPQ: Asymmetric floating point quantization for LLMs. Y Zhang, S Zhang, S Cao, D Du, J Wei, T Cao, N Xu. arXiv preprint arXiv:2311.01792, 2023	3	2023
NN-Stretch: Automatic neural network branching for parallel inference on heterogeneous multi-processors. J Wei, T Cao, S Cao, S Jiang, S Fu, M Yang, Y Zhang, Y Liu. Proceedings of the 21st Annual International Conference on Mobile Systems …, 2023	3	2023
Accurate and structured pruning for efficient automatic speech recognition. H Jiang, LL Zhang, Y Li, Y Wu, S Cao, T Cao, Y Yang, J Li, M Yang, L Qiu. arXiv preprint arXiv:2305.19549, 2023	3	2023
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation. L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi, N Zheng, Z Miao, F Yang, ... 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024	1	2024
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training. Y Zhang, Y Han, S Cao, G Dai, Y Miao, T Cao, F Yang, N Xu. arXiv preprint arXiv:2305.19982, 2023	1	2023
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge. J Wei, S Cao, T Cao, L Ma, L Wang, Y Zhang, M Yang. arXiv preprint arXiv:2407.00088, 2024		2024
FlexSaaS: A Reconfigurable Accelerator for Web Search Selection. S Cao, L Nie, D Zhan, W Wang, N Xu, R Das, M Wu, L Zhang, D Chiou. ACM Transactions on Reconfigurable Technology and Systems (TRETS) 12 (1), 1-20, 2019		2019
The Case for Learning Machine Language. G Liu, CJM Liang, S Cao, S Lu, L van Doorn
