Yang Wang
Title
Cited by
Year
Dual-side sparse tensor core
Y Wang, C Zhang, Z Xie, C Guo, Y Liu, J Leng
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021
Cited by: 91; Year: 2021
LadaBERT: Lightweight adaptation of BERT through hybrid model compression
Y Mao, Y Wang, C Wu, C Zhang, Y Wang, Y Yang, Q Zhang, Y Tong, J Bai
Proceedings of the 28th International Conference on Computational …, 2020
Cited by: 71; Year: 2020
Towards efficient vision transformer inference: A first study of transformers on mobile devices
X Wang, LL Zhang, Y Wang, M Yang
Proceedings of the 23rd annual international workshop on mobile computing …, 2022
Cited by: 52; Year: 2022
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute
N Zheng, B Lin, Q Zhang, L Ma, Y Yang, F Yang, Y Wang, M Yang, L Zhou
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by: 45; Year: 2022
MOSC: A method to assign the outsourcing of service function chain across multiple clouds
H Chen, X Wang, Y Zhao, T Song, Y Wang, S Xu, L Li
Computer Networks 133, 166-182, 2018
Cited by: 42; Year: 2018
Adaptive page migration policy with huge pages in tiered memory systems
T Heo, Y Wang, W Cui, J Huh, L Zhang
IEEE Transactions on Computers 71 (1), 53-68, 2020
Cited by: 25; Year: 2020
Towards optimal outsourcing of service function chain across multiple clouds
H Chen, S Xu, X Wang, Y Zhao, K Li, Y Wang, W Wang
2016 IEEE International Conference on Communications (ICC), 1-7, 2016
Cited by: 21; Year: 2016
LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup
X Tang, Y Wang, T Cao, LL Zhang, Q Chen, D Cai, Y Liu, M Yang
MobiCom '23: Proceedings of the 29th Annual International Conference on …, 2023
Cited by: 18*; Year: 2023
Romou: Rapidly Generate High-Performance Tensor Kernels for Mobile GPUs
R Liang, T Cao, J Wen, M Wang, Y Wang, J Zou, Y Liu
MobiCom '22: Proceedings of the 28th Annual International Conference on …, 2022
Cited by: 15; Year: 2022
FlexMon: A flexible and fine-grained traffic monitor for programmable networks
Y Wang, X Wang, S Xu, C He, Y Zhang, J Ren, S Yu
Journal of Network and Computer Applications 201, 103344, 2022
Cited by: 8; Year: 2022
PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
C Li, Z Zhou, Y Wang, F Yang, T Cao, M Yang, Y Liang, G Sun
International Conference on Architectural Support for Programming Languages …, 2024
Cited by: 4; Year: 2024
Low complexity hierarchical scheduling for diverse datacenter jobs
C You, Y Wang, S Xu, L Luo, MH Chen
IEEE Communications Letters 23 (1), 48-51, 2018
Cited by: 3; Year: 2018
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
Y Liu, J Wen, Y Wang, S Ye, LL Zhang, T Cao, C Li, M Yang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024
Cited by: 2; Year: 2024
NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering
Z Zhou, Y Chen, T Zhang, Y Wang, R Shu, S Xu, P Cheng, L Qu, Y Xiong, ...
arXiv preprint arXiv:2403.18702, 2024
Cited by: 2*; Year: 2024
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
Y Chen, AF AbouElhamayed, X Dai, Y Wang, M Andronic, ...
arXiv preprint arXiv:2411.11745, 2024
Cited by: 1; Year: 2024
DSTC: Dual-Side Sparsity Tensor Core for DNNs Acceleration on Modern GPU Architectures
C Zhang, Y Wang, Z Xie, C Guo, Y Liu, J Leng, G Sun, Z Ji, R Wang, Y Xie, ...
IEEE Transactions on Computers, 2024
Cited by: 1; Year: 2024
NeuralMon: Graph neural network for flow measurement allocation
Y Wang, X Wang, Z Huang, C He, Y Zhang, S Xu
2021 IEEE Global Communications Conference (GLOBECOM), 1-6, 2021
Cited by: 1; Year: 2021
LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator
G Li, S Ye, C Chen, Y Wang, F Yang, T Cao, C Liu, MM Sabry, M Yang
IEEE International Symposium on High-Performance Computer Architecture, 2025
Year: 2025