Vijay Anand Korthikanti
Principal Research Scientist, Nvidia
Verified email at uiuc.edu
Title · Cited by · Year
Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Cited by 487 · 2022
Efficient large-scale language model training on gpu clusters using megatron-lm
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by 402 · 2021
Synthesizing geometry constructions
S Gulwani, VA Korthikanti, A Tiwari
ACM SIGPLAN Notices 46 (6), 50-61, 2011
Cited by 167 · 2011
Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, R Child, R Yazdani Aminabadi, J Bernauer, X Song, M Shoeybi, Y He, M Houston, S Tiwary, B Catanzaro
arXiv preprint arXiv:2201.11990, 2022
Cited by 119 · 2022
Reducing activation recomputation in large transformer models
VA Korthikanti, J Casper, S Lym, L McAfee, M Andersch, M Shoeybi, ...
Proceedings of Machine Learning and Systems 5, 2023
Cited by 89 · 2023
Towards optimizing energy costs of algorithms for shared memory architectures
VA Korthikanti, G Agha
Proceedings of the twenty-second annual ACM symposium on Parallelism in …, 2010
Cited by 75 · 2010
Analysis of parallel algorithms for energy conservation in scalable multicore architectures
VA Korthikanti, G Agha
2009 International Conference on Parallel Processing, 212-219, 2009
Cited by 56 · 2009
Reasoning about MDPs as transformers of probability distributions
VA Korthikanti, M Viswanathan, G Agha, YM Kwon
2010 Seventh International Conference on the Quantitative Evaluation of …, 2010
Cited by 50 · 2010
Model checking MDPs with a unique compact invariant set of distributions
R Chadha, VA Korthikanti, M Viswanathan, G Agha, YM Kwon
2011 Eighth International Conference on Quantitative Evaluation of SysTems …, 2011
Cited by 23 · 2011
Fair k mutual exclusion algorithm for peer to peer systems
VA Reddy, P Mittal, I Gupta
2008 The 28th International Conference on Distributed Computing Systems, 655-662, 2008
Cited by 20 · 2008
Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model. arXiv 2022
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Cited by 15 · 2022
Re-vilm: Retrieval-augmented visual language model for zero and few-shot image captioning
Z Yang, W Ping, Z Liu, V Korthikanti, W Nie, DA Huang, L Fan, Z Yu, S Lan, ...
arXiv preprint arXiv:2302.04858, 2023
Cited by 14 · 2023
Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model. arXiv
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
Preprint published online January 28, 2022
Cited by 13 · 2022
Energy-performance trade-off analysis of parallel algorithms
VA Korthikanti, G Agha
USENIX Workshop on Hot Topics in Parallelism (HotPar), 2010
Cited by 12 · 2010
On the energy complexity of parallel algorithms
VA Korthikanti, G Agha, M Greenstreet
2011 International Conference on Parallel Processing, 562-570, 2011
Cited by 11 · 2011
Avoiding energy wastage in parallel applications
VA Korthikanti, G Agha
International Conference on Green Computing, 149-163, 2010
Cited by 11 · 2010
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv 2201, 2022
Cited by 10 · 2022
An efficient algorithm to reduce test power consumption by scan cell and scan vector reordering
KVA Reddy, S Chattopadhyay
Proceedings of the IEEE INDICON 2004. First India Annual Conference, 2004 …, 2004
Cited by 10 · 2004
Energy bounded scalability analysis of parallel algorithms
VA Korthikanti, GA Agha
Cited by 9 · 2009
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A large-scale generative language model (arXiv: 2201.11990). arXiv
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
Cited by 7 · 2022