Ankit Gupta
IBM Research
Verified email at ibm.com
Title
Cited by
Year
Injecting numerical reasoning skills into language models
M Geva*, A Gupta*, J Berant
Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020
Cited by: 191; Year: 2020
Arithmetic circuits: A chasm at depth 3
A Gupta, P Kamath, N Kayal, R Saptharishi
SIAM Journal on Computing 45 (3), 1064-1079, 2016
Cited by: 182*; Year: 2016
Break It Down: A Question Understanding Benchmark
T Wolfson, M Geva, A Gupta, M Gardner, Y Goldberg, D Deutch, J Berant
Transactions of the Association for Computational Linguistics 8, 183-198, 2020
Cited by: 167; Year: 2020
Approaching the chasm at depth four
A Gupta, P Kamath, N Kayal, R Saptharishi
Journal of the ACM (JACM) 61 (6), 1-16, 2014
Cited by: 136; Year: 2014
On the parameterization and initialization of diagonal state space models
A Gu, A Gupta, K Goel, C Ré
Advances in Neural Information Processing Systems 35, 35971-35983, 2022
Cited by: 97; Year: 2022
Diagonal state spaces are as effective as structured state spaces
A Gupta, A Gu, J Berant
Advances in Neural Information Processing Systems 35, 22982-22994, 2022
Cited by: 90; Year: 2022
Long range language modeling via gated state spaces
H Mehta, A Gupta, A Cutkosky, B Neyshabur
The Eleventh International Conference on Learning Representations, 2023
Cited by: 75; Year: 2023
SCROLLS: Standardized comparison over long language sequences
U Shaham, E Segal, M Ivgi, A Efrat, O Yoran, A Haviv, A Gupta, W Xiong, ...
Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022
Cited by: 70; Year: 2022
Analyzing transformers in embedding space
G Dar, M Geva, A Gupta, J Berant
Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023
Cited by: 52; Year: 2023
GMAT: Global memory augmentation for transformers
A Gupta, J Berant
arXiv preprint arXiv:2006.03274, 2020
Cited by: 43; Year: 2020
Reconstruction of depth-4 multilinear circuits with top fan-in 2
A Gupta, N Kayal, S Lokam
Proceedings of the forty-fourth annual ACM symposium on Theory of computing …, 2012
Cited by: 29; Year: 2012
Algebraic geometric techniques for depth-4 PIT & Sylvester-Gallai conjectures for varieties
A Gupta
Electronic Colloquium on Computational Complexity (ECCC) 21 (130), 1, 2014
Cited by: 26; Year: 2014
Random arithmetic formulas can be reconstructed efficiently
A Gupta, N Kayal, Y Qiao
computational complexity 23, 207-303, 2014
Cited by: 21; Year: 2014
Memory-efficient Transformers via Top-k Attention
A Gupta, G Dar, S Goodman, D Ciprut, J Berant
Proceedings of the Second Workshop on Simple and Efficient Natural Language …, 2021
Cited by: 19; Year: 2021
Efficient reconstruction of random multilinear formulas
A Gupta, N Kayal, S Lokam
2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, 778-787, 2011
Cited by: 18; Year: 2011
Diagonal state space augmented transformers for speech recognition
G Saon, A Gupta, X Cui
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 13; Year: 2023
Simplifying and understanding state space models with diagonal linear RNNs
A Gupta, H Mehta, J Berant
arXiv preprint arXiv:2212.00768, 2022
Cited by: 12; Year: 2022
Value-aware Approximate Attention
A Gupta, J Berant
Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021
Cited by: 4; Year: 2021
Exploring the limits of decoder-only models trained on public speech recognition corpora
A Gupta, G Saon, B Kingsbury
arXiv preprint arXiv:2402.00235, 2024
Cited by: 1; Year: 2024
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
I Amos, J Berant, A Gupta
arXiv preprint arXiv:2310.02980, 2023
Cited by: 1; Year: 2023