Rethinking attention with performers K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ... arXiv preprint arXiv:2009.14794, 2020 | 1292 | 2020 |
Model-based reinforcement learning for atari L Kaiser, M Babaeizadeh, P Milos, B Osinski, RH Campbell, ... arXiv preprint arXiv:1903.00374, 2019 | 878 | 2019 |
Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 511 | 2023 |
Scaling up models and data with t5x and seqio A Roberts, HW Chung, G Mishra, A Levskaya, J Bradbury, D Andor, ... Journal of Machine Learning Research 24 (377), 1-8, 2023 | 114 | 2023 |
Sparse is enough in scaling transformers S Jaszczur, A Chowdhery, A Mohiuddin, L Kaiser, W Gajewski, ... Advances in Neural Information Processing Systems 34, 9895-9907, 2021 | 69 | 2021 |
Answer to question neural networks N Lao, LM Kaiser, N Gupta, A Mohiuddin, P Popat US Patent 11,093,813, 2021 | 23 | 2021 |
Model-based reinforcement learning for atari (2019) L Kaiser, M Babaeizadeh, P Milos, B Osinski, RH Campbell, ... arXiv preprint arXiv:1903.00374, 2019 | 15 | 2019 |
Generating elements of answer-seeking queries and elements of answers Y Liu, P Popat, N Gupta, A Mohiuddin US Patent 10,592,540, 2020 | 11 | 2020 |
Deciphering clinical abbreviations with a privacy protecting machine learning system A Rajkomar, E Loreaux, Y Liu, J Kemp, B Li, MJ Chen, Y Zhang, ... Nature Communications 13 (1), 7456, 2022 | 9 | 2022 |
Q-value weighted regression: Reinforcement learning with limited data P Kozakowski, L Kaiser, H Michalewski, A Mohiuddin, K Kańska 2022 International Joint Conference on Neural Networks (IJCNN), 1-8, 2022 | 4 | 2022 |
Reformer: the efficient transformer N Kitaev, Ł Kaiser, A Levskaya arXiv preprint arXiv:2001.04451, 2020 | 3* | 2020 |
Model-based reinforcement learning for atari B Osinski, C Finn, D Erhan, G Tucker, H Michalewski, K Czechowski, ... ICLR 1, 2, 2020 | 1 | 2020 |
Sparse attention neural networks A Chowdhery, A Mohiuddin, H Michalewski, JM Kanerva, LM Kaiser, ... US Patent App. 17/666,400, 2022 | | 2022 |
Forecasting Deep Learning Dynamics with Applications to Hyperparameter Tuning P Kozakowski, Ł Kaiser, A Mohiuddin | | 2019 |