The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 1317 | 2024 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 1177 | 2022 |
Compositionality decomposed: How do neural networks generalise? D Hupkes, V Dankers, M Mul, E Bruni Journal of Artificial Intelligence Research 67, 757-795, 2020 | 376* | 2020 |
Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure D Hupkes, S Veldhoen, W Zuidema Journal of Artificial Intelligence Research 61, 907-926, 2018 | 317 | 2018 |
Masked language modeling and the distributional hypothesis: Order word matters pre-training for little K Sinha, R Jia, D Hupkes, J Pineau, A Williams, D Kiela arXiv preprint arXiv:2104.06644, 2021 | 251 | 2021 |
The emergence of number and syntax units in LSTM language models Y Lakretz, G Kruszewski, T Desbordes, D Hupkes, S Dehaene, M Baroni arXiv preprint arXiv:1903.07435, 2019 | 203 | 2019 |
Under the hood: Using diagnostic classifiers to investigate and improve how language models track agreement information M Giulianelli arXiv preprint arXiv:1808.08079, 2018 | 194 | 2018 |
A taxonomy and review of generalization research in NLP D Hupkes, M Giulianelli, V Dankers, M Artetxe, Y Elazar, T Pimentel, ... Nature Machine Intelligence 5 (10), 1161-1174, 2023 | 112* | 2023 |
Mechanisms for handling nested dependencies in neural-network language models and humans Y Lakretz, D Hupkes, A Vergallito, M Marelli, M Baroni, S Dehaene Cognition 213, 104699, 2021 | 85 | 2021 |
The paradox of the compositionality of natural language: A neural machine translation case study V Dankers, E Bruni, D Hupkes arXiv preprint arXiv:2108.05885, 2021 | 76 | 2021 |
The Llama 3 herd of models, 2024 A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... URL https://arxiv.org/abs/2407.21783 | 75 | |
Do language models understand anything? On the ability of LSTMs to understand negative polarity items J Jumelet, D Hupkes arXiv preprint arXiv:1808.10627, 2018 | 67 | 2018 |
Diagnostic Classifiers Revealing how Neural Networks Process Hierarchical Structure. S Veldhoen, D Hupkes, WH Zuidema CoCo@NIPS, 69-77, 2016 | 55 | 2016 |
Analysing neural language models: Contextual decomposition reveals default reasoning in number and gender assignment J Jumelet, W Zuidema, D Hupkes arXiv preprint arXiv:1909.08975, 2019 | 47 | 2019 |
Co-evolution of language and agents in referential games G Dagan, D Hupkes, E Bruni arXiv preprint arXiv:2001.03361, 2020 | 43 | 2020 |
Location attention for extrapolation to longer sequences Y Dubois, G Dagan, D Hupkes, E Bruni arXiv preprint arXiv:1911.03872, 2019 | 38 | 2019 |
Learning compositionally through attentive guidance D Hupkes, A Singh, K Korrel, G Kruszewski, E Bruni arXiv preprint arXiv:1805.09657, 2018 | 33 | 2018 |
Transcoding compositionally: Using attention to find more generalizable solutions K Korrel, D Hupkes, V Dankers, E Bruni arXiv preprint arXiv:1906.01234, 2019 | 32 | 2019 |
How BPE affects memorization in transformers E Kharitonov, M Baroni, D Hupkes arXiv preprint arXiv:2110.02782, 2021 | 30 | 2021 |
Language models use monotonicity to assess NPI licensing J Jumelet, M Denić, J Szymanik, D Hupkes, S Steinert-Threlkeld arXiv preprint arXiv:2105.13818, 2021 | 30 | 2021 |