Are Sixteen Heads Really Better than One? P Michel, O Levy, G Neubig NeurIPS 2019, 2019 | 941 | 2019 |
Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 545 | 2023 |
DyNet: The dynamic neural network toolkit G Neubig, C Dyer, Y Goldberg, A Matthews, W Ammar, A Anastasopoulos, ... arXiv preprint arXiv:1701.03980, 2017 | 443* | 2017 |
Weight Poisoning Attacks on Pre-trained Models K Kurita, P Michel, G Neubig ACL 2020, 2020 | 331 | 2020 |
MTNT: A Testbed for Machine Translation of Noisy Text P Michel, G Neubig EMNLP 2018, 2018 | 138 | 2018 |
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models P Michel, X Li, G Neubig, JM Pino NAACL 2019, 2019 | 137 | 2019 |
compare-mt: A Tool for Holistic Comparison of Language Generation Systems G Neubig, ZY Dou, J Hu, P Michel, D Pruthi, X Wang NAACL 2019 Demo, 2019 | 124 | 2019 |
Extreme Adaptation for Personalized Neural Machine Translation P Michel, G Neubig ACL 2018, 2018 | 108 | 2018 |
Findings of the first shared task on machine translation robustness X Li, P Michel, A Anastasopoulos, Y Belinkov, N Durrani, O Firat, P Koehn, ... WMT 2019, 2019 | 65 | 2019 |
Examining and Combating Spurious Features under Distribution Shift C Zhou, X Ma, P Michel, G Neubig ICML 2021, 2021 | 57 | 2021 |
Optimizing data usage via differentiable rewards X Wang, H Pham, P Michel, A Anastasopoulos, J Carbonell, G Neubig International Conference on Machine Learning, 9983-9995, 2020 | 54 | 2020 |
Gemma: Open models based on Gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 49 | 2024 |
Modeling the Second Player in Distributionally Robust Optimization P Michel, T Hashimoto, G Neubig ICLR 2021, 2021 | 30 | 2021 |
Findings of the WMT 2020 shared task on machine translation robustness L Specia, Z Li, J Pino, V Chaudhary, F Guzmán, G Neubig, N Durrani, ... Proceedings of the Fifth Conference on Machine Translation, 76-91, 2020 | 29 | 2020 |
Blind phoneme segmentation with temporal prediction errors P Michel, O Räsänen, R Thiolliere, E Dupoux ACL SRW 2017, 2016 | 27* | 2016 |
Should we be pre-training? An argument for end-task aware training as an alternative LM Dery, P Michel, A Talwalkar, G Neubig ICLR 2022, 2021 | 24 | 2021 |
Distributionally Robust Models with Parametric Likelihood Ratios P Michel, T Hashimoto, G Neubig ICLR 2022, 2022 | 18 | 2022 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 16 | 2024 |
Emergent communication: Generalization and overfitting in Lewis games M Rita, C Tallec, P Michel, JB Grill, O Pietquin, E Dupoux, F Strub Advances in Neural Information Processing Systems 35, 1389-1404, 2022 | 15 | 2022 |
Does the Geometry of Word Embeddings Help Document Classification? A Case Study on Persistent Homology Based Representations P Michel, A Ravichander, S Rijhwani Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017 | 13 | 2017 |