Johannes von Oswald

Cited by

	All	Since 2019
Citations	1027	1027
h-index	12	12
i10-index	12	12

Citations per year

Year	2019	2020	2021	2022	2023	2024
Citations	4	31	115	177	382	315

Public access

8 articles available
0 articles not available

Based on funding mandates

Co-authors

João Sacramento	Google	Verified email at joaosacramento.com
Seijin Kobayashi	ETHZ	Verified email at ethz.ch
Christian Henning	EthonAI AG	Verified email at ethon.ai
Nicolas Zucchet	PhD student, ETH Zurich	Verified email at ethz.ch
Max Vladymyrov	Google DeepMind	Verified email at google.com
Simon Schug	ETH Zurich	Verified email at ethz.ch
Dominic Zhao	ETH Zurich	Verified email at student.ethz.ch
Eyvind Niklasson	Cornell University	Verified email at cornell.edu
Alexander Meulemans	Google	Verified email at google.com
Alexander Mordvintsev	Google	Verified email at google.com
Andrey Zhmoginov	Google DeepMind	Verified email at google.com
Ettore Randazzo	Google	Verified email at google.com
Angelika Steger	ETH Zurich	Verified email at inf.ethz.ch
Jean-Pascal Pfister	Professor of Theoretical Neuroscience, Institute of Neuroinformatics, University of Zurich and ETH	Verified email at ini.uzh.ch
Nino Scherrer	Google	Verified email at google.com
Laurence Aitchison	University of Bristol	Verified email at bristol.ac.uk
Francesco D'Angelo	EPFL	Verified email at epfl.ch
Blaise Aguera y Arcas	VP Engineering Fellow, Google Research	Verified email at google.com

Johannes von Oswald

Research Scientist, Google Research

Verified email at google.com

Deep Learning


Title	Authors	Venue	Cited by	Year
Continual learning with hypernetworks	J von Oswald, C Henning, BF Grewe, J Sacramento	International Conference on Learning Representations (ICLR 2020), 2019	379	2019
Transformers learn in-context by gradient descent	J von Oswald, E Niklasson, E Randazzo, J Sacramento, A Mordvintsev, ...	International Conference on Machine Learning, 35151-35174, 2023	288	2023
Learning where to learn: Gradient sparsity in meta and continual learning	J von Oswald, D Zhao, S Kobayashi, S Schug, M Caccia, N Zucchet, ...	Advances in Neural Information Processing Systems 34, 5250-5263, 2021	54	2021
Posterior meta-replay for continual learning	C Henning, M Cervera, F D'Angelo, J von Oswald, R Traber, B Ehret, ...	Advances in Neural Information Processing Systems 34, 14135-14149, 2021	52	2021
Continual learning in recurrent neural networks	B Ehret, C Henning, MR Cervera, A Meulemans, J von Oswald, BF Grewe	arXiv preprint arXiv:2006.12109, 2020	47	2020
Meta-Learning via Hypernetworks	D Zhao, S Kobayashi, J Sacramento, J von Oswald	4th Workshop on Meta-Learning at NeurIPS 2020, Vancouver, Canada, 2020	46	2020
Neural networks with late-phase weights	J von Oswald, S Kobayashi, A Meulemans, C Henning, BF Grewe, ...	International Conference on Learning Representations (ICLR 2021), arXiv: 2007 …, 2020	33	2020
Uncovering mesa-optimization algorithms in transformers	J von Oswald, E Niklasson, M Schlegel, S Kobayashi, N Zucchet, ...	arXiv preprint arXiv:2309.05858, 2023	24	2023
Approximating the predictive distribution via adversarially-trained hypernetworks	C Henning, J von Oswald, J Sacramento, SC Surace, JP Pfister, ...	Yarin, 2018	24	2018
Random initialisations performing above chance and how to find them	F Benzing, S Schug, R Meier, J von Oswald, Y Akram, N Zucchet, ...	arXiv preprint arXiv:2209.07509, 2022	21	2022
A contrastive rule for meta-learning	N Zucchet, S Schug, J von Oswald, D Zhao, J Sacramento	Advances in Neural Information Processing Systems 35, 25921-25936, 2022	20	2022
The least-control principle for local learning at equilibrium	A Meulemans, N Zucchet, S Kobayashi, J von Oswald, J Sacramento	Advances in Neural Information Processing Systems 35, 33603-33617, 2022	19	2022
Gated recurrent neural networks discover attention	N Zucchet, S Kobayashi, Y Akram, J von Oswald, M Larcher, A Steger, ...	arXiv preprint arXiv:2309.01775, 2023	8	2023
On the reversed bias-variance tradeoff in deep ensembles	S Kobayashi, J von Oswald, BF Grewe	ICML, 2021	7	2021
Discovering modular solutions that generalize compositionally	S Schug, S Kobayashi, Y Akram, M Wołczyk, A Proca, J von Oswald, ...	arXiv preprint arXiv:2312.15001, 2023	4	2023
Linear Transformers are Versatile In-Context Learners	M Vladymyrov, J von Oswald, M Sandler, R Ge	arXiv preprint arXiv:2402.14180, 2024	1	2024
When can transformers compositionally generalize in-context?	S Kobayashi, S Schug, Y Akram, F Redhardt, J von Oswald, R Pascanu, ...	arXiv preprint arXiv:2407.12275, 2024		2024
State Soup: In-Context Skill Learning, Retrieval and Mixing	M Pióro, M Wołczyk, R Pascanu, J von Oswald, J Sacramento	arXiv preprint arXiv:2406.08423, 2024		2024
Interpretability of Learning Algorithms Encoded in Deep Neural Networks	J von Oswald	ETH Zurich, 2024		2024
A complementary systems theory of meta-learning	S Schug, N Zucchet, J von Oswald, J Sacramento	Cosyne 2023, 2023		2023
