Kefan Dong
Q-learning with UCB exploration is sample efficient for infinite-horizon MDP
K Dong, Y Wang, X Chen, L Wang
arXiv preprint arXiv:1901.09311, 2019
Exploration via hindsight goal generation
Z Ren, K Dong, Y Zhou, Q Liu, J Peng
arXiv preprint arXiv:1906.04279, 2019
Root-n-regret for learning in Markov decision processes with function approximation and low Bellman rank
K Dong, J Peng, Y Wang, Y Zhou
Conference on Learning Theory, 1554-1557, 2020
On the expressivity of neural networks for deep reinforcement learning
K Dong, Y Luo, T Yu, C Finn, T Ma
International Conference on Machine Learning, 2627-2637, 2020
Multinomial logit bandit with low switching cost
K Dong, Y Li, Q Zhang, Y Zhou
International Conference on Machine Learning, 2607-2615, 2020
Provable model-based nonlinear bandit and reinforcement learning: Shelve optimism, embrace virtual curvature
K Dong, J Yang, T Ma
arXiv preprint arXiv:2102.04168, 2021
Design of Experiments for Stochastic Contextual Linear Bandits
A Zanette, K Dong, J Lee, E Brunskill
arXiv preprint arXiv:2107.09912, 2021
Model-based Offline Reinforcement Learning with Local Misspecification
K Dong, R Keramati, E Brunskill
Refined Analysis of FPL for Adversarial Markov Decision Processes
Y Wang, K Dong
arXiv preprint arXiv:2008.09251, 2020