Observational Overfitting in Reinforcement Learning.
Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur
Published in: ICLR (2020)
Keyphrases
- reinforcement learning
- function approximation
- Markov decision processes
- cross validation
- state space
- reinforcement learning algorithms
- decision trees
- multi-agent
- dynamic programming
- neural network
- temporal difference
- action selection
- model free
- optimal control
- supervised learning
- optimal policy
- transfer learning
- case study
- control problems
- robotic control
- learning process
- website
- decision making
- machine learning
- data sets
- action space
- learning agents
- policy search
- direct policy search