Is Long Horizon RL More Difficult Than Short Horizon RL?
Ruosong Wang, Simon S. Du, Lin F. Yang, Sham M. Kakade
Published in: NeurIPS (2020)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- state space
- learning process
- optimal policy
- error prone
- model free
- autonomous learning
- action space
- neural network
- transfer learning
- partially observable domains
- policy iteration
- dynamic programming
- learning algorithm
- machine learning
- data mining