On the Sample Complexity of Reinforcement Learning with Policy Space Generalization.
Wenlong Mou, Zheng Wen, Xi Chen
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- action space
- optimal policy
- Markov decision process
- policy search
- state space
- search space
- approximate dynamic programming
- Markov decision processes
- neural network
- actor critic
- parameter space
- action selection
- machine learning
- state action
- reinforcement learning algorithms
- optimal control
- function approximation
- temporal difference
- infinite horizon
- policy makers
- policy iteration
- dynamical systems
- control policies
- continuous state
- data points
- high dimensional
- continuous state spaces
- learning algorithm
- state and action spaces
- partially observable environments