Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling.
Alekh Agarwal, Tong Zhang
Published in: CoRR (2022)
Keyphrases
- action space
- reinforcement learning
- state space
- state and action spaces
- continuous state
- markov decision processes
- real valued
- continuous state spaces
- reinforcement learning methods
- sample size
- sufficient conditions
- state action
- reinforcement learning algorithms
- control policies
- stochastic processes
- function approximation
- markov decision process
- model free
- skill learning
- machine learning
- optimal policy
- probability distribution
- dynamic programming
- function approximators
- probabilistic model
- reinforcement learning problems
- computational complexity
- learning algorithm