Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling.

Alekh Agarwal, Tong Zhang

Published in: COLT (2022)

Keyphrases

action space
reinforcement learning
state space
Markov decision processes
continuous state
state and action spaces
real-valued
sample size
control policies
stochastic processes
reinforcement learning methods
function approximation
action selection
state action
sufficient conditions
reinforcement learning algorithms
dynamic programming
function approximators
skill learning
continuous state spaces
reinforcement learning problems
learning algorithm
Markov decision process
optimal policy
single agent
dynamical systems
computational complexity
decision making
machine learning