Polynomial Time Reinforcement Learning in Factored State MDPs with Linear Value Functions.
Zihao Deng, Siddartha Devic, Brendan Juba
Published in: AISTATS (2022)
Keyphrases
- state space
- reinforcement learning
- Markov decision processes
- factored Markov decision processes
- optimal policy
- action space
- Markov decision process
- state variables
- special case
- function approximation
- dynamic programming
- basis functions
- reinforcement learning algorithms
- planning problems
- continuous state spaces
- Markov chain
- dynamical systems
- partially observable
- initial state
- model-free
- policy iteration
- state abstraction
- deterministic domains
- action selection
- learning agent
- linear programming
- state transitions
- control policy
- state action
- finite state
- policy search
- model-based reinforcement learning