Near-optimal Reinforcement Learning in Factored MDPs.
Ian Osband, Benjamin Van Roy
Published in: NIPS (2014)
Keyphrases
- factored mdps
- reinforcement learning
- markov decision processes
- state space
- approximate dynamic programming
- policy iteration
- markov decision problems
- reinforcement learning algorithms
- optimal policy
- algebraic decision diagrams
- context specific
- function approximation
- model free
- dynamic programming
- action space
- transition model
- finite state
- temporal difference
- basis functions
- infinite horizon
- partially observable
- linear program
- markov chain
- markov decision process
- decision processes
- optimal control
- planning under uncertainty
- initial state
- stochastic processes
- reward function
- function approximators
- control policy
- fixed point
- single agent