Online Reinforcement Learning for Real-Time Exploration in Continuous State and Action Markov Decision Processes.
Ludovic Hofer, Hugo Gimbert
Published in: CoRR (2016)
Keyphrases
- action space
- Markov decision processes
- continuous state
- reinforcement learning
- model based reinforcement learning
- action selection
- state and action spaces
- state space
- finite state
- state action
- optimal policy
- policy search
- control policies
- reinforcement learning algorithms
- dynamic programming
- continuous state spaces
- average reward
- Markov decision process
- finite horizon
- infinite horizon
- average cost
- partially observable
- function approximators
- discounted reward
- learning algorithm
- stochastic games
- partially observable Markov decision processes
- multi agent
- policy iteration
- mobile robot
- domain independent
- steady state