B^2RTDP: An Efficient Solution for Bounded-Parameter Markov Decision Process.
Fernando L. Fussuma, Karina Valdivia Delgado, Leliane Nunes de Barros
Published in: BRACIS (2014)
Keyphrases
- Markov decision processes
- Markov decision process
- initial state
- state space
- upper bound
- heuristic search
- dynamic programming
- optimal policy
- control theory
- reinforcement learning
- policy iteration
- finite state
- infinite horizon
- action space
- reinforcement learning methods
- Markov chain
- situation calculus
- reinforcement learning algorithms
- modal logic
- decision problems
- hidden Markov models
- optimal solution