Model-Free Reinforcement Learning for Branching Markov Decision Processes.
Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak
Published in: CAV (2) (2021)
Keyphrases
- Markov decision processes
- model free reinforcement learning
- reinforcement learning
- policy gradient
- state space
- reinforcement learning algorithms
- optimal policy
- finite state
- average reward
- decision theoretic planning
- transition matrices
- partially observable Markov decision processes
- policy iteration
- reward function
- dynamic programming
- multi agent
- action space
- average cost
- partially observable
- function approximation
- Markov decision process
- temporal difference
- model free
- control problems
- search algorithm
- stochastic games
- infinite horizon
- real valued