On the convergence of projective-simulation-based reinforcement learning in Markov decision processes.
Jens Clausen, Walter L. Boyajian, Lea M. Trenkwalder, Vedran Dunjko, Hans J. Briegel
Published in: CoRR (2019)
Keyphrases
- markov decision processes
- reinforcement learning
- optimal policy
- reinforcement learning algorithms
- state space
- stochastic shortest path
- finite state
- policy iteration
- dynamic programming
- model based reinforcement learning
- state and action spaces
- partially observable
- factored mdps
- infinite horizon
- planning under uncertainty
- transition matrices
- action space
- action sets
- stationary policies
- markov decision process
- machine learning
- average reward
- multi agent
- model free
- state abstraction
- function approximation
- convergence rate
- reachability analysis
- finite horizon
- control problems
- temporal difference
- convergence speed
- decision theoretic planning
- real time dynamic programming
- decision problems
- continuous state
- decision processes
- policy iteration algorithm
- learning algorithm