Online Learning in Weakly Coupled Markov Decision Processes: A Convergence Time Study.
Xiaohan Wei, Hao Yu, Michael J. Neely
Published in: SIGMETRICS (Abstracts) (2018)
Keyphrases
- Markov decision processes
- online learning
- state space
- reinforcement learning
- finite state
- reachability analysis
- dynamic programming
- optimal policy
- policy iteration
- transition matrices
- data mining
- planning under uncertainty
- average reward
- decision processes
- reinforcement learning algorithms
- infinite horizon
- search space
- learning algorithm
- machine learning