Routing in Reinforcement Learning Markov Chains.
Maximilian Moll, Dominic Weller
Published in: OR (2021)
Keyphrases
- markov chain
- reinforcement learning
- state space
- monte carlo
- steady state
- markov process
- finite state
- transition probabilities
- monte carlo method
- random walk
- monte carlo simulation
- stationary distribution
- markov decision processes
- model free
- temporal difference
- stochastic process
- function approximation
- probabilistic automata
- markov model
- markov processes
- optimal policy
- dynamic programming
- transition matrix
- markov decision process
- machine learning
- assemble to order systems
- single server
- learning algorithm