Hitting time for Markov decision process.
Ruichao Jiang, Javad Tavakoli, Yiqiang Zhao
Published in: CoRR (2022)
Keyphrases
- markov decision process
- state space
- markov chain
- optimal policy
- markov decision processes
- reinforcement learning
- infinite horizon
- transition matrices
- temporal difference learning
- finite horizon
- transition probabilities
- initial state
- finite state
- partial observability
- dynamic programming
- policy iteration
- machine learning
- stationary policies
- data mining
- reward function
- reinforcement learning algorithms
- decision making
- search engine