Temporal Regularization for Markov Decision Process.
Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Published in: NeurIPS (2018)
Keyphrases
- markov decision process
- state space
- markov decision processes
- optimal policy
- reinforcement learning
- finite horizon
- infinite horizon
- transition matrices
- temporal difference learning
- initial state
- policy iteration
- temporal information
- prior information
- reward function
- transition probabilities
- finite state
- machine learning
- decision problems
- search space
- decision making
- search engine