Off-Policy Deep Reinforcement Learning Based on Steffensen Value Iteration.
Yuhu Cheng, Lin Chen, C. L. Philip Chen, Xuesong Wang
Published in: IEEE Trans. Cogn. Dev. Syst. (2021)
Keyphrases
- reinforcement learning
- Markov decision processes
- state space
- optimal policy
- policy iteration
- Markov decision process
- dynamic programming
- function approximation
- average reward
- partially observable Markov decision processes
- model free
- reinforcement learning algorithms
- heuristic search
- temporal difference
- partially observable
- action selection
- Markov decision chains
- finite state
- multi agent reinforcement learning
- supervised learning
- action space
- approximate dynamic programming
- infinite horizon
- optimal control
- transfer learning
- average cost
- learning algorithm
- transition model
- belief nets
- learning classifier systems
- data sets
- planning problems
- multi agent
- Bayesian networks
- machine learning