A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning.
Long Yang, Minhao Shi, Qian Zheng, Wenjia Meng, Gang Pan
Published in: CoRR (2018)
Keyphrases
- multi step
- temporal difference learning
- eligibility traces
- reinforcement learning
- reinforcement learning algorithms
- function approximation
- temporal difference
- model free
- reinforcement learning methods
- markov decision processes
- state space
- policy evaluation
- learning algorithm
- function approximators
- knn
- machine learning
- multi agent
- supervised learning
- optimal policy
- k nearest neighbor
- markov decision process
- policy iteration
- action space
- dynamic programming
- fixed point
- convergence speed
- action selection
- semi supervised
- partially observable