Revisiting Peng's Q(λ) for Modern Reinforcement Learning.
Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- optimal policy
- reinforcement learning methods
- learning algorithm
- data mining
- temporal difference learning
- model free
- optimal control
- dynamic programming
- learning process
- multi agent
- supervised learning
- markov decision processes
- website
- artificial intelligence
- control problems
- machine learning
- neural network
- multi agent reinforcement learning
- policy search
- relational reinforcement learning