Which Temporal Difference Learning Algorithm Best Reproduces Dopamine Activity in a Multi-choice Task?
Jean Bellot, Olivier Sigaud, Mehdi Khamassi
Published in: SAB (2012)
Keyphrases
- temporal difference
- reinforcement learning
- reinforcement learning algorithms
- learning algorithm
- function approximation
- td learning
- evaluation function
- supervised learning
- policy evaluation
- temporal difference learning
- monte carlo
- step size
- model free
- action selection
- policy iteration
- learning tasks
- learning process
- training data
- function approximators
- predictive state representations
- temporal difference methods
- actor critic
- machine learning
- active learning
- learning rate
- reinforcement learning problems
- dynamic programming
- support vector