Temporal difference learning is favored for rewards, but not punishments, in simulations and human behavior.
Adam Morris, Fiery Cushman
Published in: CogSci (2014)
Keyphrases
- human behavior
- temporal difference learning
- reinforcement learning
- function approximation
- fixed point
- reinforcement learning algorithms
- evaluation function
- temporal difference
- game playing
- Markov decision processes
- policy iteration
- Markov decision process
- human subjects
- daily life
- state space
- function approximators
- visual attention
- sufficient conditions
- reward function
- optimal policy
- model free
- radial basis function
- Monte Carlo
- image classification
- neural network