Prospect-theoretic Q-learning.

Vivek S. Borkar, Siddharth Chandak

Published in: Syst. Control. Lett. (2021)

Keyphrases

reinforcement learning
function approximation
cooperative
multi agent
state space
learning algorithm
reinforcement learning algorithms
action selection
model free
stochastic approximation
multi agent reinforcement learning
optimal policy
temporal difference learning
learning rate
bucket brigade
temporal difference
dynamic programming
credit assignment
continuous state spaces
continuous state and action spaces
Markov decision processes
hierarchical reinforcement learning
potential field
multiagent learning
reinforcement learning methods
single agent