Strategically Conservative Q-Learning

Yutaka Shimizu, Joey Hong, Sergey Levine, Masayoshi Tomizuka

Published in: CoRR (2024)

Keyphrases

reinforcement learning
function approximation
cooperative
learning algorithm
state space
multi-agent
stochastic approximation
learning rate
model-free
optimal policy
reinforcement learning algorithms
action selection
potential field
temporal difference learning
stochastic shortest path
multi-agent reinforcement learning
data sets
reinforcement learning methods
dynamical systems
machine learning
data mining