Strategically Conservative Q-Learning.
Yutaka Shimizu, Joey Hong, Sergey Levine, Masayoshi Tomizuka
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- function approximation
- cooperative
- learning algorithm
- state space
- multi agent
- stochastic approximation
- learning rate
- model free
- optimal policy
- reinforcement learning algorithms
- action selection
- potential field
- temporal difference learning
- stochastic shortest path
- multi agent reinforcement learning
- data sets
- reinforcement learning methods
- dynamical systems
- machine learning
- data mining