Continual Optimistic Initialization for Value-Based Reinforcement Learning.
Sheelabhadra Dey, James Ault, Guni Sharon
Published in: AAMAS (2024)
Keyphrases
- reinforcement learning
- function approximation
- temporal difference learning
- multi agent
- state space
- Markov decision processes
- model free
- learning algorithm
- artificial intelligence
- learning process
- k-means
- initial conditions
- temporal difference
- control problems
- direct policy search
- real robot
- robotic control
- function approximators
- multi agent reinforcement learning
- fitted Q iteration
- real world
- initial state
- action selection
- learning problems
- transfer learning
- Monte Carlo
- optimal policy
- information retrieval