Variation-resistant Q-learning: Controlling and Utilizing Estimation Bias in Reinforcement Learning for Better Performance.
Andreas Pentaliotis, Marco A. Wiering
Published in: ICAART (2) (2021)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- model free
- state space
- action selection
- optimal policy
- multi agent
- stochastic approximation
- temporal difference learning
- multi agent reinforcement learning
- state action space
- dynamic programming
- control problems
- temporal difference
- relational reinforcement learning
- reinforcement learning methods
- reward function
- learning algorithm
- optimal control
- variance reduction
- continuous state and action spaces
- multi agent systems
- hierarchical reinforcement learning
- learning process
- learning problems
- radial basis function
- neural network
- state action
- function approximators
- learning agent
- policy iteration
- Markov decision process
- estimation algorithm
- partially observable