CUER: Corrected Uniform Experience Replay for Off-Policy Continuous Deep Reinforcement Learning Algorithms.
Arda Sarp Yenicesu, Furkan B. Mutlu, Suleyman S. Kozat, Ozgur S. Oguz
Published in: CoRR (2024)
Keyphrases
- reinforcement learning algorithms
- reinforcement learning
- markov decision processes
- state space
- model-free
- reinforcement learning problems
- eligibility traces
- action space
- reinforcement learning methods
- temporal difference
- partially observable environments
- function approximation
- learning algorithm
- reward function
- policy search
- neural network
- dynamic environments
- markov chain
- supervised learning
- semi-supervised
- function approximators
- search algorithm
- training data
- decision making