Exclusively Penalized Q-learning for Offline Reinforcement Learning.
Junghyuk Yeom, Yonghyeon Jo, Jungmo Kim, Sanghyeon Lee, Seungyul Han
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- least squares
- model free
- multi agent
- action selection
- temporal difference learning
- temporal difference
- state action space
- machine learning
- dynamic programming
- optimal policy
- learning algorithm
- multi agent reinforcement learning
- state action
- supervised learning
- control problems
- continuous state and action spaces
- maximum likelihood
- reinforcement learning methods
- relational reinforcement learning
- partially observable
- loss function
- markov decision processes
- continuous state spaces
- reinforcement learning problems
- eligibility traces
- function approximators
- optimal control
- rl algorithms
- td learning
- learning problems