Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and Stable Online Fine-Tuning.
Alex Beeson, Giovanni Montana
Published in: CoRR (2022)
Keyphrases
- fine tuning
- online learning
- learning algorithm
- reinforcement learning
- learning process
- real time
- active learning
- learning systems
- website
- online training
- connectionist networks
- learning activities
- eligibility traces
- fine tune
- learning agent
- inductive inference
- learning tasks
- supervised learning
- dynamic programming
- learning environment
- neural network