Adaptive Policy Learning for Offline-to-Online Reinforcement Learning.
Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- online learning
- learning process
- learning algorithm
- learning capabilities
- function approximation
- optimal policy
- function approximators
- learning problems
- learning systems
- supervised learning
- active learning
- policy search
- state space
- reinforcement learning methods
- action selection
- actor critic
- adaptive learning
- autonomous learning
- policy gradient
- learning agents
- average reward
- exploration exploitation tradeoff
- robot control
- partially observable
- temporal difference
- adaptive control
- real time
- Markov decision processes
- knowledge acquisition
- multi agent
- machine learning