Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning.
Gen Li, Wenhao Zhan, Jason D. Lee, Yuejie Chi, Yuxin Chen
Published in: NeurIPS (2023)
Keyphrases
- fine tuning
- reinforcement learning
- viable alternative
- fine tune
- function approximation
- fine tuned
- state space
- learning problems
- reinforcement learning methods
- reinforcement learning algorithms
- statistical analysis
- statistical information
- information theoretic
- learning algorithm
- reward function
- model free
- statistical models
- machine learning
- learning process
- dynamic programming
- markov decision processes
- supervised learning
- partially observable environments
- action selection
- markov chain
- data driven
- partially observable
- temporal difference
- learning agent
- action space
- function approximators
- state action
- optimal policy
- multi agent
- hybrid learning
- reward shaping
- optimal control