Some Supervision Required: Incorporating Oracle Policies in Reinforcement Learning via Epistemic Uncertainty Metrics.
Jun Jet Tai, Jordan K. Terry, Mauro Sebastián Innocente, James Brusey, Nadjim Horri
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- optimal policy
- Markov decision process
- control policies
- policy search
- learning process
- partial observability
- state space
- incomplete information
- machine learning
- fitted Q iteration
- Markov decision problems
- control policy
- reward function
- active learning
- database
- model free
- action space
- function approximation
- analytical models
- transfer learning
- dynamic programming
- database systems
- hierarchical reinforcement learning
- reinforcement learning agents