Theoretically-Grounded Policy Advice from Multiple Teachers in Reinforcement Learning Settings with Applications to Negative Transfer.
Yusen Zhan, Haitham Bou-Ammar, Matthew E. Taylor
Published in: CoRR (2016)
Keyphrases
- reinforcement learning
- optimal policy
- learning process
- partially observable environments
- reinforcement learning problems
- Markov decision process
- high school students
- temporal difference
- action selection
- professional development
- high school
- positive and negative
- Markov decision processes
- model-free
- transfer learning
- control policy
- state space
- learning algorithm