Learning Intrinsic Rewards as a Bi-Level Optimization Problem.
Bradly C. Stadie, Lunjun Zhang, Jimmy Ba
Published in: UAI (2020)
Keyphrases
- bi level
- reinforcement learning
- learning algorithm
- learning process
- active learning
- learning systems
- pricing model
- learning scheme
- mobile learning
- optimization algorithm
- online learning
- unsupervised learning
- markov decision processes
- intelligent tutoring systems
- optimization method
- learning tasks
- training data
- kernel machines
- genetic algorithm
- data sets