The Option Keyboard: Combining Skills in Reinforcement Learning.
André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup
Published in: NeurIPS (2019)
Keyphrases
- reinforcement learning
- function approximation
- dynamic programming
- multi-agent reinforcement learning
- user interface
- state space
- supervised learning
- policy search
- data sets
- temporal difference learning
- reinforcement learning algorithms
- combining multiple
- Markov decision processes
- optimal policy
- multi-agent
- learning algorithm
- genetic algorithm