Composing Task-Agnostic Policies with Deep Reinforcement Learning.
Ahmed Hussain Qureshi, Jacob J. Johnson, Yuzhe Qin, Taylor Henderson, Byron Boots, Michael C. Yip
Published in: ICLR (2020)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- markov decision process
- control policies
- reward function
- markov decision processes
- state space
- control policy
- function approximation
- hierarchical reinforcement learning
- macro actions
- cooperative multi agent systems
- fitted q iteration
- reinforcement learning agents
- multi agent
- partially observable markov decision processes
- decision problems
- reinforcement learning algorithms
- temporal difference
- policy gradient methods
- total reward
- learning process
- approximate policy iteration
- infinite horizon
- markov decision problems
- robot control
- robotic control
- transfer learning
- dynamic programming
- multiagent reinforcement learning
- tabula rasa
- decentralized control
- continuous state
- temporal difference learning
- learning agents
- average reward
- function approximators
- deep learning
- policy iteration
- model free
- long run
- multiagent systems