DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies.
Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- general purpose
- optimal policy
- markov decision process
- policy search
- state space
- function approximation
- control policies
- reinforcement learning algorithms
- markov decision processes
- partially observable markov decision processes
- special purpose
- control policy
- markov decision problems
- model free
- reward function
- total reward
- semi markov decision process
- probability distribution
- dynamic programming
- partially observable
- multiagent reinforcement learning
- control problems
- autonomous learning
- hierarchical reinforcement learning
- reinforcement learning methods
- reinforcement learning agents
- fitted q iteration
- decision problems
- machine learning
- learning algorithm
- temporal difference
- infinite horizon
- rl algorithms
- actor critic
- optimal control
- partially observable domains
- approximate policy iteration
- multi agent
- learning process
- supervised learning
- transfer learning
- action space
- policy iteration
- multi agent reinforcement learning
- average cost
- finite state
- long run
- state action
- stochastic games
- average reward
- function approximators