Ctrl-Z: Recovering from Instability in Reinforcement Learning.
Vibhavari Dasagi, Jake Bruce, Thierry Peynot, Jürgen Leitner
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- optimal policy
- multi agent
- learning algorithm
- direct policy search
- markov decision processes
- state space
- control problems
- robotic control
- model free
- temporal difference learning
- neural network
- evolutionary algorithm
- control system
- dynamic programming
- learning process
- multi agent systems
- video sequences
- knowledge base
- multi agent reinforcement learning