Tactical Optimism and Pessimism for Deep Reinforcement Learning.
Ted Moskovitz, Jack Parker-Holder, Aldo Pacchiano, Michael Arbel, Michael I. Jordan
Published in: NeurIPS (2021)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- decision making
- optimal policy
- transfer learning
- command and control
- learning classifier systems
- state space
- database
- dynamic programming
- optimal control
- autonomous learning
- control problems
- multi-agent reinforcement learning
- temporal difference
- model-free
- learning algorithm
- learning problems
- policy search
- image sequences
- supervised learning
- multi-agent
- robotic control
- air combat
- direct policy search
- robot control
- partially observable
- Markov decision processes
- information systems
- genetic algorithm
- data mining