Regret Minimization for Partially Observable Deep Reinforcement Learning.
Peter H. Jin, Kurt Keutzer, Sergey Levine
Published in: ICML (2018)
Keyphrases
- partially observable
- reinforcement learning
- regret minimization
- state space
- decision problems
- game theoretic
- Markov decision processes
- partial observability
- Nash equilibrium
- partially observable domains
- function approximation
- partial observations
- dynamical systems
- hidden state
- Markov decision problems
- infinite horizon
- reward function
- multi agent
- partially observable environments
- model free
- multi agent learning
- game theory
- action models
- machine learning
- policy iteration
- reinforcement learning algorithms
- orders of magnitude
- learning algorithm
- single agent
- utility function
- heuristic search
- partially observable Markov decision process
- supervised learning
- temporal difference
- state variables