Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games.
Gabriele Farina, Tuomas Sandholm
Published in: CoRR (2021)
Keyphrases
- model free
- online learning
- sequential decision making problems
- reinforcement learning
- reinforcement learning algorithms
- function approximation
- decision theoretic planning
- e learning
- policy iteration
- state space
- markov decision processes
- temporal difference
- optimal policy
- learning algorithm
- partially observable markov decision processes
- nash equilibria
- game playing
- active learning
- learning process
- transfer learning
- optimal control
- dynamic programming
- partially observable
- markov decision process
- game theory
- average reward
- stochastic games
- markov decision problems
- dec pomdps
- multi agent