Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games.
Gabriele Farina, Tuomas Sandholm
Published in: AAAI (2021)
Keyphrases
- model free
- online learning
- sequential decision making problems
- reinforcement learning
- reinforcement learning algorithms
- function approximation
- temporal difference
- decision theoretic planning
- e learning
- average reward
- policy iteration
- game theory
- stochastic games
- multi agent
- markov decision problems
- nash equilibria
- partially observable markov decision processes
- learning algorithm
- game playing
- state space
- nash equilibrium
- active learning
- machine learning
- optimal policy
- sufficient conditions
- learning agent
- dynamic programming
- learning process