Keyphrases
- regret bounds
- worst case
- reinforcement learning
- reward function
- cooperative
- state space
- online learning
- function approximation
- expert advice
- lower bound
- multi agent
- learning algorithm
- reinforcement learning algorithms
- model free
- stochastic approximation
- linear regression
- multi agent reinforcement learning
- learning rate
- action selection
- upper bound
- optimal policy
- potential field
- binary classification
- bucket brigade
- loss function
- bandit problems
- multi armed bandit problems
- game theory
- confidence bounds
- regret minimization
- multi armed bandit
- td learning
- minimax regret
- multi class
- single agent
- markov decision processes
- multi agent systems
- temporal difference learning
- learning agent
- machine learning