Stochastic matrix games with bandit feedback.
Brendan O'Donoghue, Tor Lattimore, Ian Osband
Published in: CoRR (2020)
Keyphrases
- computer games
- video games
- stochastic optimization
- user feedback
- monte carlo
- hopfield neural network
- singular value decomposition
- nash equilibria
- multi-armed bandit
- feedback mechanisms
- game-theoretic
- game theory
- game playing
- game design
- random sampling
- human computation
- covariance matrix
- stochastic model
- nash equilibrium
- visual feedback
- weighted majority
- reinforcement learning