Gradient-free Online Learning in Games with Delayed Rewards.
Amélie Héliou, Panayotis Mertikopoulos, Zhengyuan Zhou
Published in: CoRR (2020)
Keyphrases
- online learning
- online course
- e-learning
- reinforcement learning
- computer mediated
- distance education
- video games
- nash equilibria
- blended learning
- markov decision processes
- game theoretic
- higher education
- computer games
- game design
- distance learning
- educational games
- online learning environments
- gradient method
- multi-armed bandit
- perfect information
- game development
- gradient direction
- gradient information
- game players
- bandit problems
- data sets
- reward function
- game playing
- game play
- nash equilibrium
- language learning
- game theory
- monte carlo
- edge detection
- multi-agent systems
- multiscale
- machine learning