Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games.
Colin McMillen, Manuela M. Veloso
Published in: AAAI (2007)
Keyphrases
- petri net
- reinforcement learning
- optimal strategy
- markov decision processes
- long term and short term
- imperfect information
- multiarmed bandit
- gray level
- perfect information
- bandit problems
- web services
- dynamic programming
- neural network
- discrete event
- free riding
- multi armed bandits
- real time
- artificial intelligence
- machine learning
- opponent modeling