Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games.

Colin McMillen, Manuela M. Veloso

Published in: AAAI (2007)

Keyphrases

petri net
reinforcement learning
optimal strategy
Markov decision processes
long-term and short-term
imperfect information
multi-armed bandit
gray level
perfect information
bandit problems
web services
dynamic programming
neural network
discrete event
free riding
multi-armed bandits
real time
artificial intelligence
machine learning
opponent modeling