Budgeted Multi-Armed Bandit in Continuous Action Space.
Francesco Trovò, Stefano Paladino, Marcello Restelli, Nicola Gatti
Published in: ECAI (2016)
Keyphrases
- action space
- multi-armed bandit
- reinforcement learning
- state space
- Markov decision processes
- real valued
- continuous state spaces
- multi-armed bandits
- action selection
- stochastic processes
- continuous action
- multi class
- single agent
- learning algorithm
- Markov decision process
- pairwise
- least squares
- feature space
- machine learning