Reinforcement Learning for Bandits with Continuous Actions and Large Context Spaces.

Paul Duckworth, Katherine A. Vallis, Bruno Lacerda, Nick Hawes

Published in: ECAI (2023)

Keyphrases

reinforcement learning
action space
contextual information
action selection
Markov decision processes
context sensitive
perceptual aliasing
state and action spaces
discrete data
partially observable
context aware
partially observable domains
machine learning
partial observability
continuous data
stochastic processes
reinforcement learning algorithms
function approximation
multi-agent systems
multi-agent
objective function
learning algorithm