Reinforcement Learning for Bandits with Continuous Actions and Large Context Spaces.
Paul Duckworth, Katherine A. Vallis, Bruno Lacerda, Nick Hawes
Published in: ECAI (2023)
Keyphrases
- reinforcement learning
- action space
- contextual information
- action selection
- Markov decision processes
- context sensitive
- perceptual aliasing
- state and action spaces
- discrete data
- partially observable
- context aware
- partially observable domains
- machine learning
- partial observability
- continuous data
- stochastic processes
- reinforcement learning algorithms
- function approximation
- multi-agent systems
- multi-agent
- objective function
- learning algorithm