Exploration versus exploitation in reinforcement learning: a stochastic control approach.
Haoran Wang, Thaleia Zariphopoulou, Xunyu Zhou
Published in: CoRR (2018)
Keyphrases
- stochastic control
- reinforcement learning
- control problems
- exploration exploitation tradeoff
- optimal control
- RL algorithms
- function approximation
- action selection
- queueing systems
- Brownian motion
- state space
- operations management
- adaptive control
- model-free
- dynamic programming
- optimal policy
- learning algorithm
- Markov decision processes
- reinforcement learning algorithms
- learning tasks
- multi-agent
- stochastic process
- reward function
- learning problems