Generalized State-Dependent Exploration for Deep Reinforcement Learning in Robotics.
Antonin Raffin, Freek Stulp
Published in: CoRR (2020)
Keyphrases
- state dependent
- optimal policy
- reinforcement learning
- continuous state
- steady state
- active exploration
- state space
- Markov decision processes
- arrival rate
- decision problems
- single server
- function approximation
- action selection
- finite state
- queueing networks
- dynamic programming
- product form
- long run
- asymptotically optimal
- infinite horizon
- Markov decision process
- stationary distribution
- multistage
- sufficient conditions
- machine learning
- queue length
- Markov chain
- service rates
- learning algorithm
- customer demand