Generalized State-Dependent Exploration for Deep Reinforcement Learning in Robotics.

Antonin Raffin, Freek Stulp

Published in: CoRR (2020)

Keyphrases

state dependent
optimal policy
reinforcement learning
continuous state
steady state
active exploration
state space
Markov decision processes
arrival rate
decision problems
single server
function approximation
action selection
finite state
queueing networks
dynamic programming
product form
long run
asymptotically optimal
infinite horizon
Markov decision process
stationary distribution
multistage
sufficient conditions
machine learning
queue length
Markov chain
service rates
learning algorithm
customer demand