Autonomous Exploration for Navigating in MDPs Using Blackbox RL Algorithms.

Pratik Gajane, Peter Auer, Ronald Ortner

Published in: IJCAI (2023)

Keyphrases

reinforcement learning
rl algorithms
average reward
markov decision processes
model free
continuous state spaces
state space
action selection
policy iteration
optimal policy
function approximation
learning capabilities
stochastic games
cooperative
long run
learning problems
reinforcement learning algorithms
markov decision process
learning algorithm
finite state
adaptive control
partially observable
markov chain
temporal difference
supervised learning