Autonomous Exploration for Navigating in MDPs Using Blackbox RL Algorithms.
Pratik Gajane, Peter Auer, Ronald Ortner
Published in: IJCAI (2023)
Keyphrases
- reinforcement learning
- rl algorithms
- average reward
- markov decision processes
- model free
- continuous state spaces
- state space
- action selection
- policy iteration
- optimal policy
- function approximation
- learning capabilities
- stochastic games
- cooperative
- long run
- learning problems
- reinforcement learning algorithms
- markov decision process
- learning algorithm
- finite state
- adaptive control
- partially observable
- markov chain
- temporal difference
- supervised learning