Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning.

Max Weltevrede, Felix Kaubek, Matthijs T. J. Spaan, Wendelin Böhmer

Published in: CoRR (2024)

Keyphrases

reinforcement learning
exploration strategy
action selection
active exploration
function approximation
model-based reinforcement learning
autonomous learning
exploration exploitation
learning algorithm
multi-agent
state space
optimal policy
model-free
policy search
multi-agent reinforcement learning
learning process
real world
temporal difference
reward function
reinforcement learning algorithms
Markov decision process
deep learning
temporal difference learning
learning problems
dynamic programming
evolutionary algorithm
multi-agent systems
genetic algorithm
machine learning