Safe Reinforcement Learning through Meta-learned Instincts.

Djordje Grbic, Sebastian Risi

Published in: CoRR (2020)

Keyphrases

reinforcement learning
function approximation
reinforcement learning algorithms
model free
learned knowledge
temporal difference
state space
robotic control
stochastic approximation
evolutionary algorithm
optimal policy
optimal control
learning algorithm
temporal difference learning
reinforcement learning methods
meta level
dynamic programming
domain knowledge
partially observable
markov decision process
genetic algorithm
previously learned
autonomous learning
policy search
real time