On First-Order Meta-Reinforcement Learning with Moreau Envelopes.

Mohammad Taha Toghani, Sebastian Perez-Salazar, César A. Uribe

Published in: CDC (2023)

Keyphrases

Markov decision process
reinforcement learning
state space
Markov decision processes
optimal policy
first order logic
higher order
function approximation
learning algorithm
multi agent
temporal difference
meta level
reinforcement learning algorithms
inductive logic programming systems
stochastic dominance
action space
model free
expert systems
machine learning
universally quantified
data mining