On First-Order Meta-Reinforcement Learning with Moreau Envelopes.
Mohammad Taha Toghani, Sebastian Perez-Salazar, César A. Uribe
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- higher order
- function approximation
- state space
- first order logic
- reinforcement learning algorithms
- markov decision processes
- multi agent
- real time
- learning algorithm
- stochastic dominance
- temporal difference learning
- optimal control
- meta level
- robotic control
- markov decision process
- model free
- action selection
- optimal policy
- horn clauses
- supervised learning
- control policy
- conditional logic
- dynamic programming
- policy gradient
- case study
- machine learning