Challenging Common Assumptions in Convex Reinforcement Learning.

Mirco Mutti Riccardo De Santi Piersilvio De Bartolomeis Marcello Restelli

Published in: CoRR (2022)

Keyphrases

reinforcement learning
function approximation
learning algorithm
piecewise linear
convex optimization
real world
neural network
robotic control
temporal difference
model free
optimal policy
state space
machine learning
markov decision processes
evolutionary algorithm
temporal difference learning
simplifying assumptions
real time