The Value Function Polytope in Reinforcement Learning

Robert Dadashi, Marc G. Bellemare, Adrien Ali Taïga, Nicolas Le Roux, Dale Schuurmans

Published in: ICML (2019)

Keyphrases

reinforcement learning
function approximators
function approximation
machine learning
information systems
temporal difference learning
supervised learning
Markov decision processes
convex hull
lattice points
policy gradient
state action
reinforcement learning algorithms
action selection
state space
dynamic programming
learning process
objective function
learning algorithm