Behind the Myth of Exploration in Policy Gradients.

Adrien Bolland Gaspard Lambrechts Damien Ernst

Published in: CoRR (2024)

Keyphrases

action selection
optimal policy
real time
infinite horizon
genetic algorithm
decision making
state space
edge detection
state dependent
policy making