Sufficient Exploration for Convex Q-learning.

Fan Lu, Prashant G. Mehta, Sean Meyn, Gergely Neu

Published in: CoRR (2022)

Keyphrases

action selection
exploration strategy
cooperative
reinforcement learning
function approximation
learning algorithm
state space
convex optimization
model-free
multi-agent
convex hull
stochastic approximation
optimal policy
temporal difference learning
multi-agent reinforcement learning
strictly convex
reinforcement learning algorithms
globally optimal
convex sets
data sets
convex relaxation
state-action
piecewise linear
dynamic environments
risk minimization
graph cuts
dynamic programming
convex constraints
objective function
active exploration