Provably Efficient Model-free RL in Leader-Follower MDP with Linear Function Approximation.

Published in: L4DC (2023)

Keyphrases

function approximation
model free
reinforcement learning
policy iteration
reinforcement learning algorithms
temporal difference
function approximators
temporal difference learning
policy evaluation
radial basis function
Markov decision processes
average reward
learning tasks
state space
RL algorithms
leader follower
multi agent
Markov decision process
reinforcement learning methods
mobile robot
policy gradient
neural network
finite state