Provably Efficient Model-free RL in Leader-Follower MDP with Linear Function Approximation.

Published in: CoRR (2022)

Keyphrases

function approximation
model-free
reinforcement learning
policy iteration
reinforcement learning algorithms
temporal difference
average reward
function approximators
temporal difference learning
leader-follower
policy evaluation
learning tasks
Markov decision processes
radial basis function
RL algorithms
state space
neural network
optimal policy
machine learning
learning problems
reinforcement learning methods
learning algorithm
optimal control
finite state
transfer learning
support vector machine
mobile robot
dynamic programming
learning process
artificial neural networks