Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation.

Long-Fei Li, Yu-Jie Zhang, Peng Zhao, Zhi-Hua Zhou

Published in: CoRR (2024)

Keyphrases

function approximation
reinforcement learning
radial basis function
temporal difference learning algorithms
temporal difference
reinforcement learning algorithms
tile coding
model free
temporal difference learning
function approximators
learning tasks
state action space
mountain car
state space
pattern recognition
policy gradient
neural network
td learning
temporal difference methods
supervised learning
multinomial logit