Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation.
Long-Fei Li, Yu-Jie Zhang, Peng Zhao, Zhi-Hua Zhou
Published in: CoRR (2024)
Keyphrases
- function approximation
- reinforcement learning
- radial basis function
- temporal difference learning algorithms
- temporal difference
- reinforcement learning algorithms
- tile coding
- model free
- temporal difference learning
- function approximators
- learning tasks
- state action space
- mountain car
- state space
- pattern recognition
- policy gradient
- neural network
- td learning
- temporal difference methods
- supervised learning
- multinomial logit