Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation.
Wooseong Cho, Taehyun Hwang, Joongkyu Lee, Min-hwan Oh
Published in: CoRR (2024)
Keyphrases
- function approximation
- reinforcement learning
- exploration exploitation tradeoff
- temporal difference learning
- function approximators
- action selection
- state action space
- tile coding
- temporal difference
- model free
- mountain car
- radial basis function
- reinforcement learning algorithms
- temporal difference learning algorithms
- learning tasks
- state space
- machine learning
- text categorization
- text classification
- learning process
- learning algorithm
- markov decision processes
- control problems
- optimal policy
- td learning
- dynamic programming
- feature vectors
- multi agent