Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation.
Nikki Lijing Kuang, Ming Yin, Mengdi Wang, Yu-Xiang Wang, Yian Ma
Published in: NeurIPS (2023)
Keyphrases
- function approximation
- reinforcement learning
- temporal difference learning algorithms
- function approximators
- delayed feedback
- temporal difference
- mountain car
- model free
- tile coding
- reinforcement learning algorithms
- temporal difference learning
- state action space
- Markov decision processes
- radial basis function
- state space
- TD learning
- learning tasks
- action selection
- machine learning
- probability distribution
- learning algorithm
- multi agent
- neural network
- feature vectors
- feature extraction
- reinforcement learning methods
- policy gradient
- policy search
- temporal difference methods
- image classification
- genetic algorithm
- knn