Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs.
Dongruo Zhou, Quanquan Gu
Published in: NeurIPS (2022)
Keyphrases
- reinforcement learning
- computationally efficient
- Markov decision processes
- optimal policy
- state space
- function approximation
- dynamic programming
- Markov decision process
- reinforcement learning algorithms
- function approximators
- learning algorithm
- policy search
- model-free
- multi-agent
- reward function
- partially observable
- continuous state and action spaces
- mixture model
- model-based reinforcement learning
- action selection
- policy iteration
- average reward
- reinforcement learning methods
- Markov decision problems
- policy evaluation
- decision-theoretic planning
- factored Markov decision processes
- factored MDPs
- action sets
- finite state
- machine learning
- continuous state spaces
- planning under uncertainty
- expectation maximization
- temporal difference
- decision problems