Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs.
Kaixuan Ji, Qingyue Zhao, Jiafan He, Weitong Zhang, Quanquan Gu
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- markov decision processes
- state space
- multi agent
- optimal policy
- function approximation
- state and action spaces
- function approximators
- reinforcement learning algorithms
- learning algorithm
- partially observable
- temporal difference
- policy search
- markov decision problems
- markov decision process
- dynamic programming
- machine learning
- reward function
- model free
- control problems
- policy iteration
- finite state
- action space
- mixture model
- factored markov decision processes
- action selection
- optimal control
- real valued
- decision theoretic planning
- gaussian densities
- continuous state and action spaces