Anytime-Competitive Reinforcement Learning with Policy Prior.
Jianyi Yang, Pengfei Li, Tongxin Li, Adam Wierman, Shaolei Ren
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- action selection
- Markov decision process
- function approximators
- partially observable environments
- function approximation
- policy gradient
- reinforcement learning problems
- Markov decision problems
- actor critic
- approximate dynamic programming
- Markov decision processes
- state space
- prior knowledge
- policy iteration
- machine learning
- RL algorithms
- model free
- state and action spaces
- anytime algorithms
- partially observable
- action space
- learning algorithm
- model free reinforcement learning
- policy evaluation
- control policies
- temporal difference learning
- control policy
- reinforcement learning algorithms
- reward function
- temporal difference
- control problems
- robotic control
- transition model
- state action
- dynamic programming
- infinite horizon
- continuous state spaces
- prior information
- multi agent
- transfer learning
- learning problems
- optimal control
- finite state
- continuous state
- policy makers
- state dependent