Latent-Variable Advantage-Weighted Policy Optimization for Offline RL.
Xi Chen, Ali Ghadirzadeh, Tianhe Yu, Yuan Gao, Jianhao Wang, Wenzhe Li, Bin Liang, Chelsea Finn, Chongjie Zhang
Published in: CoRR (2022)
Keyphrases
- latent variables
- optimal policy
- reinforcement learning
- probabilistic model
- latent variable models
- topic models
- random variables
- real-valued
- policy gradient
- action selection
- state space
- hierarchical model
- Markov decision process
- policy search
- Markov decision processes
- reinforcement learning algorithms
- control policy
- prior knowledge
- data sets
- hidden variables
- structured prediction
- action space
- knowledge discovery