Enhancing OOD Generalization in Offline Reinforcement Learning with Energy-Based Policy Optimization.
Hongye Cao, Shangdong Yang, Jing Huo, Xingguo Chen, Yang Gao
Published in: ECAI (2023)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- formal specification
- action selection
- Markov decision process
- dynamic aspects
- Markov decision processes
- learning algorithm
- state space
- infinite horizon
- function approximation
- partially observable
- action space
- Markov decision problems
- partially observable environments
- reinforcement learning algorithms
- function approximators
- data model
- machine learning
- reinforcement learning problems