Mildly Constrained Evaluation Policy for Offline Reinforcement Learning.
Linjie Xu, Zhengyao Jiang, Jinyu Wang, Lei Song, Jiang Bian
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- optimal policy
- action selection
- partially observable environments
- Markov decision process
- multi-agent
- evaluation criteria
- policy search
- real time
- approximate dynamic programming
- state action
- learning algorithm
- data sets
- state and action spaces
- policy gradient
- function approximators
- action space
- dynamic programming
- partially observable
- model-free
- transfer learning
- state space