MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator.
Xiao-Yin Liu, Xiao-Hu Zhou, Guo-Tao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- model free
- real time
- state space
- reinforcement learning algorithms
- function approximation
- temporal difference learning
- state action
- data driven
- linear program
- actor critic
- temporal difference
- machine learning
- multi agent reinforcement learning
- piecewise linear
- Markov decision processes
- optimal policy
- learning process
- learning algorithm
- learning problems
- partially observable
- support vector