Publication: Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning.