Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning.
Toshinori Kitamura, Lingwei Zhu, Takamitsu Matsubara
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- markov decision processes
- state space
- optimal policy
- dynamic programming
- neural network
- error analysis
- policy iteration
- markov decision process
- function approximation
- error rate
- prior information
- dynamic environments
- geometric structure
- search algorithm
- model free
- multi agent
- learning algorithm