Twice regularized MDPs and the equivalence between robustness and regularization.
Esther Derman, Matthieu Geist, Shie Mannor
Published in: NeurIPS (2021)
Keyphrases
- Markov decision processes
- regularization method
- regularization framework
- risk minimization
- reinforcement learning
- decision diagrams
- trace norm
- half quadratic
- regularization methods
- solution path
- state space
- regularization parameter
- mixed norm
- policy iteration
- planning under uncertainty
- finite horizon
- optimal policy
- semi-supervised learning
- equivalence relationship
- regularization term
- prior information
- total least squares
- support vector