Optimal Policy Sparsification and Low Rank Decomposition for Deep Reinforcement Learning.
Vikram Goddla
Published in: CoRR (2024)
Keyphrases
- optimal policy
- low rank
- reinforcement learning
- markov decision processes
- missing data
- convex optimization
- linear combination
- rank minimization
- state space
- singular value decomposition
- low rank matrix
- finite horizon
- long run
- matrix factorization
- dynamic programming
- high dimensional data
- kernel matrix
- semi supervised
- infinite horizon
- high order
- function approximation
- markov decision process
- multistage
- state dependent
- least squares
- average reward
- learning algorithm
- markov decision problems
- sufficient conditions
- policy iteration
- data analysis
- reward function
- model free
- transfer learning
- data points
- learning process
- feature space
- small number
- cost function