Trajectory-Oriented Policy Optimization with Sparse Rewards.
Guojian Wang, Faguo Wu, Xiao Zhang
Published in: CoRR (2024)
Keyphrases
- optimization problems
- optimal policy
- sparse PCA
- global optimization
- constrained optimization
- reward function
- control policy
- reinforcement learning
- high dimensional
- Markov decision processes
- multi-armed bandit
- neural network
- direct search
- joint optimization
- partially observable
- optimization methods
- sparse representation
- optimization algorithm