Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization.
Zihan Zhou, Wei Fu, Bingliang Zhang, Yi Wu
Published in: CoRR (2022)
Keyphrases
- optimization strategies
- reinforcement learning
- optimization algorithm
- global optimization
- constrained optimization
- action selection
- optimal policy
- optimization process
- objective function
- policy gradient
- average reward
- optimization method
- total reward
- neural network
- control policy
- reward function
- optimization model
- linear programming
- optimization problems
- search space
- decision making