Towards Variance Reduction for Reinforcement Learning of Industrial Decision-making Tasks: A Bi-Critic based Demand-Constraint Decoupling Approach.
Jianyong Yuan, Jiayi Zhang, Zinuo Cai, Junchi Yan
Published in: KDD (2023)
Keyphrases
- variance reduction
- reinforcement learning
- policy gradient
- decision making
- function approximation
- actor critic
- business intelligence
- reinforcement learning algorithms
- temporal difference
- gradient estimation
- transfer learning
- sample size
- monte carlo
- state space
- bias variance decomposition
- action selection
- data mining
- optimal control
- policy iteration
- importance sampling
- confidence intervals
- model free
- maximum likelihood
- support vector machine
- machine learning