Deconfounding Actor-Critic Network with Policy Adaptation for Dynamic Treatment Regimes.
Changchang Yin, Ruoqi Liu, Jeffrey M. Caterino, Ping Zhang
Published in: CoRR (2022)
Keyphrases
- actor critic
- policy gradient
- reinforcement learning
- reinforcement learning algorithms
- policy gradient methods
- approximate dynamic programming
- average reward
- neural network
- cost function
- optimal policy
- policy iteration
- dynamic environments
- dynamical systems
- markov decision processes
- optimal control
- temporal difference