Discrete Action On-Policy Learning with Action-Value Critic.
Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou
Published in: CoRR (2020)
Keyphrases
- action selection
- learning algorithm
- learning process
- action models
- prior knowledge
- online learning
- learning systems
- learning problems
- action sequences
- reinforcement learning
- active learning
- learning tasks
- actor critic
- learning from experience
- inverse reinforcement learning
- policy gradient
- state action
- access control
- e-learning