Optimizing ZX-Diagrams with Deep Reinforcement Learning.
Maximilian Nägele, Florian Marquardt
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- state space
- function approximation
- reinforcement learning algorithms
- model free
- learning algorithm
- decision making
- multi agent
- temporal difference learning
- deep learning
- Markov decision processes
- exploration exploitation tradeoff
- direct policy search
- supervised learning
- learning process
- machine learning
- real time
- Monte Carlo
- optimal policy
- action selection
- dynamic programming
- action space
- stochastic approximation
- hand drawn
- UML class diagrams
- robotic control
- data mining