Policy Gradient MaxSAT Solver.
Omar Gutiérrez-De-La-Paz, Ricardo Menchaca-Mendez, Erik Zamora Gómez, Uriel Corona Bermúdez, Rolando Menchaca-Méndez, Bruno Gutiérrez-De-La-Paz
Published in: Computación y Sistemas (CyS) (2024)
Keyphrases
- policy gradient
- actor critic
- reinforcement learning
- gradient method
- parametric optimization
- upper bound
- function approximation
- optimal control
- SAT solvers
- approximation methods
- reinforcement learning algorithms
- partially observable Markov decision processes
- model free reinforcement learning
- variance reduction
- machine learning
- lower bound
- search algorithm