Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization.
Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir
Published in: CoRR (2018)
Keyphrases
- combinatorial optimization
- reinforcement learning
- combinatorial optimization problems
- metaheuristic
- branch and bound
- simulated annealing
- traveling salesman problem
- function approximation
- optimization problems
- branch and bound algorithm
- model free
- state space
- reinforcement learning algorithms
- mathematical programming
- reward function
- combinatorial problems
- markov decision processes
- partially observable environments
- eligibility traces
- temporal difference
- learning algorithm
- machine learning
- exact algorithms
- learning problems
- dynamic programming
- hard combinatorial optimization problems
- quadratic assignment problem
- reward shaping
- max flow min cut
- policy gradient
- learning agent
- combinatorial search
- vehicle routing problem
- evolutionary algorithm
- search space
- genetic algorithm