DRAS: Deep Reinforcement Learning for Cluster Scheduling in High Performance Computing.
Yuping Fan, Boyang Li, Dustin Favorite, Naunidh Singh, John Taylor Childers, Paul Rich, William E. Allcock, Michael E. Papka, Zhiling Lan
Published in: IEEE Trans. Parallel Distributed Syst. (2022)
Keyphrases
- high performance computing
- reinforcement learning
- scientific computing
- massively parallel
- heterogeneous computing
- computational science
- computing systems
- computing resources
- grid computing
- fault tolerance
- molecular dynamics
- parallel computing
- energy efficiency
- parallel machines
- scheduling problem
- multi-agent
- national laboratory
- computing environments
- distributed environment
- resource allocation
- dynamic programming
- databases
- power consumption
- distributed systems
- response time
- computing infrastructure
- metadata
- machine learning