GraphIt to CUDA Compiler in 2021 LOC: A Case for High-Performance DSL Implementation via Staging with BuilDSL.
Ajay Brahmakshatriya
Saman P. Amarasinghe
Published in:
CGO (2022)
Keyphrases
general purpose
parallel implementation
parallel computers
distributed memory machines
neural network
programming language
efficient implementation
scientific computing
highly optimized
databases
distributed systems
software systems
low overhead
graphics processors
ibm sp