Optimizing High-Throughput Inference on Graph Neural Networks at Shared Computing Facilities with the NVIDIA Triton Inference Server.
Claire Savard, Nicholas Manganelli, Burt Holzman, Lindsey Gray, Alexx Perloff, Kevin Pedro, Kevin Stenson, Keith Ulmer
Published in: Comput. Softw. Big Sci. (2024)