Accelerating a Triton Fused Kernel for W4A16 Quantized Inference with SplitK work decomposition.

Adnan Hoque, Less Wright, Jamie Yang, Mudhakar Srivatsa, Raghu K. Ganti
Published in: CoRR (2024)
Keyphrases