Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers.
Pingcheng Dong, Yonghao Tan, Dong Zhang, Tianwei Ni, Xuejiao Liu, Yu Liu, Peng Luo, Luhong Liang, Shih-Yang Liu, Xijie Huang, Huaiyu Zhu, Yun Pan, Fengwei An, Kwang-Ting Cheng
Published in: CoRR (2024)