Zhenyu Zhang
Publication Activity (10 Years)
Years Active: 2024-2024
Publications (10 Years): 10
Top Topics
Rank Minimization
Low Rank Matrices
Robust Principal Component Analysis
Matrix Completion
Top Venues
CoRR
ICLR
MLSys
AAAI
Publications
Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference.
CoRR
(2024)
Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection.
CoRR
(2024)
Zhenyu Zhang, Ajay Jaiswal, Lu Yin, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
CoRR
(2024)
Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.
CoRR
(2024)
Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang
QuantumSEA: In-Time Sparse Exploration for Noise Adaptive Quantum Circuits.
CoRR
(2024)
Zhenyu Zhang, Shiwei Liu, Runjin Chen, Bhavya Kailkhura, Beidi Chen, Atlas Wang
Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache.
MLSys
(2024)
Zhen Tan, Tianlong Chen, Zhenyu Zhang, Huan Liu
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention.
AAAI
(2024)
Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients.
CoRR
(2024)
Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du
JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention.
ICLR
(2024)
Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy.
ICLR
(2024)