Ziniu Li
Publication Activity (10 Years)
Years Active: 2019-2024
Publications (10 Years): 23
Top Topics
Small Sample
Language Model
Imitation Learning
Error Bounds
Top Venues
CoRR
NeurIPS
ICLR
IEEE Trans. Pattern Anal. Mach. Intell.
Publications
Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily J. Getzen, Cong Fang, Qi Long, Weijie J. Su
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization.
CoRR
(2024)
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
Why Transformers Need Adam: A Hessian Perspective.
CoRR
(2024)
Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun
Adam-mini: Use Fewer Learning Rates To Gain More.
CoRR
(2024)
Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation.
CoRR
(2024)
Ziniu Li, Tian Xu, Yang Yu
When is RL better than DPO in RLHF? A Representation and Optimization Perspective.
Tiny Papers @ ICLR
(2024)
Ziniu Li, Tian Xu, Yang Yu
Policy Optimization in RLHF: The Impact of Out-of-preference Data.
CoRR
(2023)
Ziniu Li, Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms.
NeurIPS
(2023)
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Provably Efficient Adversarial Imitation Learning with Unknown Transitions.
UAI
(2023)
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Provably Efficient Adversarial Imitation Learning with Unknown Transitions.
CoRR
(2023)
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo
Theoretical Analysis of Offline Imitation With Supplementary Dataset.
CoRR
(2023)
Ziniu Li, Tian Xu, Yushun Zhang, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
CoRR
(2023)
Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao
Deploying Offline Reinforcement Learning with Human Feedback.
CoRR
(2023)
Tian Xu, Ziniu Li, Yang Yu
Error Bounds of Imitating Policies and Environments for Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell.
44 (10) (2022)
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis.
CoRR
(2022)
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning.
ICLR
(2022)
Ziniu Li, Tian Xu, Yang Yu
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle.
CoRR
(2022)
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo
Rethinking ValueDice: Does It Really Improve Performance?
CoRR
(2022)
Tian Xu, Ziniu Li, Yang Yu
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions.
CoRR
(2021)
Tian Xu, Ziniu Li, Yang Yu
Error Bounds of Imitating Policies and Environments.
NeurIPS
(2020)
Tian Xu, Ziniu Li, Yang Yu
Error Bounds of Imitating Policies and Environments.
CoRR
(2020)
Ziniu Li, Xiong-Hui Chen
Efficient Exploration by Novelty-Pursuit.
DAI
(2020)
Xinjian Huang, Ziniu Li, Zhiyuan Liu, Bin Xiang, Yingsan Geng, Jianhua Wang
Solving the Inverse Design Problem of Electrical Fuse With Machine Learning.
IEEE Access
8 (2020)
Tian Xu, Ziniu Li, Yang Yu
On Value Discrepancy of Imitation Learning.
CoRR
(2019)