Ziniu Li
Publication Activity (10 Years)
Years Active: 2019-2023
Publications (10 Years): 18
Top Topics
State And Action Spaces
Reward Signal
Error Bounds
Markov Games
Top Venues
CoRR
NeurIPS
ICLR
IEEE Trans. Pattern Anal. Mach. Intell.
Publications
Ziniu Li, Tian Xu, Yang Yu
Policy Optimization in RLHF: The Impact of Out-of-preference Data.
CoRR
(2023)
Ziniu Li, Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms.
NeurIPS
(2023)
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Provably Efficient Adversarial Imitation Learning with Unknown Transitions.
UAI
(2023)
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Provably Efficient Adversarial Imitation Learning with Unknown Transitions.
CoRR
(2023)
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo
Theoretical Analysis of Offline Imitation With Supplementary Dataset.
CoRR
(2023)
Ziniu Li, Tian Xu, Yushun Zhang, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
CoRR
(2023)
Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao
Deploying Offline Reinforcement Learning with Human Feedback.
CoRR
(2023)
Tian Xu, Ziniu Li, Yang Yu
Error Bounds of Imitating Policies and Environments for Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell.
44 (10) (2022)
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis.
CoRR
(2022)
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning.
ICLR
(2022)
Ziniu Li, Tian Xu, Yang Yu
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle.
CoRR
(2022)
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo
Rethinking ValueDice: Does It Really Improve Performance?
CoRR
(2022)
Tian Xu, Ziniu Li, Yang Yu
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions.
CoRR
(2021)
Tian Xu, Ziniu Li, Yang Yu
Error Bounds of Imitating Policies and Environments.
NeurIPS
(2020)
Tian Xu, Ziniu Li, Yang Yu
Error Bounds of Imitating Policies and Environments.
CoRR
(2020)
Ziniu Li, Xiong-Hui Chen
Efficient Exploration by Novelty-Pursuit.
DAI
(2020)
Xinjian Huang, Ziniu Li, Zhiyuan Liu, Bin Xiang, Yingsan Geng, Jianhua Wang
Solving the Inverse Design Problem of Electrical Fuse With Machine Learning.
IEEE Access
8 (2020)
Tian Xu, Ziniu Li, Yang Yu
On Value Discrepancy of Imitation Learning.
CoRR
(2019)