Shentao Yang
Publication Activity (10 Years)
Years Active: 2022-2024
Publications (10 Years): 11
Top Topics
Dialogue System
Reinforcement Learning
Partially Observable
Stationary Distribution
Top Venues
CoRR
NeurIPS
ICLR
ICML
Publications
Rohan Chitnis, Shentao Yang, Alborz Geramifard
Sequential Decision-Making for Inline Text Autocomplete.
CoRR
(2024)
Shentao Yang, Tianqi Chen, Mingyuan Zhou
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference.
CoRR
(2024)
Yihao Feng, Shentao Yang, Shujian Zhang, Jianguo Zhang, Caiming Xiong, Mingyuan Zhou, Huan Wang
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems.
ICLR
(2023)
Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong, Mingyuan Zhou
Preference-grounded Token-level Guidance for Language Model Fine-tuning.
NeurIPS
(2023)
Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong, Mingyuan Zhou
Preference-grounded Token-level Guidance for Language Model Fine-tuning.
CoRR
(2023)
Yihao Feng, Shentao Yang, Shujian Zhang, Jianguo Zhang, Caiming Xiong, Mingyuan Zhou, Huan Wang
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems.
CoRR
(2023)
Shentao Yang, Zhendong Wang, Huangjie Zheng, Yihao Feng, Mingyuan Zhou
A Regularized Implicit Policy for Offline Reinforcement Learning.
CoRR
(2022)
Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning.
CoRR
(2022)
Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou
A Unified Framework for Alternating Offline Model Training and Policy Learning.
NeurIPS
(2022)
Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning.
ICML
(2022)
Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou
A Unified Framework for Alternating Offline Model Training and Policy Learning.
CoRR
(2022)