Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management.

Zhengxu Hou, Bang Liu, Ruihui Zhao, Zijing Ou, Yafei Liu, Xi Chen, Yefeng Zheng
Published in: NAACL-HLT (2021)
Keyphrases