Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management.
Zhengxu Hou
Bang Liu
Ruihui Zhao
Zijing Ou
Yafei Liu
Xi Chen
Yefeng Zheng
Published in:
NAACL-HLT (2021)
Keyphrases
reinforcement learning
long run
decision support
neural network
information systems
decision making
case study
management system
average reward
database
modeling language
information management
data processing
data management
software engineering
user interface
natural language
databases