Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management.
Zhengxu Hou, Bang Liu, Ruihui Zhao, Zijing Ou, Yafei Liu, Xi Chen, Yefeng Zheng
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- decision making
- long run
- management system
- information systems
- information management
- modeling method
- data management
- data structure
- inverse reinforcement learning
- modeling framework
- network management
- user interface
- database
- image sequences
- database systems
- case study
- learning algorithm
- genetic algorithm
- data sets