RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs.
Afra Feyza Akyürek, Ekin Akyürek, Ashwin Kalyan, Peter Clark, Derry Tanti Wijaya, Niket Tandon
Published in: ACL (1) (2023)
Keyphrases
- reinforcement learning
- natural language
- mathematical model
- model free
- high level
- computational model
- learning algorithm
- prior knowledge
- probabilistic model
- management system
- language understanding
- generation process
- dialogue system
- markov decision processes
- question answering
- optimal policy
- natural language processing
- machine learning
- probability distribution
- bayesian networks
- similarity measure
- genetic algorithm