LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback.
Timon Ziegenbein, Gabriella Skitalinskaya, Alireza Bayat Makou, Henning Wachsmuth
Published in: ACL (1) (2024)
Keyphrases
- reinforcement learning
- function approximation
- argumentation systems
- state space
- optimal control
- relevance feedback
- reinforcement learning algorithms
- model free
- user feedback
- rewriting rules
- policy search
- query rewriting
- temporal difference
- flowshop
- Markov decision processes
- dynamic programming
- learning process
- computer science courses
- argumentation skills
- robotic control
- active exploration
- machine learning
- defeasible reasoning
- agent communication
- reinforcement learning methods
- Datalog programs
- transfer learning
- learning algorithm