Reinforcement Learning from Reflective Feedback (RLRF): Aligning and Improving LLMs via Fine-Grained Self-Reflection.
Kyungjae Lee, Dasol Hwang, Sunghyun Park, Youngsoo Jang, Moontae Lee
Published in: CoRR (2024)
Keyphrases
- fine grained
- reinforcement learning
- coarse grained
- reflective learning
- access control
- tightly coupled
- state space
- optimal policy
- function approximation
- massively parallel
- image registration
- relevance feedback
- database
- markov decision processes
- user feedback
- xml documents
- relational databases
- information retrieval
- databases