Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels.
Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman
Published in: CoRR (2024)
Keyphrases
- mutual information
- learning process
- learning problems
- training data
- reinforcement learning
- incremental learning
- online learning
- learning tasks
- learning algorithm
- prior knowledge
- active learning
- image registration
- information theoretic
- preference learning
- preference elicitation
- mobile learning
- class labels
- collaborative learning
- support vector
- similarity measure
- e-learning