Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.
Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphaël Ségerie, Micah Carroll, Andi Peng, Phillip J. K. Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca D. Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
Published in: Trans. Mach. Learn. Res. (2023)
Keyphrases
- open problems
- reinforcement learning
- long standing
- database theory
- human subjects
- function approximation
- reinforcement learning algorithms
- state space
- human operators
- machine learning
- multidatabase transaction management
- computational advertising
- model free
- human behavior
- human experts
- optimal policy
- user feedback
- human cognition
- multiagent learning
- learning algorithm
- database