Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.
Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Ségerie, Micah Carroll, Andi Peng, Phillip J. K. Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca D. Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
Published in: CoRR (2023)
Keyphrases
- open problems
- reinforcement learning
- database theory
- long standing
- function approximation
- multidatabase transaction management
- computational advertising
- reinforcement learning algorithms
- human operators
- learning algorithm
- dynamic programming
- human subjects
- human activities
- markov decision processes
- data warehouse
- state space
- mobile robot
- human cognition
- multiagent learning
- neural network