Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model.
Qi Gou
Cam-Tu Nguyen
Published in:
CoRR (2024)
Keyphrases
reference model
reinforcement learning
data collection
data analysis
data sets
data structure
database
databases
data mining
machine learning
data points
knowledge discovery
computational intelligence
multiagent systems
Markov decision processes