SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling.
Xingzhou Lou
Junge Zhang
Jian Xie
Lifeng Liu
Dong Yan
Kaiqi Huang
Published in:
CoRR (2024)
Keyphrases
multi dimensional
reinforcement learning
multi attribute
data cube construction
data sets
decision trees
pairwise
modeling method
web services
index structure
sequential patterns
image alignment
soft constraints