SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling.

Xingzhou Lou, Junge Zhang, Jian Xie, Lifeng Liu, Dong Yan, Kaiqi Huang
Published in: CoRR (2024)
Keyphrases
  • multi dimensional
  • reinforcement learning
  • multi attribute
  • data cube construction
  • data sets
  • decision trees
  • pairwise
  • modeling method
  • web services
  • index structure
  • sequential patterns
  • image alignment
  • soft constraints