Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning.
Zimu Lu
Aojun Zhou
Ke Wang
Houxing Ren
Weikang Shi
Junting Pan
Mingjie Zhan
Hongsheng Li
Published in:
CoRR (2024)
Keyphrases
knowledge base
mathematical proofs
error rate
error bounds
post processing
reasoning systems
human reasoning
image sequences
knowledge representation
prediction error
estimation error
error analysis
meta level
reasoning tasks
relative error
knowledge representation and reasoning