RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold.
Amrith Setlur
Saurabh Garg
Xinyang Geng
Naman Garg
Virginia Smith
Aviral Kumar
Published in: CoRR (2024)
Keyphrases
synthetic data
reinforcement learning
real image data
real world
mri data
computational efficiency
data sets
reasoning systems
automated reasoning
database
image structure
synthetic datasets
markov decision processes
model free
transfer learning
multiple scales
learning process