Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization.
Yihan Du
Anna Winnicki
Gal Dalal
Shie Mannor
R. Srikant
Published in:
CoRR (2024)
Keyphrases
data sets
synthetic data
data collection
raw data
image data
knowledge discovery
data processing
historical data
data objects
original data
data distribution
experimental data
statistical analysis
probability distribution
data sources
data analysis
database systems
neural network
end users
high quality