Publication: Policy Evaluation for Reinforcement Learning from Human Feedback: A Sample Complexity Analysis.