StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback.
Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui
Published in: CoRR (2024)