Prompt-Based Length Controlled Generation with Reinforcement Learning.
Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- machine learning
- function approximation
- multi-agent
- active learning
- control system
- dynamic programming
- least squares
- temporal difference learning
- action selection
- robotic control
- real time
- stochastic approximation
- learning capabilities
- learning problems
- Markov decision processes
- decision making
- databases