Reinforcement Learning with Token-level Feedback for Controllable Text Generation.
Wendi Li, Wei Wei, Kaihe Xu, Wenfeng Xie, Dangyang Chen, Yu Cheng
Published in: CoRR (2024)
Keyphrases
- text generation
- reinforcement learning
- natural language generation
- state space
- machine learning
- artificial intelligence
- natural language
- real time
- supervised learning
- function approximation
- model free
- feedback mechanisms
- reinforcement learning algorithms
- theorem prover
- markov decision processes
- transfer learning
- higher level
- probability distribution