Adversarial Imitation Learning with Controllable Rewards for Text Generation.
Keizaburo Nishikino, Kenichi Kobayashi
Published in: ECML/PKDD (1) (2023)
Keyphrases
- imitation learning
- text generation
- reinforcement learning
- natural language generation
- multi agent
- natural language
- robotic systems
- Markov decision processes
- humanoid robot
- reinforcement learning methods
- maximum margin
- theorem prover
- machine learning
- control problems
- transfer learning
- state space
- reinforcement learning algorithms
- model free
- natural language processing
- reward function
- dynamic programming
- data mining
- learning algorithm
- optimal policy
- information retrieval
- multi modal