Evaluating BERT-based Rewards for Question Generation with Reinforcement Learning.
Peide Zhu, Claudia Hauff
Published in: ICTIR (2021)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- state space
- model free
- reinforcement learning algorithms
- optimal policy
- reward shaping
- reward function
- machine learning
- learning algorithm
- complex domains
- data mining
- learning problems
- dynamic programming
- action selection
- multi agent
- function approximators
- control policy
- temporal difference learning
- partial observability
- robotic control