Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation.
Do June Min, Verónica Pérez-Rosas, Ken Resnicow, Rada Mihalcea
Published in: LREC/COLING (2024)
Keyphrases
- reinforcement learning
- state space
- function approximation
- reward function
- average reward
- partially observable environments
- dynamic environments
- Markov decision processes
- eligibility traces
- optimal policy
- multi-agent
- reinforcement learning algorithms
- learning problems
- long run
- website
- inverse reinforcement learning
- policy gradient
- partially observable
- real time
- total reward
- learning algorithm
- generation method
- Markov decision process
- real robot
- least squares
- dynamic programming
- supervised learning