Provable Multi-Party Reinforcement Learning with Diverse Human Feedback.
Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang
Published in: CoRR (2024)
Keyphrases
- multi party
- reinforcement learning
- human communication
- virtual humans
- turn taking
- privacy preserving
- function approximation
- personality traits
- description language
- human operators
- human subjects
- optimal policy
- markov decision processes
- motor skills
- multi issue
- human behavior
- human users
- mental states
- software development
- relevance feedback
- multi agent
- artificial intelligence