Learning and Selection of Pareto Optimal Policies Matching User Preferences.
Akinori Tamura, Sachiyo Arai
Published in: IEEE Access (2024)
Keyphrases
- user preferences
- optimal policy
- reinforcement learning
- average reward reinforcement learning
- hierarchical task networks
- learning algorithm
- collaborative filtering
- user profiles
- decision problems
- sufficient conditions
- user behavior
- Markov decision processes
- supervised learning
- dynamic programming
- domain knowledge
- learning tasks
- long run
- active learning
- computational complexity