Interactive Multi-objective Reinforcement Learning in Multi-armed Bandits with Gaussian Process Utility Models.
Diederik M. Roijers, Luisa M. Zintgraf, Pieter Libin, Mathieu Reymond, Eugenio Bargiacchi, Ann Nowé
Published in: ECML/PKDD (3) (2020)
Keyphrases
- gaussian process
- multi objective
- gaussian processes
- reinforcement learning
- model selection
- bayesian framework
- gaussian process models
- covariance function
- multi armed bandits
- hyperparameters
- gaussian process regression
- genetic algorithm
- statistical models
- evolutionary algorithm
- regression model
- state space
- probabilistic model
- objective function
- parameter estimation
- decision makers
- noise level