Human-aligned trading by imitative multi-loss reinforcement learning.

Zhengxin Joseph Ye, Björn W. Schuller

Published in: Expert Syst. Appl. (2023)

Keyphrases

reinforcement learning
human behavior
learning algorithm
dynamic programming
model free
human interaction
Markov decision processes
state space
electronic commerce
human subjects
function approximation
transfer learning
behavioural cloning
robotic control
reinforcement learning algorithms
temporal difference
optimal control
data sets
supervised learning
supply chain
learning process
case study
genetic algorithm