Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning.

Finn Rietz, Erik Schaffernicht, Stefan Heinrich, Johannes A. Stork

Published in: ICLR (2024)

Keyphrases

reinforcement learning
function approximation
decomposition method
Markov decision processes
learning algorithm
image decomposition
temporal difference
state space
optimal policy
learning process
optimal control
reinforcement learning algorithms
multi-agent
function approximators
robotic control
dynamic programming
stochastic approximation
reinforcement learning methods
temporal difference learning
cardinality constraints
machine learning
search algorithm
action selection
model-free
search space
learning problems
combinatorial optimization