Algorithms for learning value-aligned policies considering admissibility relaxation.
Andrés Holgado-Sánchez, Joaquín Arias, Holger Billhardt, Sascha Ossowski
Published in: CoRR (2024)
Keyphrases
- learning algorithm
- learning tasks
- learning process
- theoretical analysis
- noise tolerant
- learning problems
- orders of magnitude
- online learning
- significant improvement
- reinforcement learning
- computational cost
- benchmark datasets
- learning systems
- supervised learning
- linear programming
- computationally efficient
- machine learning algorithms
- data structure
- combinatorial optimization
- clustering algorithm
- learning models
- iterative algorithms
- machine learning