ROSARL: Reward-Only Safe Reinforcement Learning.
Geraud Nangue Tasse, Tamlin Love, Mark Nemecek, Steven James, Benjamin Rosman
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- model-free
- optimal policy
- eligibility traces
- machine learning
- dynamic programming
- reward function
- learning algorithm
- control problems
- temporal difference
- supervised learning
- learning process
- learning problems
- partially observable environments
- multi-agent systems
- real robot
- learning agent
- function approximators
- multi-agent
- fitted Q iteration
- total reward
- state and action spaces
- policy search
- Markov decision problems
- temporal difference learning
- sufficient conditions
- transfer learning
- Markov decision processes