Reward Penalties on Augmented States for Solving Richly Constrained RL Effectively.
Hao Jiang
Tien Mai
Pradeep Varakantham
Huy Hoang
Published in:
AAAI (2024)
Keyphrases
reinforcement learning
Markov decision problems
state action
constrained problems
optimal policy
function approximation
learning process
average reward
learning algorithm
continuous domains
model free
state transition
initial state
state transitions
Markov decision processes
state space
multi agent
neural network