DCRAC: Deep Conditioned Recurrent Actor-Critic for Multi-Objective Partially Observable Environments.
Xiaodong Nian, Athirai Aravazhi Irissappane, Diederik M. Roijers
Published in: AAMAS (2020)
Keyphrases
- partially observable environments
- actor critic
- multi objective
- reinforcement learning algorithms
- reinforcement learning
- temporal difference
- evolutionary algorithm
- model free
- policy gradient
- state space
- markov decision processes
- optimization algorithm
- partially observable markov decision processes
- genetic algorithm
- particle swarm optimization
- function approximation
- inverse reinforcement learning
- partially observable
- learning algorithm
- objective function
- optimal control
- reward function
- recurrent neural networks
- gradient method
- approximate dynamic programming
- dynamic programming
- machine learning
- stochastic games
- heuristic search
- state variables
- average reward
- policy iteration
- optimization problems
- dynamic environments
- neural network