MA-TREX: Mutli-agent Trajectory-Ranked Reward Extrapolation via Inverse Reinforcement Learning.
Sili Huang, Bo Yang, Hechang Chen, Haiyin Piao, Zhixiao Sun, Yi Chang
Published in: KSEM (2) (2020)
Keyphrases
- inverse reinforcement learning
- reward function
- development environment
- partially observable environments
- Bayesian nonparametric
- quality assurance
- preference elicitation
- multiple agents
- reinforcement learning
- Markov decision processes
- state space
- partially observable
- optimal policy
- transition probabilities
- multiagent systems
- multi agent systems
- reinforcement learning algorithms
- Markov decision process
- temporal difference
- search space
- state variables
- learning agent
- simple examples
- multi agent
- software engineering
- user interface