Fall avoidance of bipedal walking robot by profit sharing that can learn deterministic policy for POMDPs environments.
Toshihiro Suzuki, Yuko Osana
Published in: NaBIC (2014)
Keyphrases
- profit sharing
- reinforcement learning
- uncertain environments
- partially observable Markov decision processes
- optimal policy
- policy making
- autonomous robots
- multi robot
- robotic systems
- mobile robot
- policy search
- partially observable
- learning agent
- policy gradient
- Markov decision problems
- supply chain
- action selection
- dynamic environments
- vision system
- joint replenishment
- dynamic programming
- path planning
- Markov decision process
- belief state
- autonomous systems
- action space
- cooperative