Deep Reinforcement Learning-based Rebalancing Policies for Profit Maximization of Relay Nodes in Payment Channel Networks.
Nikolaos Papadis, Leandros Tassiulas
Published in: IACR Cryptol. ePrint Arch. (2022)
Keyphrases
- profit maximization
- reinforcement learning
- optimal policy
- relay nodes
- Markov decision process
- cooperative
- computer networks
- control policy
- optimal control
- temporal difference
- learning algorithm
- multi agent
- control policies
- reward function
- Markov decision processes
- model free
- long run
- multi hop
- finite state
- finite horizon
- routing algorithm
- production system
- expected profit
- utility function