SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture.
Amine Barrak, Mayssa Jaziri, Ranim Trabelsi, Fehmi Jaafar, Fábio Petrillo
Published in: CoRR (2023)
Keyphrases
- fault tolerant
- fault tolerance
- peer to peer
- load balancing
- distributed systems
- distributed computing
- high availability
- maximum likelihood
- management system
- distributed hash table
- state machine
- file sharing
- client server
- training set
- distributed environment
- super peer
- operating system
- low cost
- peer to peer networks
- sensor networks
- interconnection networks
- mobile agent system
- database systems