SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture.
Amine Barrak, Mayssa Jaziri, Ranim Trabelsi, Fehmi Jaafar, Fábio Petrillo
Published in: QRS (2023)
Keyphrases
- fault tolerant
- fault tolerance
- peer to peer
- load balancing
- distributed systems
- distributed computing
- maximum likelihood
- high availability
- state machine
- digital libraries
- management system
- training set
- file sharing
- ad hoc networks
- software development
- overlay network
- data replication
- safety critical
- interconnection networks
- artificial intelligence
- super peer
- content addressable
- high assurance