Credit points: 3.0
This course discusses aspects of building distributed systems, with an emphasis on reliability.
Lectures are held on Sunday 12:30-14:20 in Taub 5.
Recitations are held on Sunday 14:30-15:20 in Taub 5.
Teaching Method:
The teaching method will follow the COVID-19 regulations in effect during the semester. By default, lectures are taught in person in class, recorded, and streamed live over Zoom on a best-effort basis; tutorials are taught in person in class and recorded. We will try to accommodate remote participants in class discussions as long as doing so does not harm the overall quality of the teaching, but priority will be given to students in the classroom. In case of network problems, students who are not physically in class will have to rely on the recording.
To view the live Zoom session, you need to log in to Zoom with your Technion account (@campus.technion.ac.il). Make sure you know your password.
All Zoom recordings (regardless of the teaching method) will also be available through Panopto until the end of this semester's exam period.
Office hours will be held both in person and over Zoom.
Grading policy:
The final grade consists of one programming assignment submitted in pairs (30%), two dry home assignments submitted individually (20% in total), and a final exam (50%). You must score at least 40 on the exam to pass the course.
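As a rough illustration of the weighting above, the following sketch computes a weighted grade on a 0-100 scale. It assumes the two dry assignments count equally (10% each), which the policy does not state explicitly, and the function names are hypothetical. It does not say what grade is recorded when the exam threshold is missed, since that is not specified here.

```python
def weighted_grade(programming: float, dry1: float, dry2: float, exam: float) -> float:
    """Weighted course grade on a 0-100 scale.

    Assumption: the two dry assignments are weighted equally (10% each),
    which together gives the stated 20%.
    """
    return (30 * programming + 10 * dry1 + 10 * dry2 + 50 * exam) / 100


def exam_threshold_met(exam: float) -> bool:
    """The course requires an exam grade of at least 40 to pass."""
    return exam >= 40


if __name__ == "__main__":
    grade = weighted_grade(programming=90, dry1=80, dry2=70, exam=60)
    print(grade, exam_threshold_met(60))  # prints "72.0 True"
```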
The programming assignment will be published around week 8 of the semester, and the dry assignments around weeks 5 and 12. (These dates are approximate.)
Homework late submissions:
Late submissions incur a penalty of 5 points per day for the first two days and 10 points per day for each day after that.
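The two-tier penalty can be sketched as a small function. This is only an illustration: the function name is hypothetical, it takes a whole number of late days, and how partial days are counted is not specified by the policy.

```python
def late_penalty(days_late: int) -> int:
    """Points deducted for a submission that is `days_late` whole days late."""
    if days_late <= 0:
        return 0
    first_tier = min(days_late, 2) * 5         # 5 points/day for the first two days
    second_tier = max(days_late - 2, 0) * 10   # 10 points/day for every later day
    return first_tier + second_tier


if __name__ == "__main__":
    for days in range(6):
        print(days, late_penalty(days))
```

For example, a submission that is three days late would lose 5 + 5 + 10 = 20 points under this reading.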
Prerequisites:
- Working knowledge of Java/Go/Kotlin (needed for programming assignments).
- General knowledge of OOP concepts.
- Operating Systems.
- Parallel and Distributed Programming.
- Courses in Distributed Algorithms and Introduction to Computer Networks can be helpful, but are not required.
Topics:
- Introduction
- Client/Server Programming, RPC, REST
- Replication (replicated state machine, primary/backup, quorum replication)
- Consensus and Failure Detection
- ZooKeeper
- Distributed Transactions
- Decentralized Group Membership
- Reliable and Ordered Multicast
- Scalability issues and the concept of gossip
- Peer-to-Peer Systems
- Distributed Storage Systems (and Distributed File Systems)
- Basics of Byzantine Fault Tolerance (including distributed aspects of blockchains)
- Cloud Based Analytics Systems
- Publish/Subscribe Systems
- Checkpoint/Restart