Credit points: 3.0
This course discusses aspects of building distributed systems, with an emphasis on reliability.
Lectures are held on Sundays, 12:30-14:20 (room TBA).
Recitations are held on Sundays, 14:30-15:20 (room TBA).
Teaching Method:
Update from 30/9/2020: Like the rest of the Technion, we will start the semester with Zoom-only teaching.
Important: To view the online Zoom sessions, you will need to log in to Zoom with your Technion account (@campus.technion.ac.il). Make sure you know your password.
Depending on how the health situation and the relevant regulations evolve during the semester, we may switch to hybrid teaching, but this has not been decided yet.
Hybrid teaching means a live lecture/recitation that students may attend in person, subject to the Purple badge regulations in force at the time of the class, and that is also recorded and broadcast live through Zoom. In hybrid mode, the lecturer/TA will occasionally check the Zoom chat for questions, or you can raise your hand.
All Zoom recordings (regardless of how we teach) will also be available through Panopto until the end of this semester's exam period.
Office hours will be held over Zoom.
Grading policy:
The final grade is composed of one programming assignment to be submitted in pairs (30%), two dry (written) home assignments to be submitted individually (20%), and a final exam (50%). You need to get at least 40 in the exam in order to pass the course.
The programming assignment will be published around the 8th week of the semester. The dry assignments will be published around weeks 5 and 12. These dates are approximate.
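As a rough illustration of the weighting described in the grading policy, here is a minimal Python sketch. It assumes that the 20% for the dry assignments is split evenly between the two (10% each) and that all grades are on a 0-100 scale; neither is stated explicitly above. The function name `final_grade` and its parameters are hypothetical. The sketch only reports whether the exam threshold of 40 was met, since the policy does not say what grade is recorded otherwise.

```python
def final_grade(programming, dry_assignments, exam):
    """Return (weighted_grade, exam_threshold_met) for grades on a 0-100 scale.

    Assumption: the two dry assignments share their 20% equally.
    """
    if len(dry_assignments) != 2:
        raise ValueError("expected exactly two dry assignment grades")
    dry_average = sum(dry_assignments) / 2
    # Weights from the grading policy: 30% programming, 20% dry, 50% exam.
    weighted = (30 * programming + 20 * dry_average + 50 * exam) / 100
    # The policy requires at least 40 in the exam to pass the course.
    exam_threshold_met = exam >= 40
    return weighted, exam_threshold_met


grade, exam_ok = final_grade(programming=90, dry_assignments=[80, 100], exam=70)
print(grade, exam_ok)
```

Note that a high weighted grade does not help if the exam grade is below 40: in that case the second returned value is False regardless of the other components.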
Late homework submissions:
Late submissions incur a penalty of 5 points per day for the first two days and 10 points per day for each day after that.
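The late-submission rule above can be sketched as a small Python function. This is only an illustration: it assumes lateness is counted in whole days and that the penalty is not capped, neither of which is specified above. The name `late_penalty` is hypothetical.

```python
def late_penalty(days_late):
    """Return the points deducted for a submission that is days_late days late.

    Assumptions: days_late is a non-negative whole number of days,
    and the penalty has no upper cap.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    first_days = min(days_late, 2)       # charged at 5 points per day
    later_days = max(days_late - 2, 0)   # charged at 10 points per day
    return 5 * first_days + 10 * later_days


for days in (0, 1, 2, 3, 5):
    print(days, late_penalty(days))
```

Under these assumptions, a submission that is three days late loses 5 + 5 + 10 = 20 points.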
Prerequisites:
- Working knowledge of Java/JavaScript (needed for programming assignments).
- General knowledge of OOP concepts.
- Operating Systems.
- Parallel and Distributed Programming.
- Courses such as Distributed Algorithms and Introduction to Computer Networks are helpful but not required.
Topics:
- Introduction
- Client/Server Programming, RPC, REST
- Replication (replicated state machine, primary/backup, quorum replication)
- Consensus and Failure Detection
- ZooKeeper
- Distributed Transactions
- Decentralized Group Membership
- Reliable and Ordered Multicast
- Scalability issues and the concept of gossip
- Peer-to-Peer Systems
- Distributed Storage Systems (and Distributed File Systems)
- Basics of Byzantine Fault Tolerance (including distributed aspects of blockchains)
- Cloud Based Analytics Systems
- Publish/Subscribe Systems
- Checkpoint/Restart