Updated Lecture 5 Notes + clarification for HW1 | |
Dear Students, Thanks again for an interactive lecture. Following the lecture, please find the updated lecture notes for lecture 5. Changes compared to the notes from before the lecture include: 1. A few more takeaways highlighted (JL Lemma, Chernoff Bounds + a template for proving these, and the probabilistic method). 2. Remark 5.6 added to motivate why we cared about the expected distance, or revisiting our analysis of the expectation of the Tug-of-War sketch. 3. The exact Chernoff bound for our distributions of interest were added. As mentioned in class, the variables we should have focused on were based on M rather than its scaled version A. (Or at least should have factored in the difference between them). This adds a k term in the exponent of the denominator in the Chernoff bound in Equation 5.2, which we needed later.. The notes also include the Chernoff bound for the lower tail, which is mostly the same analysis, only starting with a negative t<0. If you're less familiar with Chernoff bounds, please take a look. Following questions about HW1 by a few students: here are a few clarifications / typo fixes (thanks Benny, Tal, and Yuval!): In Question 1.1(c): you should upper bound the number of changes to the sample *in expectation* (and not in the worst case, which is clearly false). In Question 1.3(d): you should show that Y_k/k (and not Y_k) is an (eps/4,delta)-approximation of 1/(n+1). Corrections have been marked in blue in the updated HW sheet, under the HW tab. Cheers, David |
עדכון אחרון ב-10/12/2024, 14:13:40 Last updated on 10/12/2024, 14:13:40 Последняя модификация10/12/2024, 14:13:40 تمت الحتلنة الأخيرة ب-10/12/2024, 14:13:40 |
Updated Lecture 4 Notes | |
Dear all, The lecture notes for lecture 4 have been updated to reflect: (1) Our discussion on the communication complexity of equality. See the appendix.* (2) A proof sketch of the coding theoretic lemma we used without proof. See the second appendix. (3) A corrected statement of Theorem 4.11: for any f_k estimation there is some relative error c=c(k) that is impossible without using logarithmic space (though for some, 1/10 relative error is achievable). (4) See also the proof of the eps^(-2) lower bound for f_0 estimation. * For those of you wondering how private coin help solve equality, I'll mention that some ideas we saw in class allow to solve equality with private randomness. You are welcome to think of this for yourself, and/or check out the appendix for a proof. Cheers, David |
עדכון אחרון ב-8/12/2024, 17:01:34 Last updated on 8/12/2024, 17:01:34 Последняя модификация8/12/2024, 17:01:34 تمت الحتلنة الأخيرة ب-8/12/2024, 17:01:34 |
HW1 out | |
Dear students, HW1 is now out. Due date is 23/12. I recommend starting soon. In particular, it is good to start with Question 1.1 parts (a) and (b) before next week's lesson. For Question 1.5, it might help to wait until after tomorrow's lesson. As a reminder, submission in pairs is allowed. Your submissions should be typed, ideally in LaTeX. Enjoy! David |
עדכון אחרון ב-2/12/2024, 09:36:10 Last updated on 2/12/2024, 09:36:10 Последняя модификация2/12/2024, 09:36:10 تمت الحتلنة الأخيرة ب-2/12/2024, 09:36:10 |
Back to in-person lectures (Taub 3) | |
Dear Students, As you may have seen, the Technion has announced that classes are moving back to in-person format. Therefore, our lectures will move to their originally scheduled location, in Taub 3. Looking forward to even more interactive lectures! Wishing us a quiet and enjoyable rest of the semester. Best, David |
עדכון אחרון ב-30/11/2024, 19:20:03 Last updated on 30/11/2024, 19:20:03 Последняя модификация30/11/2024, 19:20:03 تمت الحتلنة الأخيرة ب-30/11/2024, 19:20:03 |
Today's office hours + Lecture 3 Whiteboard | |
Dear students, Today's office hours will not take place in person. If you want to meet (possibly at a later time) to ask me questions, please email me and we can schedule a zoom meeting. On another note: thanks for another interactive lesson. The whiteboard for lecture 3 has been added to added to the course material. The topics from the lecture notes not covered in class in detail provide more practice / examples of tricks seen in previous lectures. Wishing us all a quiet rest of the week, David |
פורסם ב-26/11/2024, 12:41:19 Created on 26/11/2024, 12:41:19 Создано26/11/2024, 12:41:19 تم النشر ب-26/11/2024, 12:41:19 |
Lecture 2 whiteboard + addition to lecture 1 | |
Dear students, The whiteboard from yesterday's lecture (Lecture 2) was uploaded. Thanks to the students who attended and participated, and thanks for the votes on preferred proofs of Lemma 1. For Lecture 1, the construction of k-wise independent hash functions presented was insufficient for our needs, since we require to map elements in [m] to a much smaller range [2/eps] (thanks Michael for pointing this out!) The fix is fairly simple, and effectively boils down to taking remainders modulo the size of the smaller range. See the updated slides (especially the text in blue). For those who haven't checked this out yet, I remind you that the end of Lecture 1's slides have a numeric example to provide some explanation of why these kind of sketching algorithms are so popular in practice. (In short: these algorithms provide savings of several orders of magnitude in space!) Best, David |
עדכון אחרון ב-20/11/2024, 13:12:58 Last updated on 20/11/2024, 13:12:58 Последняя модификация20/11/2024, 13:12:58 تمت الحتلنة الأخيرة ب-20/11/2024, 13:12:58 |
Lecture 2 notes | |
Dear students, The notes for today's lecture have been updated (minor change of flow of the lecture). If you use the notes during class, do make sure to download the most recent notes. See you soon, David |
פורסם ב-19/11/2024, 09:24:40 Created on 19/11/2024, 09:24:40 Создано19/11/2024, 09:24:40 تم النشر ب-19/11/2024, 09:24:40 |
Lecture 1 notes | |
Dear students, Please note that the notes for Lecture 1 now include: (1) a discussion of the Reservoir Sampling algorithm we saw in the last ten minutes of class. (2) a quick numerical example and discussion explaining the popularity in practice of streaming algorithms. (Thanks for Adar for pointing out that (1) wasn't actually uploaded, and asking about (2).) Best, David |
עדכון אחרון ב-14/11/2024, 15:44:11 Last updated on 14/11/2024, 15:44:11 Последняя модификация14/11/2024, 15:44:11 تمت الحتلنة الأخيرة ب-14/11/2024, 15:44:11 |
Extra office hours to questions regarding the first lecture | |
Dear students, To help students who missed the first lecture and were worried about making up the missed material, I will hold an exceptional extra office hour this week on Thursday 11:00-12:00 for clarification questions regarding the first lecture (see course material). If you cannot attend this extra office hour and missed the first lesson, you are strongly recommended to attend my office hour next Tuesday an hour after class. (Email me if meeting over Zoom for the office hour is easier.) As a reminder: for the second lecture, make sure to go over HW0 for a probability refresher. Best, David |
עדכון אחרון ב-13/11/2024, 10:58:19 Last updated on 13/11/2024, 10:58:19 Последняя модификация13/11/2024, 10:58:19 تمت الحتلنة الأخيرة ب-13/11/2024, 10:58:19 |
Lecture 1 summary: | |
Dear Students, Thanks to the students who came to the first lecture and contributed to an interactive and engaging lesson. I hope you enjoyed the discussion. Following the lesson, I uploaded the lecture whiteboard and updated the notes to include an appendix on Reservoir Sampling. Looking forward to seeing you in the next lectures. (To best prepare: please finish HW0 and go over Lecture 1, especially if you missed it.) Best, David |
עדכון אחרון ב-12/11/2024, 14:33:00 Last updated on 12/11/2024, 14:33:00 Последняя модификация12/11/2024, 14:33:00 تمت الحتلنة الأخيرة ب-12/11/2024, 14:33:00 |
Welcome to Foundations of Algorithms for Massive Datasets! | |
Dear students, Welcome to the course. I'm looking forward to discussing some elegant algorithms (and lower bounds) for massive data. See the course syllabus for tentative topics. We may cover these in less detail and also cover other topics. As per current Technion regulations, we will have our classes *over zoom*. See course syllabus for the zoom room link and some tips and requests to make lectures as interactive (and effective) as possible given the circumstances. To prepare for the first few lessons, I uploaded a *non-mandatory HW0*. This homework is intended as a reminder of some basic probabilistic tools that you have seen before and we will use in the course. It is recommended you do at least parts (a) and (b) before the first lecture, and the other parts before Lecture 2. Looking forward to seeing you in (virtual) class, and wishing us all a productive and quiet semester! Best, David |
עדכון אחרון ב-6/11/2024, 18:35:23 Last updated on 6/11/2024, 18:35:23 Последняя модификация6/11/2024, 18:35:23 تمت الحتلنة الأخيرة ب-6/11/2024, 18:35:23 |