11-631: Data Science Seminar - Syllabus
Course Learning Outcomes
The main learning objectives of the course are for students to (a) demonstrate a basic understanding of the Data Science literature (via sample application areas, associated publication venues, and writing styles), (b) apply this understanding to specific publications (by writing and justifying academic evaluations of the work), (c) report on a Data Science publication and its related work in a comprehensive, collaborative presentation, (d) defend and critique reports and presentations on Data Science publications, via relevant statements, questions, and form-based evaluations, while participating in constructive discussion about such presentations, and (e) critically analyze and synthesize Data Science literature, both individually through the lens of a specialist role and collaboratively in a group.
All of these outcomes are essential preparation for the subsequent MCDS capstone course sequence (11-634 Capstone Planning Seminar, 11-632 Data Science Capstone, and 11-635 Data Science Capstone Research).
Time & Location
TR 05:00PM-06:20PM, TEP 1403
Course Format
In-Person. The course opens with an initial overview of the Data Science literature and tutorials on how to analyze and critique Data Science publications. The course also provides tutorials on preparing and presenting reviews of Data Science publications and related literature.
Course Organization
The main objective of the course is to become familiar with critically reading, reviewing, and presenting data science papers, in preparation for the capstone courses. As such, the course will involve the following activities, in which each student is expected to participate:
- Attend lectures (3) and guest lectures (3)
- Read a paper according to a specialist role, write a short summary, and discuss it with a group (3)
- Do a practice (1) and main (1) in-class presentation (individual or in pairs) for papers that you have read
- Attend in-class presentations by your colleagues, and ask a question for each presentation
- Write a paper review (1) and a literature survey (2) according to a theme of your choosing
- Write a constructive review of a capstone report (1) and of a capstone presentation (1)
Presentations
A main goal of the class is to learn how to clearly and effectively present research/papers to others, which is a core component of being a data scientist. There will be two opportunities to present papers in class:
- Practice presentations: Each student will present a paper to a smaller audience over Zoom, and will receive feedback from one of the TAs.
- Main presentations: Each student (individually or in a pair) will present a paper to the entire class.
Presentation length: There will be 4 presentations per class, which means each presentation must be at most 15 minutes long, with 5 minutes for questions from the audience.
Audience questions: During the main presentation phase, those who are not presenting that week must fill out an audience form with a question for each of the papers. The instructor will randomly select a couple of people to ask their question out loud.
Selection of papers: For the practice presentations, you will be assigned the paper that you have to present. For the main presentations, we will ask for paper suggestions from all students, and then assign presentation papers taking into account who nominated what paper.
Presentation format and visuals: Please refer to this guide for tips on how to present your paper: click here
Summaries and Discussions
Another main goal of the class is to be able to critically read a paper and discuss it with others. There will be a total of three discussion lectures, which will proceed as follows:
- You will be assigned a paper
- You will be assigned a specialist role, which will describe the lens through which you should present the paper
- Before the start of each discussion lecture, you will have to post a summary on Canvas that summarizes your talking points for the discussion.
- After each discussion lecture (or during), you will post a follow-up comment on your own summary on Canvas with one thing you learned from the discussion.
Summary format & length: You will post the summaries to Canvas, and keep them between 400 and 600 words. Ensure that your summaries are readable to someone who has not read the paper. Avoid huge walls of text, and feel free to use bulleted lists or other formatting. Do not post images, charts, or tables.
The specialist roles are the following:
- Reviewer
- Archaeologist
- Researcher
- Industry Expert
- Social Impact Assessor
See the following link for descriptions of what each role should do: click here
Review & literature survey
Another goal of this class is to prepare you for the capstone classes in the Spring and Fall, in which you will do a capstone research project of your choice. To prepare you for that project, you will have to work your way up to a literature survey on the topic area of your project, in a group of at most 4 students. This will encompass four steps:
- Team and topic choice: you will have to come up with a topic area, 4 relevant papers, and find your teammates.
- Paper review: you will each write a review of a paper, focusing on the remaining broad questions that the paper leaves open. Each teammate will get one of the papers you submitted. Write at most 1.5 pages of content.
- Draft literature survey: focusing on your task and topic area, summarize and compare the similarities and differences of your 4 related papers, as well as the remaining questions that the papers leave open. Write between 2 and 3 pages (excluding references).
- Literature survey: staying in the same topic area and task, you will choose an additional 4 papers, and discuss them together with the 4 papers from the draft literature survey (8 total) in a literature survey. Write between 5 and 6 pages (excluding references).
Format: LaTeX with BibTeX, using the ARR style format (LaTeX templates, also available as an Overleaf template).
All assignments from this sequence can be found here: click here
Capstone project review
Finally, during the last two weeks of class, students will attend at least one final second-year capstone presentation and review one draft capstone report written by a second-year MCDS student team.
ChatGPT red-teaming
Extra credit: Try asking ChatGPT/other AI platforms some questions related to a paper that someone presented in class, and assess the output’s correctness. Your goal is two-fold: (1) find an input question / prompt that will lead ChatGPT to produce something incorrect, and (2) explain what about the output is incorrect, and hypothesize why ChatGPT might have gotten it wrong.
You will get more points the more creative your input prompt/question is, and the better your explanation is for why it got it wrong.
Attendance Policy
This course will be held in person. You are responsible for completing the work assigned and seeking clarification as needed. Late work is generally not accepted without prior arrangement or proper justification.
- Attendance is required for:
  - Lectures
  - Discussions (absence will incur a penalty)
  - Assigned presentation slots (practice and main)
- During the presentation phase, on days you are not presenting:
  - You get 2 unexcused absences, i.e., you may be absent without telling the instructors beforehand.
  - For any further absences during the presentation phase, you will not get attendance points for that lecture.
  - If you are caught posting questions without being in class, you will receive a penalty worth 3 absences.
- In-advance excused absences:
  - If you have interviews or other pre-existing commitments, you can ask for an exceptional absence (up to 2 more), not including medical reasons.
- You can get extra credit for attending all guest lectures.
Grace Day Policy & Scheduling Presentations:
- You are allowed one homework grace day that you can use towards exactly one of the following assignments:
  - Paper review
  - Literature survey draft
  - Literature survey
  - Capstone report review
  - Capstone final presentation review
- For main presentations only, you are allowed to swap dates with another student exactly once. If you choose to swap main presentation slots (e.g., due to a job interview), you and your swapping partner must let the instructors know via email at least 3 days in advance (otherwise you will get a zero on the presentation assignment).
- There are no grace days for the discussion lectures (and corresponding summaries) or main presentation questions.
Assessment
Assessment type | Grade percentage |
---|---|
Practice presentation | 10 |
Main presentation | 15 |
Attend main presentations | 5 |
Three discussions / paper summaries | 30 (10 each) |
Paper review | 5 |
Literature survey draft | 10 |
Literature survey | 20 |
Capstone Report Review | 2.5 |
Capstone Final Presentation Review | 2.5 |
Extra credit: attending all three guest lectures | 1 |
Extra credit: red-teaming ChatGPT | 2 |
Extra credit: end-of-course survey | 2 |
TOTAL | 105 |
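
As a quick check on how the percentages above add up, the following Python sketch sums the regular components and the extra-credit items separately. The dictionary names and the split into "regular" and "extra credit" are illustrative assumptions for this sketch; it is not an official grade calculator.

```python
# Illustrative only: weights are copied from the assessment table above.
# The grouping into two dictionaries is an assumption made for this sketch.
regular = {
    "Practice presentation": 10,
    "Main presentation": 15,
    "Attend main presentations": 5,
    "Three discussions / paper summaries": 30,
    "Paper review": 5,
    "Literature survey draft": 10,
    "Literature survey": 20,
    "Capstone Report Review": 2.5,
    "Capstone Final Presentation Review": 2.5,
}

extra_credit = {
    "Attending all three guest lectures": 1,
    "Red-teaming ChatGPT": 2,
    "End-of-course survey": 2,
}

# Sum each group, then combine them to compare against the TOTAL row.
regular_total = sum(regular.values())
extra_total = sum(extra_credit.values())
grand_total = regular_total + extra_total

print(f"Regular components: {regular_total}")
print(f"Extra credit:       {extra_total}")
print(f"Maximum total:      {grand_total}")
```

Under this reading, the regular components sum to 100 and the three extra-credit items contribute up to 5 more, which matches the TOTAL row of 105.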
AIV Policy
Collaboration policy: When preparing each presentation, you may share work with your assigned teammates and no other students. In particular, when your paper is also being presented by another team in the same or a different section of this course, you may not collaborate or share work with students on that team. Similarly, all other deliverables in the course are individual assignments. You are required to synthesize, research literature, and produce the document by yourself without working with your classmates. This course is intended to give you experience in autonomous research, so trying to delegate or shortcut preparation is a wasted learning opportunity. Acting against this rule will be considered an academic integrity violation and lead to reprimands, including possible dismissal from the program (see the MCDS Handbook).
For your paper summaries and comparative analyses, you must produce your own work. You may discuss the papers with classmates, but the submissions must be your own work. Do not use the internet or other sources to find prior analyses to complete your assignments.
Plagiarism and AIV policy: The presentation and related work survey emphasize a literature search and compare/contrast to other material. All material you find and use in any of the course deliverables must be explicitly and correctly referenced/cited. Notes:
- Directly copying text from the paper being summarized, and/or from author websites or other sources, without using “quotation marks” around everything that is a direct quote, followed by a reference to the source being quoted, is plagiarism.
- Text and/or slides copied directly from other sources without attribution in presentations is also considered plagiarism.
Here are some resources for learning what is and isn’t plagiarism: