11-632: Data Science Capstone - Syllabus
Time & Location
TR, 08:00AM - 09:20AM, TEP 1403
Course Description
The Data Science Capstone and Data Science Capstone Research courses (11-632 and 11-635) are a continuation of the MCDS Capstone course sequence. Students will build upon the foundation laid in the Data Science Capstone Planning course (11-634) as they work towards completing their assigned capstone projects.
Upon completing the full MCDS Capstone course sequence (11-634, 11-632, 11-635), students should be able to:
- Analyze computational data science problems in different application domains and critique solutions to those problems.
- Design, implement and evaluate a software solution (comprising software system(s) and/or machine learning model(s)) on real world datasets at real world scale.
- Organize, present, and report on a real-world data science project in collaboration with other researchers/programmers.
Course Format
In-person class sessions are held twice a week, with the scheduled dates available in the course calendar.
Outside of class meetings, student teams will work on their assigned Capstone projects under the supervision of the project advisor and, where applicable, in collaboration with other students/faculty.
Course Organization
The primary objective of the Data Science Capstone and Data Science Capstone Research courses is to guide you through the completion of a comprehensive data science project, building on the foundation established in the Data Science Capstone Planning course. Throughout these courses, you will deepen your technical expertise, enhance your collaboration skills, and further develop your ability to carry a project from initial planning through to final execution and presentation.
In this course, you will:
- Develop and submit a Fall Plan, outlining your project goals and receiving feedback to guide your progress.
- Engage actively in class through one-on-one standup meetings with instructors and TAs.
- Submit a content-complete draft report, ensuring that all sections are well-developed and free of errors.
- Prepare a Final Report, incorporating feedback from your final presentation, which will be evaluated as if submitted to a peer-reviewed workshop.
- Deliver a Final Poster Presentation of your project, demonstrating your results and insights.
- Maintain a Project Code Repository, documenting your technical process and decisions.
- Conduct an End-of-Semester Internal Evaluation to appraise the overall quality of your project and to evaluate both your own and your teammates’ contributions to the project and to each other’s learning.
- Reflect on your Individual Growth as a data scientist, updating your Personal Learning Objectives (PLOs) and submitting a reflection on your development.
Throughout the course, you are expected to take ownership of your project, proactively seek assistance when needed, and consult with your project mentor or instructors if any requirements are unclear. Detailed grading criteria will be provided on Canvas to guide you through the assessment process.
Technical Process Criteria
The Capstone project is also an exercise in proper software engineering. Your technical process evaluation will consider the following factors:
- Every project is required to use a GitHub or Bitbucket repository.
- Every team member is expected to produce regular and sensible commits.
- To-do items and nontrivial ongoing tasks are to be organized and documented in the GitHub issue system. This documentation is particularly important for planning milestones and action items produced during weekly meetings.
- Documentation must include a plan with timelines and milestones. Time and labor estimates for tasks are also a critical part of a project plan.
- Any documentation that will be needed on an ongoing basis (e.g., APIs, file formats, etc.) is to be kept in the GitHub repository readme and/or wiki pages.
- Code quality will be accounted for by the mentor and/or peer review.
Assessment
The grade will consist of an assessment of the quality of the data science experiment, its results, the technical process over the course of the semester, peer evaluations, and your individual growth as a data scientist.
The course grade will be based on the following:
Assignment | Weight |
---|---|
Participation | 5% |
Biweekly Standup with Instructor | 10% |
Biweekly Standup with TA | 10% |
Presentation Peer Review | 5% |
Draft Report | 5% |
Final Report | 20% |
Poster Presentation | 25% |
Updated Personal Learning Objectives | 5% |
Personal Learning Objectives Reflection | 5% |
Poster Peer Review | 5% |
Project Code Repository | 5% |
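As an illustration of how the weights above combine, here is a minimal sketch that computes a weighted course score from per-component scores. It assumes each component is scored on a 0 to 100 scale. The function name `weighted_course_score` and the dictionary layout are hypothetical, and the sketch does not account for the normalization or individual adjustments described under Grade Policy.

```python
# Weights copied from the assessment table (percent of the course grade).
WEIGHTS = {
    "Participation": 5,
    "Biweekly Standup with Instructor": 10,
    "Biweekly Standup with TA": 10,
    "Presentation Peer Review": 5,
    "Draft Report": 5,
    "Final Report": 20,
    "Poster Presentation": 25,
    "Updated Personal Learning Objectives": 5,
    "Personal Learning Objectives Reflection": 5,
    "Poster Peer Review": 5,
    "Project Code Repository": 5,
}


def weighted_course_score(scores: dict[str, float]) -> float:
    """Combine per-component scores (assumed 0-100) using the table weights."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Each weight is a percentage, so divide the weighted sum by 100.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {name: 80.0 for name in WEIGHTS}
    example["Poster Presentation"] = 100.0  # hypothetical scores
    print(weighted_course_score(example))
```

Because the poster presentation and final report together carry 45% of the grade, a change in either moves the total far more than any single 5% component.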
Students are expected to take ownership of the project, take the initiative in driving the development forward and autonomously seek help when getting stuck. If requirements are unclear at any point, please talk to your project mentor or the instructors. For a detailed rubric of how the system, experiment, and results are assessed, you will be directed to the grading criteria document as posted on Canvas during the semester.
Weekly Standups
Starting in week 4, your team will deliver a 5- to 10-minute presentation each week summarizing your project’s progress from the past week. The presentation must use the quad chart template provided on Canvas and should focus on detailed progress updates rather than an extended introduction. The aim is to clearly communicate your work and facilitate productive discussions.
To accommodate the large number of teams, these standups will alternate biweekly between meetings with the instructor and meetings with a designated TA. These meetings will not occur during regular class times:
- Instructor Meetings: Held Tuesday to Thursday, 1:00–5:00 PM in GHC 5419.
- TA Meetings: Teams and TAs will coordinate the time and location, with the option to meet on Zoom.
Each standup will be graded, and attendance is mandatory.
Poster Peer Review
Each student must attend at least three poster presentations by other teams, including one assigned team and two of their choice. They are required to submit a brief review of the assigned team’s poster.
Grade Policy
Individual Grade Adjustments
In our team-based academic settings, it is recognized that uniform grading may not always reflect individual contributions. Hence, instructors have the discretion to modify a student’s grade relative to the team’s collective grade. This adjustment is grounded in a thorough evaluation of various factors such as the student’s effort, active participation in course work, professionalism, ability to work collaboratively, and demonstration of a growth mindset.
Grading Methodology
Our grading process begins by normalizing scores: we compute the mean and variance of the score distribution, and individual grades are then determined by how many standard deviations each score lies from the mean.
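For readers unfamiliar with this kind of normalization, the sketch below shows one common way to express a score as a number of standard deviations from the mean (a z-score). This is only an illustration: the syllabus does not specify whether the population or sample standard deviation is used, nor the cutoffs that map normalized scores to letter grades, so the use of the population standard deviation and the function name `z_scores` are assumptions.

```python
from statistics import mean, pstdev


def z_scores(raw_scores: list[float]) -> list[float]:
    """Return each score's distance from the mean in standard deviations.

    Assumption: population standard deviation (pstdev); the actual
    course procedure may differ.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        # All scores are identical, so no score deviates from the mean.
        return [0.0 for _ in raw_scores]
    return [(score - mu) / sigma for score in raw_scores]


if __name__ == "__main__":
    # Hypothetical raw scores for three students.
    for raw, z in zip([70, 80, 90], z_scores([70, 80, 90])):
        print(f"raw={raw} z={z:+.2f}")
```

A z-score of 0 means the score equals the class mean; positive values are above it and negative values below it.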
Exceptional Performance Recognition
We reserve the highest accolade, the ‘A+’ grade, for students who not only secure top marks but also demonstrate significant research impact. This decision is made collectively by course instructors and mentors, ensuring a fair and comprehensive assessment of each student’s academic prowess.
Regrading Policy
All grading disputes and regrading requests must be made within 7 days after the grade is released. No requests will be accepted after this deadline.
Attendance Policy
Attendance is a critical component of the participation score, which accounts for 5% of the overall grade. Attendance at all class meetings is mandatory. Our class sessions are specifically designed for lectures, in-class standups, and presentations. Experience from previous cohorts strongly indicates that regular attendance is crucial for both the success of the team and the individual growth of each student. Therefore, each student is expected to attend every session as scheduled.
In recognition of the unpredictable nature of circumstances, each student will be allowed one absence per semester. This absence is permissible only for sessions where the student’s team is not scheduled to present. It should be noted that interview appointments or similar commitments are not considered valid reasons for additional absences beyond this single allowance.
Academic Integrity
For all team presentations and the final report, it’s imperative that students present their work in a manner that distinctly outlines their contributions and differentiates them from existing work. This includes clearly indicating parts of the project that have been influenced by, derived from, or adapted from prior works. It is crucial that these external influences are not only acknowledged but also properly cited in both the report and presentation slides.
Appropriate citation must cover a wide range of materials, including but not limited to:
- Academic writings (including those published by collaborators of the project)
- Diagrams and visual aids
- Datasets utilized in the project
- Reports from previous Capstone projects
- Video tutorials and scientific blog posts
- Technical components such as algorithms, software libraries, and similar tools
When incorporating these materials into your work, two key principles must be followed:
- Paraphrasing: When the text from a source is paraphrased (i.e., rewritten in your own words), it is essential to cite the original source to acknowledge its influence on your work.
- Direct Quotation: If a piece of text is used verbatim (exactly as it appears in the source), it must be placed within quotation marks and accompanied by a specific reference to its origin.
Ensuring the integrity of your work through proper citations is not only a scholarly requirement but also a mark of respect for the intellectual property of others. This practice is central to upholding academic honesty and fostering a culture of responsibility and ethical scholarship in our academic community.
Assignment Submission Confidentiality and Use for Course Improvement
All assignment submissions for this course will be utilized for course analytics and to enhance future course offerings. These submissions will remain confidential and will not be made publicly available, unless express written consent is provided by the authors of the submitted work.