
11-637: Foundations of Computational Data Science

Spring 2024

11-637 Foundations of Computational Data Science (FCDS) is a fully online course offered by the Master of Computational Data Science (MCDS) program, School of Computer Science, Carnegie Mellon University (CMU). The course is offered in all semesters (Spring, Summer, and Fall) and is open to MCDS and non-MCDS students from all CMU programs and campuses. The course is also open to non-CMU students.

Course Description

This course introduces foundational concepts, learning material, and projects related to the three core areas of Data Science: Computing Systems, Analytics, and Human-Centered Data Science. Students completing this class will be prepared for further graduate education in Data Science and/or Artificial Intelligence. Students acquire skills in solution design (e.g., architecture, framework APIs, cloud computing), analytic algorithms (e.g., classification, clustering, ranking, prediction), interactive analysis (e.g., Jupyter Notebook), applications to data science domains (e.g., Natural Language Processing, Computer Vision), and visualization techniques for data analysis, solution optimization, and performance measurement on real-world tasks.

Course Goals

This course will equip students with the foundational knowledge of computational data science. Students will learn about the Data Science Process by completing projects that introduce problem identification, data gathering, exploratory data analysis, supervised and unsupervised learning techniques, model evaluation, and the visualization and interpretation of results to inform decision-making. Our goal is for students to develop the skills needed to become practitioners or carry out research projects in computational data science. Specifically, students are exposed to real-world data and scenarios to learn how to:

  1. Define analytic requirements and develop appropriate questions to guide the solution design process.
  2. Design a data-gathering plan incorporating principles of data governance and sovereignty to ensure data usability, integrity, security, and availability.
  3. Use univariate and multivariate visual and non-visual techniques to identify trends, patterns, and outliers in large datasets.
  4. Build and deploy models using the appropriate analytic algorithms (such as linear and logistic regression, k-nearest neighbors, naive Bayes, k-means, and hierarchical clustering) to gain understanding from data, make predictions to solve business problems, and inform decision-making.
  5. Assess the goodness of fit between a model and data using model evaluation metrics, and evaluate predictive models using cross-validation frameworks.

Through this process, we aspire for our students to become independent and resilient problem solvers who can overcome challenges and learn from them.

Who We Are

Teaching Staff

Teaching Assistants

Course Developers