Data Science Masters

DATA 511: Data Visualization for Data Scientists

Introduction to the visual tools and techniques used in modern data science to develop and deploy data-driven insights. Provides a foundation for visualization to support exploratory analysis, statistical modeling, machine learning, and presentation of results on structured and unstructured data. Students develop and present deep analyses for wider audiences.

DATA 512: Human-Centered Data Science

This course focuses on fundamental principles of data science and its human implications. We’ll cover data ethics; data privacy; differential privacy; algorithmic bias; legal frameworks and intellectual property; provenance and reproducibility; data curation and preservation; user experience design and usability testing for big data; ethics of crowd work; data communication; and societal impacts of data science.

DATA 514: Data Management for Data Science

This course introduces students to database management systems and techniques that use these systems. Topics covered include data models; query languages; database tuning and optimization; data warehousing; and parallel processing.

DATA 515: Software Design for Data Science

This course introduces students to software design and engineering practices and concepts, including version control, testing and automatic build management.

DATA 516: Scalable Data Systems and Algorithms

This course focuses on principles and algorithms for data management and analysis at scale. We’ll cover the design and use of traditional and modern big data systems, as well as the basics of cloud computing.

DATA 556: Introduction to Statistics and Probability

In this course, you’ll get an overview of probability; conditional probability and independence; Bayes’ theorem; discrete and continuous random variables, including jointly distributed random variables; key distributions, including the normal distribution and its spin-offs; properties of expectation and variance; conditional expectation; covariance and correlation; the central limit theorem; the law of large numbers; and parameter estimation.

DATA 557: Applied Statistics and Experimental Design

This course focuses on inferential statistical methods for discrete and continuous random variables, including tests for differences in means and proportions; linear and logistic regression; causation versus correlation; confounding; resampling methods; and study design.

DATA 558: Statistical Machine Learning for Data Scientists

This course covers the bias-variance trade-off; training versus test error; overfitting; cross-validation; subset selection methods; regularized approaches for linear/logistic regression: ridge and lasso; non-parametric regression: trees, bagging, random forests; local regression and splines; generalized additive models; support vector machines; k-means and hierarchical clustering; and principal components analysis.

DATA 590: Data Science Capstone I – Project Preparation

This course is part one of a two-course capstone sequence in which students organize project teams, select project topics, write a project proposal, and begin preparing project data sets.

DATA 591: Data Science Capstone II – Project Implementation

This course is part two of a two-course capstone sequence designed to build upon the student-driven project from DATA 590. Students synthesize and apply knowledge and techniques acquired throughout the Master of Science in Data Science program for working with large data sets, deriving insights from data and sharing insights with other people.

DATA 598: Special Topics in Data Science

This 1-credit course helps students practice staying on the cutting edge of the data science field by discussing current papers on a variety of essential data science concepts.