Data Science Masters

June 10, 2021

Meet the Organizers of the COVID-19 Hackathon

Recently, we caught up with current MSDS student Sepi Dibay and recent MSDS alumna Deepthi Hegde about the successful COVID-19 Hackathon they organized in summer 2020.


Bios:

Sepideh (Sepi) Dibay immigrated to the United States in 2009 and pursued her Master of Public Health and Ph.D. in Epidemiology. She did her postdoctoral research at Fred Hutchinson Cancer Research Center and is currently interning at Amazon as a Research Scientist. Sepi has extensive experience designing and conducting research and analyzing observational and experimental data. She is also pursuing a Master of Science in Data Science at UW to enhance her proficiency and expand her domain versatility.

 


Deepthi Hegde is a data scientist who is passionate about building real-time products that are scalable. She is currently with Microsoft and is a recent graduate of the Master of Science in Data Science program at UW. While at UW, she did research internships at Google and Nike, focusing on deep learning applications in computer vision. Before that, she was a researcher at Carnegie Mellon University where she worked on various machine learning projects. In her free time, she loves to mentor students on interviewing and jobs in data science.

 

What motivated you to organize the COVID-19 Hackathon?

It was two months into the pandemic and the situation wasn’t getting any better. We were tired of staying home and were looking for meaningful ways to contribute to the cause. We tried different channels, such as volunteering for the State of Washington Health Department, but since everything was new and this deadly virus was spreading quickly, we could not find a meaningful way to contribute. As data scientists, we believed in the power of data to combat the situation. We thought that by coming together as a community, combining research efforts, and sharing insights, we could create more impact than each of us could individually. That’s when we decided to organize an online hackathon.

 

How many participants were there in total? 

More than 100 students joined the competition, and 42 participants made a submission.


How many teams submitted projects?

We had 13 teams submit their projects. Each team had 3-5 members.


The event was virtual because of the COVID-19 pandemic. How did the fact that it was completely online impact the event? 

This was a very new way of organizing a hackathon, and it required a lot of coordination and arrangements to spread the word and engage the participants. Even though the online format limited our ability to collaborate in person, it definitely helped us get participation from around the world. We had several teams with members in different time zones working around the clock. We were also able to bring in data science experts from different states to offer introductory workshops on the first day of the hackathon.


What platforms did you use to host the hackathon? Can you describe how participants and teams were able to participate virtually?

We used Slack extensively for offline communications with the participants before and during the hackathon. During the two days of the hackathon, all the workshops, events, and check-ins were held via Zoom. For collaboration on the projects, we asked teams to use GitHub, which was also how they made their submissions.


Who were the judges? 

  • Tim Randolph, Associate Member at Fred Hutchinson Cancer Research Center 
  • Anna Talman Rapp, Program Officer at the Bill & Melinda Gates Foundation
  • Duncan Wadsworth, Data Scientist at Microsoft
  • Ying Li, Chief Scientist at Giving Tech Labs

 

What workshops did you conduct? 

We conducted 2 introductory workshops on Day 1:

  • Intro to NLP (natural language processing) by Grishma Jena, Data Scientist at IBM
  • Intro to Time Series by Stanislav Panev, Project Scientist at Carnegie Mellon University

 

What kinds of datasets did teams use?

We provided two datasets (OxCGRT: COVID Policy Tracker, NYTimes: COVID-19 Data) for teams to explore. However, in the spirit of open-ended research and creativity, participants also had the option of using any other public dataset they liked, and we did see several teams take advantage of that option.

 

Describe the award categories.

Track I: Best Storytelling/Data-Science Process

  • Clear hypotheses and assumptions
  • Exploratory data analysis
  • Problem solving
  • Comprehensive take-aways
  • Reproducibility

Track II: Best Prediction Model

  • Problem setup and metric definition
  • Quality of features 
  • Explanation of choice of model
  • Model evaluation
  • Explainability and model interpretation

Track III: Best Interactive Visualization/Dashboard

  • Simplicity and ease of navigation
  • Choice of encodings and colors
  • Ease of understanding
  • Impact and take-aways
  • Documentation


Describe the winning teams’ projects.

The Unpredictables won the prediction model category. This group investigated the impact of governmental policies on COVID-19 infection rates in the three states with the highest number of cases at that time (California, New York, and Pennsylvania).

Curious Duo won the storytelling category. This group focused its analysis on two states, Washington and Florida. The objective was to identify and collect tweets from each state, then identify the sentiment trends among that state's users and how these impacted the spread of COVID-19.

Data visualization had two winners: 

The visualization by Java’s Just Coffee lets users interactively explore COVID-related data on the number of cases and deaths and on the policies governments have focused on to counteract the pandemic. It also lets users explore how people have responded to COVID in the United States.

JiaLiDun built a visualization showing the effectiveness of governments’ policy responses to the COVID-19 pandemic in different countries. This group looked at three major categories of policies: containment and closure policies, economic policies, and health system policies. Within each category, the different levels of stringency were also taken into consideration.