UW Information Technology

September 23, 2019

New UW Data Collaborative seeks to bring latest computing tools and data to researchers

By Ignacio Lobos

Imagine a researcher at work in a small, windowless “cold room” with an automatic locking door and a desktop computer that cannot connect to the internet, all to protect highly restricted health and population datasets.

Cold rooms offer a strict environment that keeps data safe. But in a highly collaborative institution such as the UW, where cutting-edge research performed by multidisciplinary teams is now the norm, confining data to a single cold room, and to a single investigator, limits its utility.

The usefulness of high-quality data is magnified when shared among multiple investigators so they can help make advances in their fields and offer evidence-driven solutions.

Fostering collaboration and data sharing is the impetus behind the new UW Data Collaborative (UWDC), initiated and hosted by the Center for Studies in Demography & Ecology in partnership with the Population Health Initiative, Urban@UW and the Student Technology Fee program.

At the heart of the data collaborative is a modern computing cluster that provides access to restricted data in a secure and computationally sophisticated environment. UW Information Technology (UW-IT) plays a major role by providing infrastructure and services so UWDC can operate safely and smoothly.

Sensitive data is often expensive and difficult to share

The Center, through years of trusted research partnerships, has been asked to streamline the process of hosting datasets that are highly restricted and difficult to access. This type of data includes high-security confidential and proprietary data with information about individuals, households, communities and businesses culled from sources such as Gallup Micro-Level Polling Data, Medicare records, real estate transactions, and large-scale federally funded surveys, such as the Adolescent Health Survey.

Researchers use these datasets to investigate human migrations and settlements, environments and populations, as well as their health and wellbeing. Often, the datasets are expensive to acquire, with some costing tens of thousands of dollars, well out of reach for many single investigators. If only one researcher uses a particular dataset, the opportunity to use it for other, more extensive research is wasted.

The data collaborative seeks to broaden the potential impact of these datasets by making them more widely available to the UW research community — without compromising their security and confidentiality.

“The launch of the collaborative was so exciting for me as a health services researcher,” said Tracy Mroz, assistant professor in the School of Medicine. “We now have a critical resource that will enhance my research program in multiple ways. Having a secure platform for large, sensitive datasets is a must for the work I do, and the state-of-the-art infrastructure of the UWDC will streamline the process for securing and using these datasets.”

Another plus, said Mroz, is that “partnering with the UWDC helps demonstrate our capacity for this type of research on grant proposals and increase the likelihood of external funding.”

UW-IT helps demography center build secured but accessible computing environment

Alan Li and Matt Weatherford, research IT staff with the Center, built the infrastructure that supports the data collaborative’s mission. An early design goal was to use as many off-the-shelf UW-IT services as possible.

“There was no way we could have built the infrastructure from scratch on the limited unit-level resources we have,” Weatherford said. So, they sought external funding and integrated many services from UW-IT.

In the last decade, UW-IT has invested heavily in infrastructure and software to support research at the UW. Many of its offerings have allowed multiple groups to collaborate and access their work more easily without having to reinvent the IT wheel.

The UWDC shows how building collaborative infrastructure, particularly by leveraging existing technology widely offered by UW-IT, can help strengthen the UW research community, said Brad Greer, associate vice president and chief technology officer.

“The UWDC is built on two critical UW research infrastructure IT principles — solid enterprise infrastructure and solid security,” Greer said.

Mroz agrees. “Housing data with the UWDC creates efficiencies by removing the burden from individual departments to build infrastructure and purchase data,” she said. “It encourages collaboration through data sharing.”