How to request datasets from dbGaP and other federal repositories

Federal Data Repositories

Security and operational standards must be in place before you can use controlled-access data from a federal repository. You may need to find an IT environment that meets these standards before accessing the data. You can request assistance via the UWIT intake form (UW NetID required).
For NIH Controlled-Access Data Repositories, review the:

dbGaP Overview

The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, and molecular diagnostic assays, as well as studies of association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools for generating the massive amount of genotypic data required to make these analyses possible.

dbGaP provides two levels of access – open and controlled – to allow broad release of non-sensitive data, while providing oversight and investigator accountability for sensitive data sets involving personal health information.

Before you begin a Data Access Request

Review the following questions and guidance.

  • Review UWIT guidance on Computing for restricted access data.
  • Identify whether you will use an existing UW environment, set up a new secure environment through UW, or use a third-party environment with UWIT approval. In some cases, it may be possible to use an NIH-hosted environment (e.g., AnVIL or BioData Catalyst).
  • Consult with the authorized IT Director for the environment you plan to use. If you will be using an NIH-hosted environment, please reach out to your department’s IT administrator.

If you are working with controlled-access data:

Start your Data Access Request (DAR)

After reviewing the previous guidance, follow these steps to begin your DAR.

  1. Choose datasets you wish to access.
  2. Select the Signing Official (the authorized official).
    • Your OSP reviewer will update the Signing Official to themselves after they receive the accompanying SAGE request. See steps to Prepare your Request in SAGE to OSP.
    • In the DAR, list the authorized IT Director who has firsthand knowledge of the IT environment you intend to use. This is the same person who signs the IT Director Confirmation.
  3. If using a Cloud Computing IT Environment (UW Government Community Cloud or UW GCC), upload the UW Cloud Computing IT Environment Statement into the DAR.
  4. Read the attestation language.
  5. Add other necessary attachments required by NIH, such as IRB Approval.
  6. Read and agree to the terms and conditions as the “Approved User”:
    • Investigators and their institutions are responsible for safeguarding the accessed datasets. Pay close attention to the Data Use Certification (DUC) being made by you as an Approved User.
  7. Review and approve the Data Access Request so that it begins routing to the Signing Official.
  8. Download a copy of the DAR, then proceed with next steps to prepare your SAGE request to OSP.

Prepare your SAGE Request to OSP

The type of SAGE request depends on whether your DAR is associated with an existing sponsored program. If it is associated with a sponsored program, route an OSP & GCA Modification Request (MOD) in SAGE.

If it is not associated with a sponsored program, route a Non-award Agreement (NAA) eGC1.

  1. Prepare and route the request in SAGE
    • For a MOD, select the “Federal Repository Data Access and Submission” subcategory.
    • For an eGC1, select the “Non-Award Agreement (new)” application type if you are requesting access to new data, or “Non-Award Agreement (continuation)” if it is a renewal request for data you have already been using.
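  2. Attach the required documents to your SAGE request, including the signed IT Director confirmation statement and the assurance statement signed by the Approved User (see Signing Official (OSP) Review).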
  3. OSP will review the Award Modification Request (MOD) or NAA eGC1 together with the DAR in eRA Commons.
  4. Check the status on the “My Requests” page in eRA Commons.
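
The routing rule in this section can be summarized as a small decision function. The sketch below is illustrative only: `SageRequest` and `choose_sage_request` are hypothetical names, not part of SAGE or eRA Commons, and the only values it uses are the request types and selections named in the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SageRequest:
    """Hypothetical record of which SAGE item to route (not a SAGE API object)."""
    item: str       # "MOD" or "NAA eGC1"
    selection: str  # subcategory (for a MOD) or application type (for an eGC1)


def choose_sage_request(has_sponsored_program: bool,
                        is_renewal: bool = False) -> SageRequest:
    """Pick the SAGE request type for a DAR, following the guidance above."""
    if has_sponsored_program:
        # DAR is tied to an existing sponsored program: route a MOD.
        return SageRequest("MOD", "Federal Repository Data Access and Submission")
    if is_renewal:
        # No sponsored program, renewing access to data already in use.
        return SageRequest("NAA eGC1", "Non-Award Agreement (continuation)")
    # No sponsored program, requesting access to new data.
    return SageRequest("NAA eGC1", "Non-Award Agreement (new)")
```

For example, `choose_sage_request(has_sponsored_program=False, is_renewal=True)` selects an NAA eGC1 with the "Non-Award Agreement (continuation)" application type.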

Signing Official (OSP) Review

As Signing Official, OSP checks that:

  • The DAR is complete.
  • An authorized IT Director is identified.
  • A signed confirmation statement from the IT Director is attached in SAGE to the NAA eGC1 or Award Modification.
  • An assurance statement signed by the Approved User is attached to the SAGE item.
  • If the IT environment used is “GCC High”, the PI has uploaded the UW Cloud Computing IT Environment Statement in the DAR.
  • IRB approval, if needed, is attached to the DAR and corresponds to the study in question.
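
Investigators who want to self-check a request before routing it could mirror the review items above in a simple checklist. This is a minimal sketch, not an OSP tool: `DarPackage` and `missing_items` are hypothetical names, each flag is something you would set by hand, and the check that the IRB approval matches the study still requires human judgment.

```python
from dataclasses import dataclass


@dataclass
class DarPackage:
    """Hypothetical self-check record; every field is entered manually."""
    dar_complete: bool
    it_director_identified: bool
    it_director_confirmation_in_sage: bool
    assurance_statement_in_sage: bool
    uses_gcc_high: bool = False
    cloud_statement_in_dar: bool = False
    irb_needed: bool = False
    irb_approval_in_dar: bool = False


def missing_items(pkg: DarPackage) -> list[str]:
    """Return the review items that are not yet satisfied."""
    missing = []
    if not pkg.dar_complete:
        missing.append("DAR is complete")
    if not pkg.it_director_identified:
        missing.append("Authorized IT Director is identified")
    if not pkg.it_director_confirmation_in_sage:
        missing.append("Signed IT Director confirmation attached in SAGE")
    if not pkg.assurance_statement_in_sage:
        missing.append("Signed assurance statement attached in SAGE")
    # The cloud statement is only required when GCC High is the environment.
    if pkg.uses_gcc_high and not pkg.cloud_statement_in_dar:
        missing.append("UW Cloud Computing statement uploaded in the DAR")
    # IRB approval is only checked when the study needs it.
    if pkg.irb_needed and not pkg.irb_approval_in_dar:
        missing.append("IRB approval attached to the DAR")
    return missing
```

An empty list means every item in the checklist is marked as satisfied; it does not replace the Signing Official's review.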