Submission of Data to dbGaP and Genomic Data Sharing

This guidance addresses the NIH Genomic Data Sharing Policy (GDS Policy). It applies to researchers proposing to submit data from studies that generate large-scale human genomic data, including genome-wide association studies (GWAS), to the National Institutes of Health (NIH) data repository dbGaP (database of Genotypes and Phenotypes) or to another NIH GDS repository (e.g., Gene Expression Omnibus, Sequence Read Archive, or GenBank).

NIH requires all dbGaP and GDS submissions to include review by an IRB and institutional certification. The steps on this page describe the information needed for IRB review and institutional certification of new studies, existing (open) studies, and closed studies.

If approved, the completed institutional certification document will be sent to the principal investigator with the IRB approval letter. The study team can then work with the Office of Sponsored Programs and/or the Office of Innovation and Economic Development to execute the appropriate data and/or materials transfer agreements.

New Studies

The initial submission should include the researcher’s plan to submit genomic data (e.g., to dbGaP). Institutional certification will be granted as part of the initial IRB review. A copy of the certification document should be uploaded with the initial submission materials in the “Other Files/Comments” section in Buck-IRB.

The request should include the following:

Initial Protocol

Protocols should include the following information (preferably as a separate section):

  • Summary of the intent to contribute data to dbGaP or another NIH data repository
  • Specific sources of the data to be submitted (e.g., all participants in the study, a specific subset of individuals, participants from all sites)
  • List of the genotypic data that will be provided
  • List of the phenotypic data that will be provided (as applicable)
  • Statement of the proposed restrictions (if any) for access
    • Shared information will be available to for-profit entities unless restricted.
    • Aggregate information will be made available without restriction unless designated as sensitive in the institutional certificate.
  • Plan for removing identifiers from the data to be provided
    • The identities of research participants cannot be disclosed to NIH data repositories; only coded data with all 18 HIPAA identifiers removed will be accepted. The IRB must review researchers’ plans for data coding to determine the plan’s appropriateness for the specific dataset and to provide the assurances required by the institutional certification.
  • For multisite studies, a statement of whether Ohio State will be providing certification on behalf of all participating institutions, and if so, confirmation that the other institutions have agreed.
    • Certification must be provided for all sites contributing samples to dbGaP or another NIH genomic data repository. The lead site may submit one institutional certification on behalf of all collaborating sites. Alternatively, each site providing data may provide its own institutional certification.
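
As an illustration only of the kind of data-coding plan described in the list above, the following Python sketch drops direct-identifier fields from each record and replaces the participant identifier with a random study code. The field names, the `DIRECT_IDENTIFIER_FIELDS` set, and the `code_records` function are hypothetical and are not part of the NIH or Ohio State requirements. The sketch does not cover all 18 HIPAA identifiers, and any actual plan must be reviewed by the IRB.

```python
import secrets

# Hypothetical field names for illustration; a real dataset must be checked
# against all 18 HIPAA identifiers, not just the ones listed here.
DIRECT_IDENTIFIER_FIELDS = {"name", "mrn", "date_of_birth", "address", "phone", "email"}


def code_records(records, id_field="mrn"):
    """Return (coded_records, key).

    coded_records: copies of the input records with identifier fields removed
    and a random study code added.
    key: maps each study code back to the original identifier. The key stays
    with the study team and is never sent to the repository.
    """
    key = {}       # study code -> original identifier
    code_for = {}  # original identifier -> study code
    coded = []
    for record in records:
        original_id = record[id_field]
        if original_id not in code_for:
            code = "P-" + secrets.token_hex(8)
            while code in key:  # regenerate in the unlikely event of a collision
                code = "P-" + secrets.token_hex(8)
            code_for[original_id] = code
            key[code] = original_id
        # Keep only fields that are not listed as direct identifiers.
        cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIER_FIELDS}
        cleaned["study_code"] = code_for[original_id]
        coded.append(cleaned)
    return coded, key


# Made-up example records.
records = [
    {"mrn": "12345", "name": "Jane Doe", "genotype": "AA", "phenotype": "case"},
    {"mrn": "67890", "name": "John Roe", "genotype": "AG", "phenotype": "control"},
]
coded, key = code_records(records)
```

In this sketch only `coded` would be prepared for submission; `key` is what makes the data "coded" rather than anonymous, since it lets the study team locate a participant's records if that participant later withdraws.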

Consent Form

Study participants must be informed in the consent process that their study-related materials will be submitted to public, scientific databases such as dbGaP. Note: NIH expects that informed consent for future research use and broad data sharing will have been obtained even if the cell lines or clinical specimens are de-identified. NIH recommends that the data provided to dbGaP or another NIH data repository are coded and that participants are informed that they are free to withdraw their data from the NIH database (although data that have been distributed for approved research cannot be retrieved).

Existing (Open) Studies

An amendment should be submitted to request data deposition into dbGaP or another NIH data repository. The IRB will determine whether the proposal to submit data is consistent with the protocol and the consent form(s) signed by or information presented to research participants.

If the IRB determines that the consent form(s) and information submitted are not consistent with the proposal to submit data to dbGaP or another NIH data repository, the IRB may take one or more of the following actions:

  1. Request revision of the consent form(s) to be consistent with data submission to dbGaP or another NIH data repository
  2. Request contact and re-consent of research participants
  3. Request additional information, as necessary
  4. Determine that the request is not consistent with Ohio State human research protection program principles concerning data submission to dbGaP or another NIH genomic data repository

The request should include the following:

Amended Protocol

The protocol should be amended to include the following information:

  • Summary of the intent to contribute data to dbGaP or another NIH data repository
  • Specific sources of the data to be submitted (e.g., all participants in the study, a specific subset of individuals, participants from all sites)
  • List of the genotypic data that will be provided
  • List of the phenotypic data that will be provided (as applicable)
  • Statement of the proposed restrictions (if any) for access
    • Shared information will be available to for-profit entities unless restricted.
    • Aggregate information will be made available without restriction unless designated as sensitive in the institutional certificate.
  • Plan for removing identifiers from the data to be provided
    • The identities of research participants cannot be disclosed to NIH data repositories; only coded data with all 18 HIPAA identifiers removed will be accepted. The IRB must review researchers’ plans for data coding to determine the plan’s appropriateness for the specific dataset and to provide the assurances required by the institutional certification.
  • For multisite studies, a statement of whether Ohio State will be providing certification on behalf of all participating institutions, and if so, confirmation that the other institutions have agreed.
    • Certification must be provided for all sites contributing samples to dbGaP or another NIH genomic data repository. The lead site may submit one institutional certification on behalf of all collaborating sites. Alternatively, each site providing data may provide its own institutional certification.

Consent Form (for data previously collected)

To determine if the required assurances in the institutional certification can be made, the IRB must review all versions of the consent form signed by participants authorizing the researchers to collect participants’ data. If Ohio State will be providing certification on behalf of other participating institutions, consent forms used at these sites must also be reviewed. Previous consent form versions (or consent forms from other sites) should be uploaded as part of the amendment submission in the “Other Files/Comments” section in Buck-IRB.

Consent Form (for new participants)

If participant recruitment is ongoing, the consent form/process should be revised to inform new participants that their study-related data will be submitted to public, scientific databases such as dbGaP. NIH recommends that the data provided to dbGaP or another NIH data repository are coded and that participants are informed that they are free to withdraw their data from the NIH database (although data that have been distributed for approved research cannot be retrieved).

Closed Studies

Because amendments are not possible for closed studies, researchers should submit requests and materials for data sharing directly to ORRP staff, who will facilitate IRB review of the request. The IRB will determine whether the proposal to submit data is consistent with the protocol and the consent form(s) signed by research participants. If the IRB determines that the consent form(s) and information submitted are not consistent with the proposal to submit data to dbGaP or another NIH data repository, the IRB may take one or more of the following actions:

  1. Request contact and re-consent of research participants (this may necessitate reopening the study)
  2. Request additional information, as necessary
  3. Determine that the request is not consistent with Ohio State human research protection program principles concerning data submission to dbGaP or another NIH genomic data repository

The request should include the following:

Cover Letter

Provide a cover letter that includes the following information:

  • Summary of the intent to contribute data to dbGaP or another NIH data repository
  • Specific sources of the data to be submitted (e.g., all participants in the study, a specific subset of individuals, participants from all sites)
  • List of the genotypic data that will be provided
  • List of the phenotypic data that will be provided (as applicable)
  • Statement of the proposed restrictions (if any) for access
    • Shared information will be available to for-profit entities unless restricted.
    • Aggregate information will be made available without restriction unless designated as sensitive in the institutional certificate.
  • Plan for removing identifiers from the data to be provided
    • The identities of research participants cannot be disclosed to NIH data repositories; only coded data with all 18 HIPAA identifiers removed will be accepted. The IRB must review researchers’ plans for data coding to determine the plan’s appropriateness for the specific dataset and to provide the assurances required by the institutional certification.
  • For multisite studies, a statement of whether Ohio State will be providing certification on behalf of all participating institutions, and if so, confirmation that the other institutions have agreed.
    • Certification must be provided for all sites contributing samples to dbGaP or another NIH genomic data repository. The lead site may submit one institutional certification on behalf of all collaborating sites. Alternatively, each site providing data may provide its own institutional certification.

Protocol

Submit a copy of the original protocol.

Consent Form (for data previously collected)

Provide all consent documents used. To determine if the required assurances in the institutional certification can be made, the IRB must review all versions of the consent form signed by participants authorizing the researchers to collect participants’ data. If Ohio State will be providing certification on behalf of other participating institutions, consent forms used at these sites must also be reviewed.

Institutional Certification

The institutional certification includes the following assurances:

  • The institution approves of the submission to the NIH data repository.
  • The proposed data submission is consistent, as appropriate, with applicable local, Ohio, tribal, and federal laws and regulations, as well as institutional policies.
  • The research uses of the data, as well as uses that are specifically excluded by the study and the informed consent documents, are described.
  • An Institutional Review Board and/or Privacy Board (or equivalent body), as applicable, has reviewed the investigator’s proposal for data submission and assures that:
    • The protocol for the collection of genomic and phenotypic data is consistent with 45 CFR Part 46.
    • Data submission and subsequent data sharing for research purposes are not inconsistent with the informed consent of study participants from whom the data were obtained.
    • Consideration was given to risks to individual participants and their families associated with data submitted to NIH-designated data repositories and subsequent sharing.
    • To the extent relevant and possible, consideration was given to risks to groups or populations associated with submitting data to NIH-designated data repositories and subsequent sharing.
    • The investigator’s plan for de-identifying datasets is consistent with the standards outlined in the NIH genomic data sharing policy and/or dbGaP guidelines.

Note: Certification must be provided for all sites contributing samples to dbGaP or another NIH data repository. The lead site may submit one institutional certification on behalf of all collaborating sites. Alternatively, each site providing data may provide its own institutional certification. For more information or to obtain institutional certification templates, see Institutional Certifications on the NIH Genomic Data Sharing website.