NIH Data Management and Sharing Policy

NIH Policy for Data Management and Sharing

The new NIH Policy for Data Management and Sharing (NOT-OD-21-013) goes into effect on 2023-01-25. It applies to all grants that create scientific data, which includes:

  • Research projects
  • Some career development awards (Ks)
  • Small business SBIR/STTR
  • Research centers

The policy requires that:

  • Grant applications include a 2-page data management and sharing plan.
  • Researchers "maximize the appropriate sharing of scientific data."

There are several specific details to note about the new NIH policy:

  • Researchers should share grant findings, "regardless of whether a publication is produced."
  • Data should be shared in a repository.
  • Sharing will happen sooner: "shared scientific data should be made accessible as soon as possible, and no later than the time of an associated publication, or the end of performance period, whichever comes first."
  • Researchers can ask for money to pre-pay long-term storage costs and for data management activities.
  • If a DMSP needs to be revised, it should be updated and reviewed during regular progress reports.

NIH has a dedicated webpage, https://sharing.nih.gov/, containing information about the policy and a list of NIH-supported data repositories. If you have questions about the policy or where to share your research data, please contact library@caltech.edu.

Data Management and Sharing Plans (DMSPs)

Plan Format and Contents

NIH has a recommended (but not required) template for creating a DMSP, which is available here.

The Library created a handout of Caltech guidance for creating a DMSP.

The Library developed an example NIH DMSP based on the general template; please refer to your funding opportunity and ICO, as DMSP requirements may differ.

Here is a general DMSP checklist to ensure that your NIH DMSP contains all of the necessary elements.

Genomic Data Sharing

Please note that the NIH Genomic Data Sharing (GDS) Policy is being incorporated into the Data Management and Sharing Policy. Researchers are still expected to share genomic data in the ways outlined by the GDS Policy, but plans for genomic data sharing should be written into the DMSP instead of a separate GDS Plan.

Budgeting for Data Management

Where data management costs will be included in the grant, be sure to include them in both the budget and budget justification. Allowable costs include:

  • Curating data and developing supporting documentation
  • Local data management considerations
  • Preserving and sharing data through established repositories

More information on allowable costs is available in NIH Notice NOT-OD-21-015.

Adding data management costs into the grant application:

  • For line-item budgets, report costs as "Data Management and Sharing Costs" in the R&R Budget Form: line item in section F. Other Direct Costs, and provide justification in R&R Budget Form: section L. Budget Justification.
  • For modular budgets, describe "Data Management and Sharing Costs" and their justification in PHS 398 Modular Budget Form: within Additional Narrative Justification.

Reviewing DMS Plans

Initially, grant reviewers will NOT review DMSPs except where sharing is "integral to the funding opportunity". Reviewers will see and can comment on budget justifications for data management and sharing costs. Program officers will review DMSPs and can ask for revisions to be made to DMSPs through the Just-in-Time process.

The Library is happy to review DMSPs prior to grant submission. Please email us at data@caltech.edu.

Sharing Data

Picking a Repository

If there is a repository where your data is expected to be shared, or a subject repository that is a good match for your data, plan to share your data in that repository. NIH maintains a list of NIH-supported data repositories, which is a good starting point for choosing a subject repository.

If there is no obvious subject repository, Caltech Library offers a free data sharing service, CaltechDATA at https://data.caltech.edu, that accepts any type of data associated with Caltech projects. CaltechDATA offers standard data preservation and DOI (permanent identifier) services. For data > 500 GB, please see the section below on "Sharing Large Data".

Other repositories are available, including generalist repositories and other subject repositories. Contact us at data@caltech.edu and we will assist you in finding the best match for your data.
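The repository guidance on this page can be sketched as a small decision helper. This is only an illustration of the order of checks described here, not an official tool: the function name, parameters, and returned labels are hypothetical, and the human-subjects check reflects the note elsewhere on this page that CaltechDATA does not accept human-subjects data.

```python
FREE_LIMIT_GB = 500  # CaltechDATA free storage threshold stated on this page


def suggest_repository(has_expected_or_subject_repository: bool,
                       is_caltech_project: bool,
                       size_gb: float,
                       human_subjects: bool = False) -> str:
    """Return a rough suggestion label (hypothetical helper, not an official rule)."""
    # An expected repository or a well-matched subject repository comes first.
    if has_expected_or_subject_repository:
        return "expected or subject repository"
    # CaltechDATA does not accept human-subjects data.
    if human_subjects:
        return "access-restricted repository (contact data@caltech.edu)"
    # CaltechDATA accepts data associated with Caltech projects.
    if is_caltech_project:
        if size_gb > FREE_LIMIT_GB:
            return "CaltechDATA (contact data@caltech.edu first)"
        return "CaltechDATA"
    # Otherwise ask the Library for help finding a generalist or subject repository.
    return "contact data@caltech.edu"


print(suggest_repository(False, True, 120))
```

The labels are deliberately coarse; the Library can help with cases the sketch does not cover.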

Timing for Sharing

Data associated with a publication should be made available at the time of first publication, whether electronic or in print. Preprints do not count as publications for this NIH requirement.

If not already shared, all data supporting grant findings should be made available by the end of the grant performance period. No-cost extensions push back this deadline for sharing.
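The "whichever comes first" rule can be worked through with a short sketch. It assumes the deadline is the earlier of the first publication date and the end of the performance period, with a no-cost extension moving the period end later. The function and parameter names are hypothetical, and the dates are made-up examples.

```python
from datetime import date
from typing import Optional


def sharing_deadline(performance_period_end: date,
                     publication_date: Optional[date] = None,
                     no_cost_extension_end: Optional[date] = None) -> date:
    """Latest date by which data should be shared (illustrative sketch only)."""
    # A no-cost extension pushes back the end of the performance period.
    period_end = performance_period_end
    if no_cost_extension_end is not None:
        period_end = max(period_end, no_cost_extension_end)
    # With no associated publication, the end of the performance period applies.
    if publication_date is None:
        return period_end
    # Otherwise the earlier date wins. Pass the first publication date here,
    # not a preprint date, since preprints do not count as publications.
    return min(publication_date, period_end)


# Paper published before the grant ends: the publication date is the deadline.
print(sharing_deadline(date(2027, 6, 30), publication_date=date(2026, 3, 1)))  # 2026-03-01
```

Data should still be shared as soon as possible; the date returned is only the outer limit.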

Sharing Data Derived from Human Participants

The NIH DMS Policy expects researchers to responsibly share data derived from human participants. Management and sharing of American Indian/Alaska Native participant data has additional guidance, detailed in this supplemental information.

NIH has the following best practices for protecting participant privacy when sharing data:

  1. Apply appropriate de-identification.
  2. Establish scientific data sharing and use agreements. 
  3. Understand and communicate legal protections against disclosure and misuse.

Even with de-identification, data may need to be shared in a controlled way. NIH provides supplemental information, "Protecting Privacy When Sharing Human Research Participant Data", with more detail.

Please note that CaltechDATA does not accept human-subjects data, which should instead be shared in a data repository with access restrictions.

We encourage researchers to be in contact with Caltech IRB about complying with the new NIH policy in a way that protects participant privacy.

Sharing Large Data

CaltechDATA provides researchers with 500 GB of free storage. CaltechDATA can accommodate larger volumes of data, but please reach out to Caltech Library first at data@caltech.edu to ensure we can meet your needs. While Caltech Library intends for all data uploaded to CaltechDATA to be available in perpetuity, charges for large data storage are based on the length of time that we guarantee data will be available.

Caltech Library will work with your research group to determine which storage options are the best fit for your project. All storage charges are one-time charges and are generally processed when the data are uploaded and the supporting grant is still active. There are a number of potential models; here are some examples.

  • Primary data files are stored utilizing the research group’s Caltech HPC storage allocation. Groups with large storage volumes pay a one-time $62.50 / TB charge to cover offsite backups for 5 years.
  • The group pays a one-time $300 / TB upload fee, which covers all storage costs for 5 years.
  • Files are stored on a storage allocation utilizing national resources such as ACCESS (https://access-ci.org/), with offsite backups paid for utilizing AWS credits from CloudBank (https://www.cloudbank.org/). There is often no additional charge associated with these storage allocations, but they may only be available for certain types of data or grants. These options have a separate application and approval process outside of Caltech Library.
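For budgeting, the two per-terabyte rates listed above can be compared with a short calculation. This is a rough sketch, not a quote: it assumes charges scale linearly with data volume, including fractional terabytes, and that the free 500 GB is not deducted. Neither assumption is stated on this page, so confirm actual charges with data@caltech.edu. The constant and function names are hypothetical.

```python
from decimal import Decimal

# One-time rates listed above, each covering 5 years.
BACKUP_RATE_PER_TB = Decimal("62.50")  # offsite backups for data on the group's HPC allocation
UPLOAD_RATE_PER_TB = Decimal("300")    # upload fee covering all storage costs


def one_time_charge(terabytes, rate_per_tb: Decimal) -> Decimal:
    """Estimate a one-time charge in dollars, assuming simple linear pricing."""
    # Convert through str so float inputs such as 2.5 become exact Decimals.
    return (Decimal(str(terabytes)) * rate_per_tb).quantize(Decimal("0.01"))


volume_tb = 4  # example volume, made up for illustration
print("HPC allocation + offsite backups:", one_time_charge(volume_tb, BACKUP_RATE_PER_TB))
print("Upload fee model:", one_time_charge(volume_tb, UPLOAD_RATE_PER_TB))
```

An estimate like this can feed the "Data Management and Sharing Costs" line and its budget justification.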