What is the CMR?

The Cohort Metadata Repository (CMR) is a tool that documents data harmonization across cohorts. Variables from each cohort can be searched and compared to determine whether harmonization is possible. Once harmonization has occurred, the harmonized variables and the specifications used to create them are also documented in the CMR. The CMR contains only metadata (variable names, formats, codes, descriptions) and no individual-level data.


The purpose of this tool is to facilitate data harmonization across participating cohorts and to document harmonization decisions in an accessible way. Centralizing this information should reduce the labor-intensive effort of harmonization for future pooling projects.

Information Included

    • Brief descriptions of cohort data collection to provide context for the metadata.
    • Detailed metadata for cohort variables and harmonized variables, along with the algorithms used to create the harmonized variables.

For more detailed descriptive information about participating cohorts, please go to the Cancer Epidemiology Descriptive Cohort Database (CEDCD).

Accessing the Site

If you have questions, please contact us. To request access to the CMR, please request a New Account.

Citing the CMR in Publications and Presentations

Documenting use of the CMR in publications helps the National Cancer Institute (NCI) show and maintain support for the repository. Please cite the CMR as follows: The Cohort Metadata Repository (CMR)-2016. Bethesda, MD: National Cancer Institute.

You may also consider including a sentence in your manuscript that acknowledges use of the CMR. For example: Data harmonization was facilitated using information available in the Cohort Metadata Repository (CMR), developed by the National Cancer Institute, Bethesda, MD.
