NDAR Overview

Data Definition

Standards

Subject GUIDs | Clinical Assessments | Imaging | Genomics | Data Sharing Regimen

NDAR has established a foundation of initial standards and conventions for data sharing within the ASD research community. As a research portal, NDAR provides a platform for investigators to define, refine, and standardize data definitions for autism research.

NDAR has established the following special interest groups for community involvement:

  • The ASD Data Dictionary Group works through the requirements and procedures needed to support a community-wide ASD data dictionary.
  • The ASD Data Managers Group works through requirements and procedures for data submission to NDAR.

Please contact us at ndar@mail.nih.gov to join a special interest group. Meeting announcements will be sent to your e-mail inbox.


Subject GUIDs

One of the most important standards NDAR supports is the NDAR Global Unique Identifier (GUID). The GUID is a universal subject ID that allows researchers to share data specific to a study participant without exposing personally identifiable information (PII). The GUID has been approved by the NIH Office of General Counsel.


The GUID system was conceived by the Simons Foundation Autism Research Initiative (SFARI). The GUID software was designed, developed and tested in close collaboration between SFARI and NDAR project teams.

The system is implemented as an NDAR Web service; an investigator inputs identifying information about a participant into a client application and sends encrypted information to a server application, which then returns a GUID.

Generic unique identifiers have the potential to link collections of research data, augment the amount and types of data available for individuals, support detection of overlap between collections and facilitate replication of research findings.

Four pieces of identifying information must be collected from each study participant to generate a valid GUID:

  • Full legal name of participant at birth (as it appears on the birth certificate)
  • Participant date of birth
  • Gender
  • Town or municipality of birth (as it appears on the birth certificate)

The GUID is generated using a free software application installed at the research site. The four items from the birth certificate are encrypted into a hash code, which is then transmitted to NDAR. NDAR then encrypts the hash code to generate a GUID and sends it back to the research site for use. The personally identifiable information (PII) about each participant remains at the research site.
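The client-side step described above can be sketched roughly as follows. This is an illustration only: this page does not specify the hashing algorithm or how fields are normalized, so SHA-256, the normalization rules, and the function names `normalize` and `pii_hash` are all assumptions, not the behavior of the actual GUID software.

```python
import hashlib
import unicodedata
from datetime import date


def normalize(text: str) -> str:
    """Strip accents, uppercase, and collapse whitespace so trivial variants match."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_only.upper().split())


def pii_hash(full_name: str, birth_date: date, gender: str, birth_city: str) -> str:
    """Return a one-way hex digest of the four identifying fields.

    Assumption: SHA-256 over normalized, separator-joined fields. The real
    GUID client may use a different algorithm and different normalization.
    """
    fields = [
        normalize(full_name),
        birth_date.isoformat(),
        normalize(gender),
        normalize(birth_city),
    ]
    # The unit separator keeps field boundaries unambiguous when joined.
    joined = "\x1f".join(fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Only this digest, never the raw fields, would be sent to the server.
    digest = pii_hash("Jane Ann Doe", date(2004, 5, 17), "F", "Springfield")
    print(digest)
```

Normalizing before hashing means that the same participant entered with different capitalization or spacing at two sites would still produce the same hash code, which is what allows records to be linked without sharing PII.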


Clinical Assessments

NDAR supports an unlimited number of clinical, demographic, and phenotypic data elements associated with ASD human subjects research. Although clinical data are much smaller in size than rich data types like imaging and genomics, defining them consistently is especially important if data are to be aggregated across projects and data repositories. Some of NDAR's primary tools for this purpose are:

NDAR GUID — A common research subject identifier to be used across all research projects is essential for data aggregation. The NDAR GUID provides this capability.

NDAR Data Dictionary — Working with the community, NDAR has defined over 200 clinical, imaging, and genomic research data structures that are now being used in autism research including the common measures used by the NIH funded Autism Centers of Excellence (ACE). These measures include:

Researchers are encouraged to extend these definitions using the NDAR data dictionary tool, which is available within the NDAR Portal. Instructions on using the NDAR data dictionary to support investigator-defined data are given in the Data Dictionary Video Tutorial.

Data Validation Tool — Before NDAR accepts any data, the data must pass the formatting, value range, and intra-field validation checks defined in the NDAR Data Dictionary. The Validation Tool (Production Environment or Demonstration Environment) is publicly available. It allows researchers to check their data and fix any discrepancies before submitting the data to NDAR. Refer to the video on Preparing Clinical Data for Submission to learn more about using the Validation Tool. NDAR supports clinical data submissions in tab-delimited, comma-separated value (CSV), or XML format. Sample data in these formats, compatible with NDAR's Demonstration Portal, are available.
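To give a feel for what formatting and value-range checks involve, here is a minimal sketch of a pre-submission check on delimited data. It is not the NDAR Validation Tool: the `DICTIONARY` entries, element names, ranges, and the `validate` function are hypothetical, and real NDAR data structures define their own elements and rules.

```python
import csv
import io
from typing import List

# Hypothetical dictionary entries; real NDAR data structures define their own elements.
DICTIONARY = {
    "subject_id": {"type": str},
    "age_months": {"type": int, "min": 0, "max": 1200},
    "gender": {"type": str, "allowed": {"M", "F"}},
}


def validate(text: str, delimiter: str = ",") -> List[str]:
    """Return a list of human-readable problems found in delimited text."""
    errors = []
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for name, rule in DICTIONARY.items():
            raw = (row.get(name) or "").strip()
            if not raw:
                errors.append(f"line {line_no}: {name} is missing")
                continue
            try:
                value = rule["type"](raw)  # formatting check
            except ValueError:
                kind = rule["type"].__name__
                errors.append(f"line {line_no}: {name}={raw!r} is not a valid {kind}")
                continue
            if "min" in rule and not rule["min"] <= value <= rule["max"]:
                errors.append(f"line {line_no}: {name}={value} is out of range")
            if "allowed" in rule and value not in rule["allowed"]:
                errors.append(f"line {line_no}: {name}={value!r} is not an allowed value")
    return errors


if __name__ == "__main__":
    sample = "subject_id,age_months,gender\nS1,36,M\nS2,-4,X\n"
    for problem in validate(sample):
        print(problem)
```

Passing `delimiter="\t"` applies the same checks to a tab-delimited file.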


For general information on preparing data for submission to NDAR, please follow the NDAR Data-Sharing Checklist.


Imaging

NDAR supports the receipt of unprocessed brain images in DICOM format. NDAR also supports processed images in a variety of formats including DICOM, MINC 1.0 and 2.0, Analyze, NIfTI-1, AFNI and SPM. If you are using a different file format, please contact ndarhelp@mail.nih.gov so that we can add it to our list of supported standards.

To submit imaging data to NDAR, researchers are required to run a component of the MIPAV (Medical Image Processing, Analysis, and Visualization) application. Using the MIPAV Component for NDAR, you can quickly prepare your image data for submission by following the steps below.

  1. Place each of your images into a common directory accessible by your computer.
  2. Select the link: http://mipav.cit.nih.gov/mipav_ndar_prod_jws.php. This link launches a Java Web Start application. A recent version of the Java Runtime Environment (JRE) is needed to run the application.
  3. Select Add files.
  4. Select the files you would like to submit and then select Open.
  5. Enter the appropriate metadata. Age in months, date of interview, extent, GUID, dimensions, and site subject ID are required. You may need to scroll down to access all entries.
  6. Add or remove files, select the output directory, and select Finish when the metadata items are complete for all files.
  7. Finish generates the files necessary for validation. These include:
    • The compressed image
    • A JPG of the image, which lets others easily see the type of image in NDAR's query tool result set.
    • An XML file of the data structure. This structure is queryable within the NDAR query tool. The format of the standard NDAR image structure is defined at DataStructures.go?short_name=image01.

  8. Validate the results using the NDAR Validation Tool.
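Before entering metadata in step 5, it can help to confirm that every image has the six required items on hand. The sketch below is one possible way to do that outside of MIPAV. The key names in `REQUIRED_FIELDS` and both functions are hypothetical and do not correspond to MIPAV or NDAR field identifiers.

```python
from typing import Dict, List

# Hypothetical keys for the six required items listed in step 5.
REQUIRED_FIELDS = (
    "age_in_months",
    "date_of_interview",
    "extent",
    "guid",
    "dimensions",
    "site_subject_id",
)


def missing_fields(record: Dict[str, object]) -> List[str]:
    """Return the required items that are absent or blank in one record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def incomplete_records(records: Dict[str, Dict[str, object]]) -> Dict[str, List[str]]:
    """Map each image file name to its missing items, skipping complete records."""
    report = {}
    for file_name, record in records.items():
        missing = missing_fields(record)
        if missing:
            report[file_name] = missing
    return report


if __name__ == "__main__":
    batch = {
        "scan_001.dcm": {
            "age_in_months": 48,
            "date_of_interview": "2011-03-02",
            "extent": "256x256x128",
            "guid": "EXAMPLE_GUID",  # placeholder, not a real GUID
            "dimensions": 3,
            "site_subject_id": "A17",
        },
        "scan_002.dcm": {"age_in_months": 52, "guid": ""},
    }
    print(incomplete_records(batch))
```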


NDAR supports multiple imaging data structures.
For questions or more information on working with the NDAR image submission tool, please contact us at ndarhelp@mail.nih.gov.


Genomics

NDAR piloted genomic data submission in July 2010 using the Minimal Information About a Microarray Experiment (MIAME) format. The results of the pilot indicated that genomics experiments must be defined more precisely before genomics data can be queried and aggregated across research projects and repositories.

Since January 2011, NDAR has provided a genomics definition tool for investigators to use for raw genomics data submission. The tool is designed to make genomics submissions very simple. The steps for genomics submission are:

  1. Log in to NDAR, select your Collection, and choose Edit Collection.
  2. Find the Supporting Documentation section and choose Create Experiment to enter the Genomics Experiment Definition (GED) tool.
  3. Give your Experiment a name and select the nine (9) attributes specific to your experiment:
    1. Molecule and Sub-molecule
    2. Experiment Technology
    3. Vendor and Platform
    4. Extraction Protocols and Kits
    5. Processing Protocols and Kits
    6. Analysis Software
    7. Equipment
    The available values for each attribute are presented for you to choose from. NDAR will curate these selections. If a value that applies to your research is not provided, simply type in the entry and we will add it.
  4. Once the experiment is saved, you will be returned an Experiment_ID, which is needed for your genomics submission package.
  5. After completing the experiment definition, prepare the two (2) files required for package submission:
    1. Genomic_Sample — Experiment ID needs to be included along with links to the absolute paths of the genomic files.
    2. Genomic_Subject — Definition of GUIDs and other information about the subject.
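As an illustration of step 5, the sketch below builds tab-delimited rows that pair an Experiment ID with absolute paths to genomic files. The column names (`experiment_id`, `sample_id`, `data_file`) and both functions are hypothetical. The actual Genomic_Sample structure is defined in the NDAR Data Dictionary and should be consulted for the real element names.

```python
import csv
import io
import os
from typing import Dict, List

# Hypothetical column names; consult the NDAR Data Dictionary for the real ones.
COLUMNS = ["experiment_id", "sample_id", "data_file"]


def build_sample_rows(experiment_id: int, sample_files: Dict[str, str]) -> List[dict]:
    """Pair each sample with the Experiment ID and an absolute file path."""
    rows = []
    for sample_id, file_path in sample_files.items():
        rows.append(
            {
                "experiment_id": experiment_id,
                "sample_id": sample_id,
                # Relative paths are resolved against the current directory.
                "data_file": os.path.abspath(file_path),
            }
        )
    return rows


def to_tab_delimited(rows: List[dict]) -> str:
    """Serialize rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = build_sample_rows(7, {"S-001": "reads/s001.fastq"})
    print(to_tab_delimited(rows), end="")
```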


For any questions about this process, contact NDAR at ndarhelp@mail.nih.gov.

Please note that Excel files must be saved as tab-delimited, CSV, or XML files to create an NDAR submission package.


Data Sharing Regimen

NDAR has outlined a schedule that includes separate timelines for "descriptive data" and "experimental data," as defined in the [Data Sharing Policy]. This policy is included in the terms and conditions of most ASD-related grant awards. Please contact ndarhelp@mail.nih.gov if more specific guidance is needed.



This page was last updated: Jul 12, 2011