NDAR provides a single point of access to de-identified autism research data. For permission to download data, you will need an NDAR account with approved access to NDAR or a connected repository (AGRE, IAN, or the ATP). For NDAR access, you need to be a research investigator sponsored by an NIH-recognized institution with a Federalwide Assurance. See Request Access for more information.

NDAR GUID

GUID Training

The NDAR GUID is a universal subject ID that allows researchers to share data specific to a study participant without exposing personally identifiable information (PII) and makes it possible to match participants across labs and research data repositories. NDAR is designed to use the GUID, which is the subject ID standard in autism research. Every data structure in the autism Data Dictionary includes this identifier (called subjectkey). Additionally, the GUID is used to associate subjects to cohorts allowing a researcher to link publications directly to raw/analyzed data in NDAR (see NDAR Study).

Creating a GUID requires an individual's Legal Name at Birth, Date of Birth, Sex, and City/Municipality of Birth. It is very important to enter the information exactly as it appears on the birth certificate; otherwise, a subject mismatch will occur if the research subject enrolls in other autism research studies. When generating GUIDs for twin subjects, the Get GUIDs for Multiple Subjects function must be used as described below.

The following resources are available for researchers to provide more information to subjects who are consenting to share their data with NDAR.

Steps to Create a GUID:

  1. Get Access - The investigator or data manager in a research lab requests an NDAR account with the GUID privilege through signup or by emailing The NDA Help Desk. No signature or other access is required.
  2. Run the GUID tool - Access the NDAR public website, click "GUID Tool" on the Quick Navigation menu (under the Resources sub-menu), and in step #4 you will find "Launch GUID" which will download and run the software. A recent version of Java is needed to run the tool (see http://java.com/en/download/manual.jsp) and you must be logged into NDAR.
  3. Enter the PII - After the software loads, read and accept the software transfer agreement, enter the PII from the birth certificate, and select "Generate GUID". The software requires double entry of the PII. If you will be generating GUIDs for more than one subject, you will likely want to use the "Get GUIDs for Multiple Subjects" function with the GUID sample template.
  4. Generate GUID - When you select "Generate GUID", the software creates one-way hash codes that are sent to NDAR. No PII ever leaves your computer. Based upon these hash codes, NDAR will create a new GUID (if the hash codes were never seen before) or return an existing GUID (if the hash codes were seen before).
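
The idea behind step 4 can be illustrated with a minimal sketch. The GUID tool's actual hashing scheme is not described on this page, so the use of SHA-256, the choice of fields in a single hash, the normalization, and the function names below are all assumptions made for illustration only.

```python
import hashlib


def normalize(value: str) -> str:
    """Ignore case and special characters (assumed normalization)."""
    return "".join(ch for ch in value.casefold() if ch.isalnum())


def one_way_hash(*fields: str) -> str:
    """Hash normalized fields with SHA-256 (an assumed algorithm)."""
    joined = "|".join(normalize(field) for field in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Made-up subject; only the digest would be transmitted, never the PII.
digest = one_way_hash("Jane", "Ann", "Doe", "2001-02-03", "F", "Springfield")
same = one_way_hash("JANE", "ann", "D'oe", "2001/02/03", "f", "springfield")
print(digest == same)  # True
```

Because the digest cannot practically be reversed, a server that receives only the digest can recognize a repeat subject without ever seeing the name or birth date.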

GUID Tool

Useful GUID Features and Functions

  • Get PseudoGUID - The NDAR GUID is generated based upon a subject's personally identifiable information (PII). However, for some projects the consent given is not sufficient or the PII collected did not include all of the fields needed to generate a standard NDAR GUID. To account for these occurrences, the Get PseudoGUID function is provided. A PseudoGUID is a random identifier that can be promoted to a GUID once the appropriate consent is provided and the necessary PII fields are completed. If hundreds of pseudoGUIDs are needed, contact The NDA Help Desk.
  • Get GUIDs for Multiple Subjects - The GUID interface requires double entry, which makes it practical only when entering a few subjects at a time. For research sites that have already collected participant PII, the software can generate GUIDs for multiple subjects at once. A CSV file using the GUID sample template is required. For twins, enter NO in the Use Existing GUID column for the second twin. Enter YES for all other subjects.
  • Promote PseudoGUID(s) - This function allows you to specify a PseudoGUID along with the PII. It links the PseudoGUID to a standard NDAR GUID, so that NDAR recognizes the two identifiers as the same individual. Multiple PseudoGUIDs can be linked using the PseudoGUID sample template.
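
As a rough sketch of preparing a batch file for Get GUIDs for Multiple Subjects, the snippet below writes a CSV and applies the twin rule described above. The column names and the `second_twin` flag are hypothetical; the real layout is defined by the GUID sample template, which should be used instead.

```python
import csv
import io

# Hypothetical column names; the real GUID sample template defines its own.
PII_COLUMNS = ["FIRSTNAME", "MIDDLENAME", "LASTNAME", "DOB", "SEX", "CITY"]
FLAG_COLUMN = "USE_EXISTING_GUID"


def to_csv(subjects):
    """Write subjects to CSV text, flagging second twins with NO."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=PII_COLUMNS + [FLAG_COLUMN], lineterminator="\n"
    )
    writer.writeheader()
    for subject in subjects:
        row = {column: subject[column] for column in PII_COLUMNS}
        # Rule from the text: NO for the second twin, YES for everyone else.
        row[FLAG_COLUMN] = "NO" if subject.get("second_twin") else "YES"
        writer.writerow(row)
    return buffer.getvalue()


# Made-up twins who share every field except the first name.
twins = [
    {"FIRSTNAME": "Alex", "MIDDLENAME": "Lee", "LASTNAME": "Doe",
     "DOB": "2001-02-03", "SEX": "M", "CITY": "Springfield"},
    {"FIRSTNAME": "Adam", "MIDDLENAME": "Lee", "LASTNAME": "Doe",
     "DOB": "2001-02-03", "SEX": "M", "CITY": "Springfield",
     "second_twin": True},
]
print(to_csv(twins))
```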

The GUID system was conceived by the Simons Foundation Autism Research Initiative (SFARI) and implemented by NDAR.

Review the publication Using Global Unique Identifiers to Link Autism Collections (Johnson et al. 2010) for more information on the GUID. Note that the GUID matching sensitivity has been reduced from the original design.

The following must match - excluding case and special characters - to produce the same GUID:

  • Sex, Legal Name, Date of Birth, and Community of Birth

A match on any of the following combinations will also produce the same GUID, but the user will be asked to confirm issuance of the same GUID:

  • Sex, Legal Name, Date of Birth
  • Sex, Legal First and Last Name, Date of Birth, and Community of Birth
  • Sex, Legal First Name First Initial, Middle Name, Last Name, Date of Birth, and Community of Birth
  • Sex, Legal Name, Date of Birth excluding Year, and Community of Birth
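
One way to read these matching rules is as a three-way comparison of two PII records. The sketch below is an interpretation of the lists above, not the GUID system's implementation (which compares hash codes rather than raw PII); the `Subject` class, the `compare` function, and the result labels are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Subject:
    sex: str
    first: str
    middle: str
    last: str
    dob: date
    community: str


def norm(text: str) -> str:
    """Ignore case and special characters when comparing."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def compare(a: Subject, b: Subject) -> str:
    """Return 'same', 'confirm', or 'different' for two records."""
    sex = norm(a.sex) == norm(b.sex)
    first = norm(a.first) == norm(b.first)
    initial = norm(a.first)[:1] == norm(b.first)[:1]
    middle = norm(a.middle) == norm(b.middle)
    last = norm(a.last) == norm(b.last)
    dob = a.dob == b.dob
    month_day = (a.dob.month, a.dob.day) == (b.dob.month, b.dob.day)
    community = norm(a.community) == norm(b.community)
    full_name = first and middle and last

    # Every field matches: same GUID with no confirmation.
    if sex and full_name and dob and community:
        return "same"
    # Any one of the four partial combinations: same GUID after confirmation.
    partial = (
        sex and full_name and dob,
        sex and first and last and dob and community,
        sex and initial and middle and last and dob and community,
        sex and full_name and month_day and community,
    )
    return "confirm" if any(partial) else "different"


jane = Subject("F", "Jane", "Ann", "Doe", date(2001, 2, 3), "Springfield")
moved = Subject("F", "JANE", "ann", "Doe", date(2001, 2, 3), "Shelbyville")
print(compare(jane, moved))  # confirm
```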

Resolve Subject Identifiers

The autism research community has standardized on the NDAR GUID for cross-project subject identifiers. However, many other identifiers will remain in use by the research community. To resolve the appropriate GUID/subjectkey and ensure that no duplicate subjects exist in data retrieved from NDAR, use the Resolve Subject Identifiers interface (single or multiple entries using the CSV template).

If a match is not found, NDAR has not yet received that subject identifier. To add subject identifier associations to NDAR, include the subject identifier and submit to NDAR using one of our Resolve Identifiers data structures (e.g. ndar_subject, genomics_subject). We will then resolve the identifiers for you within NDAR allowing us to collectively fix duplicate subject identifiers that are being used. Note that NDAR PseudoGUID promotions are automatically applied to the repository. Once submitted and shared, you will receive all related subjects when using the ndar.nih.gov query and download. If a source exists that we currently don't support, please contact us at The NDA Help Desk so that we can add it.


Data Definition

Working with the autism research community, NDAR has created a data dictionary containing data standards for hundreds of assessments. To contribute data to NDAR, a data structure must be defined that will support the data within your lab. We encourage the definition of new data structures (see instructions), but whenever possible, existing data structures should be used. Should the current definition fail to meet your needs, please contact us and we will resolve the discrepancy.

Here are a few notes about the data dictionary:

  • You can browse assessments by Type (e.g. genomics, imaging, clinical), Source (e.g. NDAR, PediatricMRI, AGRE) or Category (e.g. Behavior, IQ) to identify available data structures.
  • By clicking a data structure, you can view earlier versions of the assessment, download the detailed definition, and see any related URLs.
  • IQ descriptions have been removed by request but are available if needed.
  • Submission indicates that NDAR will accept this data. If Submission is "Not Allowed" a more current measure usually exists.
  • Change History shows all changes to the structure within the last six months.
  • Aliases are other names that the Validation Tool will recognize for a specific data element.
  • NDAR has a translate feature allowing data to be converted from a lab's value to the NDAR recognized value (e.g. Male to M). While this may be helpful to labs that have already collected data using a different set of values, most labs that are not using the autism standard should consider performing this conversion prior to submission.
  • A web service into our Data Dictionary is available. Contact us for instructions to access.
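
For labs that perform the conversion themselves before submission, the alias and translate ideas in the notes above could be approximated along these lines. This is a sketch only: the alias table, the translation table (beyond the Male to M example), and the `harmonize` function are hypothetical, and the real values come from the data dictionary.

```python
# Hypothetical alias table: a lab's column name -> data dictionary element.
ALIASES = {"gender": "sex", "age_months": "interview_age"}

# Hypothetical translation table: a lab's value -> recognized value.
TRANSLATIONS = {"sex": {"male": "M", "female": "F"}}


def harmonize(record: dict) -> dict:
    """Rename aliased columns and translate known values."""
    result = {}
    for column, value in record.items():
        name = ALIASES.get(column.lower(), column.lower())
        mapping = TRANSLATIONS.get(name, {})
        # Values without a translation entry pass through unchanged.
        result[name] = mapping.get(str(value).strip().lower(), value)
    return result


print(harmonize({"Gender": "Male", "age_months": 120}))
# {'sex': 'M', 'interview_age': 120}
```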

Data Validation

NDAR requires all data to be successfully validated prior to submission. The Validation Tool is freely available and does not require an account, so it can be used between sites or by other repositories. A recent version of Java is needed to run the tool (see http://java.com/en/download/manual.jsp). The tool lets you specify where your data resides and then inspects your files for the appropriate data structure (short_name and version are used by NDAR to identify the data structure). Where the Validation Tool finds a match, it will download the associated data dictionary and then validate your data to ensure harmonization. When your data passes validation, you can then create a submission package recognized by NDAR.

A few notes about the Validation Tool:

  • Required fields must be a column in your data and cannot be null/blank. If you do not have the data for that element, you must positively identify that the data is not available as defined by the valid values. If the valid values do not provide such an entry, contact us and we will have it added.
  • Recommended and optional fields will not prevent submission. We would like to receive the data for recommended fields, but they are not required. The Validation Tool provides a warning if data for a recommended field is null and no warning if data for an optional field is null.
  • The Validation Tool will test each cell to ensure that it is harmonized to the autism data dictionary and will validate that the GUID/subjectkey exists within NDAR. Each field must conform to the data element's value range, if one has been defined.
  • The notation of "::" is used to indicate a range. For instance, 0::1200 for interview_age means within a range of 0 to 1200 months old.
  • Associated files (e.g. genomic and imaging files) are not immediately recognized. When you run validation, the tool will check for the existence of the file.
Validation Tool

For information on how to prepare your data for submission to NDAR, see Contribute.


Clinical/Phenotypic Data Standards

NDAR supports an unlimited number of clinical, demographic, and phenotypic data structures associated with autism human subjects research (see autism Data Dictionary). To ensure harmonization, all data submitted to NDAR must conform to this definition. See the steps to data sharing for more information.

Researchers are encouraged to extend these standards by providing NDAR with any new assessments. Simply send new definitions of your data to The NDA Help Desk in a format similar to this example. It is helpful to also provide an electronic copy of the assessment with any instructions and supporting documentation. NDAR staff will then curate the definition and make it available to the research community for data submission (see instructions).


Common Measures

The autism data dictionary contains the measures that are available for data submission. The following measures are encouraged to ensure that common data are acquired - as much as possible - across projects.

For Autism Centers of Excellence II grantees, the ACE Common Measures Version 2 will be used consistently across projects, replacing the original ACE Common Measures, which will be deprecated. The data definition and paper based forms are available at the links below:

Additionally, the following common measures are encouraged because they allow us to properly identify and phenotype subjects, especially control groups.

  • NDAR Subject - This is a general measure that allows you to provide NDAR with a clinical diagnosis. If a clinical diagnosis is not provided elsewhere, which is typical for control subjects, a diagnosis can be provided using this measure. Additionally, this data structure is used to provide NDAR with other subject identifiers (e.g., AGRE, Rutgers, SFARI), allowing us to match subjects across repositories (see Resolve Identifiers).
  • Genomics_Subject - Required for genomics submission, this data structure is similar to NDAR Subject, but accepts other data/bio-repository identifiers allowing us to match subjects across repositories (see Resolve Identifiers).
  • Genetic Test - This form allows you to specify the genetic test used and its result, encompassing all known genetic tests (contact us if a new test is not represented in this form).

Neuro-signal Recordings

NDAR now supports evoked response/event-based data from EEG, fMRI, and eye tracking experiments. Submission of this data requires you to first create an experiment: edit your collection by selecting [Edit] from within Data from Labs, log in, and then select the Experiment tab. There, select [Add New] to create a new experiment. Once the experiment is created, include the experiment ID that is returned by NDAR in your submission of neuro-signal recordings data. Contact us at The NDA Help Desk for more information on how to provide and share your neuro-signal recordings data.


Imaging Standards

The NIMH Data Archive (NDA) supports many different types of imaging formats. For structural MRI, resting state fMRI, DTI, and spectroscopy, use the NDA Image Data Definition (image03). This definition expects certain header information to be provided. Additionally, the NDA now provides imaging QA for submitted imaging files using the FSL Fast/First computational pipelines. These quality assurance results are now available for query and download under 'Evaluated Data' in the Data Dictionary.


Genomics and -omics Definition

For -omics, use the experiment definition tool to define your experimental parameters. Within Data from Labs, select [Edit], log in, and then select the Experiment tab. There, select [Add New] to create a new experiment. Once the experiment is created, include the experiment ID that is returned by NDAR in your submission of -omics data (see -omics definition). Contact us at The NDA Help Desk for more information.