NDAR provides a single point of access to de-identified autism research data. To download data, you need an NDAR account with approved access to NDAR or a connected repository (AGRE, IAN, or the ATP). For NDAR access, you must be a research investigator sponsored by an NIH-recognized institution with a Federalwide Assurance. See Request Access for more information.

Grant numbers should be entered in the following format: Activity Code, Institute Abbreviation, and Serial Number. Grant Type, Support Year, and Suffix should be excluded. For example, grant 1R01MH123456-01A1 should be entered as R01MH123456.
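
The trimming rule above can be sketched in Python. This is only an illustration, not part of NDAR: the function name `normalize_grant_number`, the regular expression, and the assumed field widths (a one-digit grant type, a three-character activity code, a two-letter institute abbreviation, and a five- or six-digit serial number) are assumptions inferred from the example grant numbers on this page.

```python
import re

# Assumed layout of a full grant number such as 1R01MH123456-01A1:
# [grant type][activity code][institute][serial number]-[support year][suffix]
# The field widths below are assumptions for illustration only.
_GRANT_RE = re.compile(
    r"""
    ^\s*
    (?P<grant_type>\d)?               # optional grant type, e.g. 1
    (?P<activity>[A-Z][A-Z0-9]{2})    # activity code, e.g. R01
    (?P<institute>[A-Z]{2})           # institute abbreviation, e.g. MH
    (?P<serial>\d{5,6})               # serial number, e.g. 123456
    (?:-(?P<year>\d{2})               # optional support year, e.g. 01
        (?P<suffix>[A-Z0-9]*))?       # optional suffix, e.g. A1
    \s*$
    """,
    re.VERBOSE,
)


def normalize_grant_number(raw: str) -> str:
    """Return Activity Code + Institute Abbreviation + Serial Number only."""
    match = _GRANT_RE.match(raw.upper())
    if match is None:
        raise ValueError(f"unrecognized grant number: {raw!r}")
    # Grant type, support year, and suffix are deliberately dropped.
    return match["activity"] + match["institute"] + match["serial"]


if __name__ == "__main__":
    print(normalize_grant_number("1R01MH123456-01A1"))
```

Under these assumptions, an input that is already in the short form (for example R01MH123456) is returned unchanged, and anything that does not match the assumed layout raises `ValueError` rather than being guessed at.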

Evan Eichler eee@gs.washington.edu Shared

Collection Owners and those with Collection Administrator permission may edit a collection. The following is currently available for editing:


Title, investigators, and Collection Description may be edited along with the Collection Phase. For Collection Phase, the options Pre-enrollment, Enrollment, and Completed are available, allowing the Collection Owner to indicate the stage of data collection. A collection with a Collection State of Private is not visible to the community. NDAR Staff will change the Collection State to Shared once the agreed Data Expected date has been reached for any measure. The Collection State of Ongoing Study indicates that data collection is ongoing and that the data will not be shared as defined by NDAR SOP #9.

Funding Source

The funding source for the project can be associated here. For NIH-funded grants, this information will be completed for the investigator by linking the grant using the project information (e.g. R01MH123456). Non-NIH-funded projects will become available here so that their data can be linked with the appropriate funding agency.

Clinical Trials

For clinical trials, the clinical trial associated with the collection data is provided.

Collection Summary

Collection Title: Genomic Identification of Autism Loci
Collection Investigators: Evan E Eichler (Owner: Eichler, Evan)
Collection Description: The goal of this grant is to use a targeted approach to identify the genes responsible for autism. Three different approaches were put forward to focus on genomic regions associated with autism; each was a dedicated specific aim.


http://www.nature.com/ng/journal/v43/n6/full/ng.835.html | Software | Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations | Qualified Researchers
http://www.nature.com/ng/journal/v43/n6/full/ng.835.html | Publication | Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations | Qualified Researchers
sequencing_files_readme_col_1878.pdf | Background | Readme for sequencing files | Qualified Researchers

R01HD65285-01 | Genomic Identification of Autism Loci | 09/30/2009 | 08/31/2011 | UNIVERSITY OF WASHINGTON


Omics and EEG experiments can be defined here by selecting "Add New". fMRI and Eye Tracking experiments will be offered in fiscal year 2014. Once an experiment is created, raw files for these types of experiments should be provided, with each file associated with the experiment through its Experiment_ID (see category experimental).

ID | Name | Creation Date | Status | Type
45 | Exome Sequencing of 20 Sporadic Cases of Autism Spectrum Disorder | Jul 15, 2011 | Approved | Omics
64 | Refinement and discovery of new hotspots of copy number variation associated with autism spectrum disorder | Jul 24, 2012 | Approved | Omics
83 | Molecular Inversion Probe Resequencing ASD1 Probe Set | May 23, 2013 | Approved | Omics
84 | Molecular Inversion Probe Resequencing ASD2 Probe Set | May 23, 2013 | Approved | Omics
85 | Molecular Inversion Probe Resequencing ASD1/2 Combined Probe Set | May 23, 2013 | Approved | Omics
92 | Whole Exome Sequencing of SSC samples | Jul 26, 2013 | Approved | Omics

Shared Data

Data structures that are currently shared are listed.

Genomics Sample | Genomics | 5618
Genomics Subject | Genomics | 5107


Publications relevant to NDAR data are listed below. Use the (+) to add new publications. Once saved, a publication can be linked to the underlying data in NDAR by selecting the Create Study link, which provides the ability to define cohorts, assign subjects, define outcome measures, and list the study type, data analysis, and results. Analyzed data and results may be shared with the Community through NDAR in this way.

R01HD65285-01 Genomic Identification of Autism Loci 09/30/2009 08/31/2011 UNIVERSITY OF WASHINGTON
Zerr, Troy; Cooper, Gregory M; Eichler, Evan E; Nickerson, Deborah A  "Bioinformatics (Oxford, England)"  Targeted interrogation of copy number variation using SCIMMkit.  (E)PubDate: 10/21/2009
Girirajan, Santhosh; Rosenfeld, Jill A; Cooper, Gregory M; Antonacci, Francesca; Siswara, Priscillia; Itsara, Andy; Vives, Laura; Walsh, Tom; McCarthy, Shane E; Baker, Carl; Mefford, Heather C; Kidd, Jeffrey M; Browning, Sharon R; Browning, Brian L; Dickel, Diane E; Levy, Deborah L; Ballif, Blake...  "Nature genetics"  A recurrent 16p12.1 microdeletion supports a two-hit model for severe developmental delay.  (E)PubDate: 02/14/2010
Rosenfeld, Jill A; Coppinger, Justine; Bejjani, Bassem A; Girirajan, Santhosh; Eichler, Evan E; Shaffer, Lisa G; Ballif, Blake C  "Journal of neurodevelopmental disorders"  Speech delays and behavioral problems are the predominant features in individuals with developmental delays and 16p11.2 microdeletions and microduplications.  (E)PubDate: 03/19/2010

Data Expected

Below are the data expected, received and shared by NDAR. For any changes to data expected, please contact the NDAR helpdesk at ndarmail.nih.gov. Soon, you will be able to edit these data yourself.

Data Expected | Targeted Enrollment | Initial Submission Date | Subjects Submitted | Initial Share Date | Subjects Shared
aCGH Raw | 3088 | Jul 22, 2012 | 13147 | Dec 31, 2012 | 10725
Sequencing (Raw) | 1320 | Dec 20, 2012 | 13147 | Jan 15, 2012 | 10725
Sequencing Analyzed | 1320 | Dec 20, 2012 | 0 | Apr 20, 2013 | 0
aCGH Analyzed | 3088 | Jul 31, 2013 | 0 | Dec 31, 2013 | 0


Specific submissions are listed.

Predictors of self-injurious behaviour exhibited by individuals with autism spectrum disorder Presence of an autism spectrum disorder is a risk factor for development of self-injurious behaviour (SIB) exhibited by individuals with developmental disorders. The most salient SIB risk factors historically studied within developmental disorders are level of intellectual disability, communication deficits and presence of specific genetic disorders. Recent SIB research has expanded the search for risk factors to include less commonly studied variables for people with developmental disorders: negative affect, hyperactivity and impulsivity. 612
Refinement and discovery of new hotspots of copy-number variation associated with autism spectrum disorder. Rare copy-number variants (CNVs) have been implicated in autism and intellectual disability. These variants are large and affect many genes but lack clear specificity toward autism as opposed to developmental-delay phenotypes. We exploited the repeat architecture of the genome to target segmental duplication-mediated rearrangement hotspots (n = 120, median size 1.78 Mbp, range 240 kbp to 13 Mbp) and smaller hotspots flanked by repetitive sequence (n = 1,247, median size 79 kbp, range 3-96 kbp) in 2,588 autistic individuals from simplex and multiplex families and in 580 controls. Our analysis identified several recurrent large hotspot events, including association with 1q21 duplications, which are more likely to be identified in individuals with autism than in those with developmental delay (p = 0.01; OR = 2.7). Within larger hotspots, we also identified smaller atypical CNVs that implicated CHD1L and ACACA for the 1q21 and 17q12 deletions, respectively. Our analysis, however, suggested no overall increase in the burden of smaller hotspots in autistic individuals as compared to controls. By focusing on gene-disruptive events, we identified recurrent CNVs, including DPP10, PLCB1, TRPM1, NRXN1, FHIT, and HYDIN, that are enriched in autism. We found that as the size of deletions increases, nonverbal IQ significantly decreases, but there is no impact on autism severity; and as the size of duplications increases, autism severity significantly increases but nonverbal IQ is not affected. The absence of an increased burden of smaller CNVs in individuals with autism and the failure of most large hotspots to refine to single genes is consistent with a model where imbalance of multiple genes contributes to a disease state. 3,285
Transmission disequilibrium of small CNVs in simplex autism. Cohorts: 411 ASD Quads from Simons Simplex Collection; 177 Quads from Sanders et al. (PubMed ID: 22495306); 166 Quads from I. Iossifov et al. (PubMed ID: 22542183); 71 Quads from O'Roak et al. (PubMed ID: 22495309). Publication Abstract: We searched for disruptive, genic rare copy-number variants (CNVs) among 411 families affected by sporadic autism spectrum disorder (ASD) from the Simons Simplex Collection by using available exome sequence data and CoNIFER (Copy Number Inference from Exome Reads). Compared to high-density SNP microarrays, our approach yielded ~2× more smaller genic rare CNVs. We found that affected probands inherited more CNVs than did their siblings (453 versus 394, p = 0.004; odds ratio [OR] = 1.19) and that the probands' CNVs affected more genes (921 versus 726, p = 0.02; OR = 1.30). These smaller CNVs (median size 18 kb) were transmitted preferentially from the mother (136 maternal versus 100 paternal, p = 0.02), although this bias occurred irrespective of affected status. The excess burden of inherited CNVs among probands was driven primarily by sibling pairs with discordant social-behavior phenotypes (p < 0.0002, measured by Social Responsiveness Scale [SRS] score), which contrasts with families where the phenotypes were more closely matched or less extreme (p > 0.5). Finally, we found enrichment of brain-expressed genes unique to probands, especially in the SRS-discordant group (p = 0.0035). In a combined model, our inherited CNVs, de novo CNVs, and de novo single-nucleotide variants all independently contributed to the risk of autism (p < 0.05). Taken together, these results suggest that small transmitted rare CNVs play a role in the etiology of simplex autism. Importantly, the small size of these variants aids in the identification of specific genes as additional risk factors associated with ASD. 1,643
Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. In addition to cohorts of Parents (N=418), Siblings (N=50), and Probands (N=209), the publication describes a subset of male (N=47) and female (N=26) autistic subjects, with significant impairment and intellectual disability, and with cognitive impairment, respectively. These subsets have been defined based on hi/low IQ value in Supplementary Table 1 from the publication. Publication Abstract: It is well established that autism spectrum disorders (ASD) have a strong genetic component; however, for at least 70% of cases, the underlying genetic cause is unknown. Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes--so-called sporadic or simplex families--we sequenced all coding regions of the genome (the exome) for parent-child trios exhibiting sporadic ASD, including 189 new trios and 20 that were previously reported. Additionally, we also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported trios (n = 19), for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modest increased risk for children of older fathers to develop ASD. Moreover, 39% (49 of 126) of the most severe or disruptive de novo mutations map to a highly interconnected β-catenin/chromatin remodelling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes: CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3 and SCN1A. Combined with copy number variant (CNV) data, these results indicate extreme locus heterogeneity but also provide a target for future discovery, diagnostics and therapeutics. 677
Multiplex Targeted Sequencing Identifies Recurrently Mutated Genes in Autism Spectrum Disorders Abstract: Exome sequencing studies of autism spectrum disorders (ASDs) have identified many de novo mutations but few recurrently disrupted genes. We therefore developed a modified molecular inversion probe method enabling ultra-low-cost candidate gene resequencing in very large cohorts. To demonstrate the power of this approach, we captured and sequenced 44 candidate genes in 2446 ASD probands. We discovered 27 de novo events in 16 genes, 59% of which are predicted to truncate proteins or disrupt splicing. We estimate that recurrent disruptive mutations in six genes (CHD8, DYRK1A, GRIN2B, TBR1, PTEN, and TBL1XR1) may contribute to 1% of sporadic ASDs. Our data support associations between specific genes and reciprocal subphenotypes (CHD8-macrocephaly and DYRK1A-microcephaly) and replicate the importance of a β-catenin/chromatin-remodeling network to ASD etiology. 3,262
Derivation of Quality Measures for Structural Images by Neuroimaging Pipelines Using the National Database for Autism Research cloud platform, MRI data were analyzed using neuroimaging pipelines that included packages available as part of the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC) Computational Environment to derive standardized measures of MR image quality. Structural QA was performed according to Haselgrove, et al (http://journal.frontiersin.org/Journal/10.3389/fninf.2014.00052/abstract) to provide values for Signal to Noise (SNR) and Contrast to Noise (CNR) Ratios that can be compared between subjects within NDAR and between other public data releases. 923
Derivation of Brain Structure Volumes from MRI Neuroimages hosted by NDAR using C-PAC pipeline and ANTs An automated pipeline was developed to reference Neuroimages hosted by the National Database for Autism Research (NDAR) and derive volumes for distinct brain structures using Advanced Normalization Tools (ANTs) and the Configurable-Pipeline for the Analysis of Connectomes (C-PAC) platform. This pipeline utilized the ANTs cortical thickness methodology discussed in "Large-Scale Evaluation of ANTs and Freesurfer Cortical Thickness Measurements" [http://www.ncbi.nlm.nih.gov/pubmed/24879923] to extract a cortical thickness volume from T1-weighted anatomical MRI data gathered from the NDAR database. This volume was then registered to a stereotaxic-space anatomical template (OASIS-30 Atropos Template) which was acquired from the Mindboggle Project webpage [http://mindboggle.info/data.html]. After registration, the mean cortical thickness was calculated at 31 ROIs on each hemisphere of the brain using the Desikan-Killiany-Tourville (DKT-31) cortical labelling protocol [http://mindboggle.info/faq/labels.html] over the OASIS-30 template. As a result, each subject that was processed has a cortical thickness volume image and a text file with the mean thickness ROIs (in mm) stored in Amazon Web Services (AWS) Simple Storage Service (S3). Additionally, these results were tabulated in an AWS-hosted database (through NDAR) to enable simple, efficient querying and data access. All of the code used to perform this analysis is publicly available on GitHub [https://github.com/FCP-INDI/ndar-dev]. Additionally, as a computing platform, we developed an Amazon Machine Image (AMI) that comes fully equipped to run this pipeline on any dataset. Using AWS Elastic Cloud Computing (EC2), users can launch our publicly available AMI ("C-PAC with benchmark", AMI ID: "ami-fee34296", N. Virginia region) and run the ANTs cortical thickness pipeline. The AMI is fully compatible with Sun Grid Engine as well; this enables users to perform many pipeline runs in parallel over a cluster-computing framework. 1,943
The burden of de novo coding mutations in autism spectrum disorder We have sequenced exomes from ~2,500 simplex families with children on the autistic spectrum. The de novo (DN) mutation rate increases as either parent ages, and three-quarters come from the father. Affected children have increased incidence of de novo (DN) missense and 'likely gene-disrupting' (LGD) mutations compared to siblings. Virtually all LGD mutations occur opposite wild-type alleles. We estimate that 42% of DN LGD and 13% of DN missense mutations contribute to 9% and 12% of diagnoses, respectively. Including copy number variants, DN mutation in coding sequence contributes nearly 30% overall and 45% of female diagnoses. Males with DN LGDs have lower IQ. Their targets overlap with targets in females on the spectrum and individuals with intellectual disability or schizophrenia, but not significantly with targets in males with higher IQ. We estimate the size of targets of DN LGD mutation in affected females or low IQ males to be ~400 genes, with a similarly sized set of missense targets. The two sets overlap, and are enriched for genes associated with the Fragile X mental retardation protein and embryonically expressed genes. Targets expressed in early embryonic development are enriched mainly in females. DOI for study: 10.15154/1149697 15
Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations Evidence for the etiology of autism spectrum disorders (ASDs) has consistently pointed to a strong genetic component complicated by substantial locus heterogeneity. We sequenced the exomes of 20 individuals with sporadic ASD (cases) and their parents, reasoning that these families would be enriched for de novo mutations of major effect. We identified 21 de novo mutations, 11 of which were protein altering. Protein-altering mutations were significantly enriched for changes at highly conserved residues. We identified potentially causative de novo events in 4 out of 20 probands, particularly among more severely affected individuals, in FOXP1, GRIN2B, SCN1A and LAMC3. In the FOXP1 mutation carrier, we also observed a rare inherited CNTNAP2 missense variant, and we provide functional support for a multi-hit model for disease risk. Our results show that trio-based exome sequencing is a powerful approach for identifying new candidate genes for ASDs and suggest that de novo mutations may contribute substantially to the genetic etiology of ASDs. 60
Detection of structural variants and indels within exome data. We report an algorithm to detect structural variation and indels from 1 base pair (bp) to 1 Mbp within exome sequence data sets. Splitread uses one end-anchored placements to cluster the mappings of subsequences of unanchored ends to identify the size, content and location of variants with high specificity and sensitivity. The algorithm discovers indels, structural variants, de novo events and copy number-polymorphic processed pseudogenes missed by other methods. 60
Test Study for SSC_Subject01 Testing that SSC_Subject01 will be packaged as part of study if subjects from SFARI (collection 2068) are selected when adding subjects to cohort. Cohorts not on individual subject level
Chromosome 15 genes in autism Human chromosome 15q11-q13 is a hotspot for autism genes. Three different genetic aberrations increase autism susceptibility: duplication of 15q11-q13, mutations to UBE3A or maternally inherited deletion of 15q11-q13 (Angelman syndrome) or deletion of the paternally inherited region 15q11-q13. At least one autism susceptibility gene is thought to lie in this interval. We are interested in variations in regulatory regions, including coding regions of genes in 15q11-q13. We propose that these variations may dysregulate the autism susceptibility genes. This may cause a change in their epigenetic state (i.e. imprinting and allele specific expression) or may change their level of expression. 10
Evan Eichler

A Study allows you to define the specific subjects associated with your experimental data, so that you can share data on the subjects specific to a publication or result. When created, a Study is private and available only to the person who created it. However, basic information (e.g. description, name, investigators) will be available through the study's URL, which you can copy using the link and share in your publication. A study will remain private until you choose to have it shared, ideally at the time of your publication of a result.

Note that it is now possible to share your private data at the subject level using the Data from Papers capability, which allows subject/measure level data to be shared at the time of publication without sharing all data from a lab.