NIMH Data Archive - Data


New Project
Grant/Project Number

Enter the grant number as the Activity Code, Institute Abbreviation, and Serial Number. Exclude the Grant Type, Support Year, and Suffix. For example, grant 1R01MH123456-01A1 should be entered as R01MH123456.
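The formatting rule above can be sketched in code. The following Python snippet is a minimal illustration, not NDA's actual validation logic. The function name `normalize_grant_number` and the regular expression are assumptions inferred from the single example given (an optional one-digit Grant Type, a three-character Activity Code, a two-letter Institute Abbreviation, a six-digit Serial Number, and an optional Support Year and Suffix).

```python
import re

# Hypothetical pattern inferred from the example 1R01MH123456-01A1.
# Real grant numbers may have variations this sketch does not cover.
_GRANT_RE = re.compile(
    r"""
    ^\s*
    (?P<grant_type>\d)?                 # Grant Type, e.g. "1" (excluded)
    (?P<activity>[A-Z][A-Z0-9]{2})      # Activity Code, e.g. "R01"
    (?P<institute>[A-Z]{2})             # Institute Abbreviation, e.g. "MH"
    (?P<serial>\d{6})                   # Serial Number, e.g. "123456"
    (?:-(?P<year>\d{2})                 # Support Year, e.g. "01" (excluded)
       (?P<suffix>[A-Z0-9]*))?          # Suffix, e.g. "A1" (excluded)
    \s*$
    """,
    re.VERBOSE,
)


def normalize_grant_number(raw: str) -> str:
    """Return Activity Code + Institute Abbreviation + Serial Number."""
    match = _GRANT_RE.match(raw.upper())
    if match is None:
        raise ValueError(f"Unrecognized grant number: {raw!r}")
    return match["activity"] + match["institute"] + match["serial"]


print(normalize_grant_number("1R01MH123456-01A1"))
```

Under these assumptions, a value that is already in the short form passes through unchanged, and anything that does not fit the assumed pattern raises an error instead of being silently accepted.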

Collection - Use Existing Experiment

To associate an experiment with the current collection, select an experiment from the table below, then click the associate experiment button to persist your changes (saving the collection is not required). Note that once an experiment has been associated with two or more collections, the experiment will no longer be editable.

The table search feature is case insensitive and targets the experiment id, experiment name and experiment type columns. The experiment id is searched only when the search term entered is a number, and filtered using a startsWith comparison. When the search term is not numeric the experiment name is used to filter the results.
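The search behavior described above can be sketched as follows. This is an illustrative Python model, not the page's actual implementation. The `Experiment` class and `search_experiments` function are hypothetical names, and matching the experiment name by case-insensitive substring is an assumption, since the text only says the name "is used to filter the results". Filtering on the experiment type column is not described in enough detail to model, so it is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    experiment_id: int
    name: str
    experiment_type: str


def search_experiments(experiments, term):
    """Filter experiments the way the table search is described to behave."""
    term = term.strip()
    if not term:
        return list(experiments)
    if term.isdecimal():
        # Numeric term: compare against the experiment id with startsWith.
        return [e for e in experiments if str(e.experiment_id).startswith(term)]
    # Non-numeric term: case-insensitive match on the name (assumed substring).
    needle = term.casefold()
    return [e for e in experiments if needle in e.name.casefold()]


# A few rows taken from the table on this page.
SAMPLE = [
    Experiment(24, "HI-NGS_R1", "Omics"),
    Experiment(501, "PharmacoBOLD Resting State", "fMRI"),
    Experiment(506, "PVPREF", "Omics"),
    Experiment(509, "ABC-CT Resting v2", "EEG"),
    Experiment(519, "Resting", "fMRI"),
]

for hit in search_experiments(SAMPLE, "50"):
    print(hit.experiment_id, hit.name)
```

With these sample rows, the numeric term `50` selects only ids that begin with those digits, while a term such as `resting` is compared against names regardless of case.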

Experiment Id	Experiment Name	Experiment Type	Created On

24	HI-NGS_R1	Omics	02/16/2011
475	MB1-10 (CHOP)	Omics	06/07/2016
490	Illumina Infinium PsychArray BeadChip Assay	Omics	07/07/2016
501	PharmacoBOLD Resting State	fMRI	07/27/2016
506	PVPREF	Omics	08/05/2016
509	ABC-CT Resting v2	EEG	08/18/2016
13	Comparison of FI expression in Autistic and Neurotypical Homo Sapiens	Omics	12/28/2010
18	AGRE/Broad Affymetrix 5.0 Genotype Experiment	Omics	01/06/2011
22	Stitching PCR Sequencing	Omics	02/14/2011
26	ASD_Methylation	Omics	03/01/2011
29	Microarray family 03 (father, mother, sibling)	Omics	03/24/2011
37	Standard paired-end sequencing of BCRs	Omics	04/19/2011
38	Illumina Mate-Pair BCR sequencing	Omics	04/19/2011
39	Custom Jumping Libraries	Omics	04/19/2011
40	Custom CapBP	Omics	04/19/2011
41	Immunofluorescence	Omics	05/11/2011
43	Autism brain sample genotyping, Illumina	Omics	05/16/2011
47	ARRA Autism Sequencing Collaboration at Baylor. SOLiD 4 System	Omics	08/01/2011
53	AGRE Omni1-quad	Omics	10/11/2011
59	AGP genotyping	Omics	04/03/2012
60	Ultradeep 454 sequencing of synaptic genes from postmortem cerebella of individuals with ASD and neurotypical controls	Omics	06/23/2012
63	Microemulsion PCR and Targeted Resequencing for Variant Detection in ASD	Omics	07/20/2012
76	Whole Genome Sequencing in Autism Families	Omics	01/03/2013
519	Resting	fMRI	11/08/2016
90	Genotyped IAN Samples	Omics	07/09/2013
91	NJLAGS Axiom Genotyping Array	Omics	07/16/2013
93	AGP genotyping (CNV)	Omics	09/06/2013
106	Longitudinal Sleep Study. H20 200. Channel set 2	EEG	11/07/2013
107	Longitudinal Sleep Study. H20 200. Channel set 3	EEG	11/07/2013
108	Longitudinal Sleep Study. AURA 200	EEG	11/07/2013
105	Longitudinal Sleep Study. H20 200. Channel set 1	EEG	11/07/2013
109	Longitudinal Sleep Study. AURA 400	EEG	11/07/2013
116	Gene Expression Analysis WG-6	Omics	01/07/2014
131	Jeste Lab UCLA ACEii: Charlie Brown and Sesame Street - Project 1	Eye Tracking	02/27/2014
132	Jeste Lab UCLA ACEii: Animacy - Project 1	Eye Tracking	02/27/2014
133	Jeste Lab UCLA ACEii: Mom Stranger - Project 2	Eye Tracking	02/27/2014
134	Jeste Lab UCLA ACEii: Face Emotion - Project 3	Eye Tracking	02/27/2014
145	AGRE/FMR1_Illumina.JHU	Omics	04/14/2014
146	AGRE/MECP2_Sanger.JHU	Omics	04/14/2014
147	AGRE/MECP2_Junior.JHU	Omics	04/14/2014
151	Candidate Gene Identification in familial Autism	Omics	06/09/2014
152	NJLAGS Whole Genome Sequencing	Omics	07/01/2014
154	Math Autism Study - Vinod Menon	fMRI	07/15/2014
155	Resting	fMRI	07/25/2014
156	Speech	fMRI	07/25/2014
159	Emotion	fMRI	07/25/2014
160	syllable contrast	EEG	07/29/2014
167	School-age naturalistic stimuli	Eye Tracking	09/19/2014
44	AGRE/Broad Affymetrix 5.0 Genotype Experiment	Omics	06/27/2011
45	Exome Sequencing of 20 Sporadic Cases of Autism Spectrum Disorder	Omics	07/15/2011

Collection - Add Experiment

Add Supporting Documentation


To add an existing Data Structure, enter its title in the search bar. If you need to request changes, select the indicator "No, it requires changes to meet research needs" after selecting the Structure, and upload the file with the requested changes specific to the selected Data Structure. Your file should follow the Request Changes Procedure. If the Data Structure does not exist, select "Request New Data Structure" and upload the appropriate zip file.

Use/Modify Existing Data Structure

Request New Data Structure


Request Submission Exemption

Not Eligible

The Data Expected list for this Collection shows some raw data as missing. Contact the NDA Help Desk with any questions.

Please confirm that you will not be enrolling any more subjects and that all raw data has been collected and submitted.

Collection Updated

Your Collection is now in Data Analysis phase and exempt from biannual submissions. Analyzed data is still expected prior to publication or no later than the project end date.


Unable to change collection phase where targeted enrollment is less than 90%

You have requested to move the sharing dates for the following assessments:

Data Expected Item	Original Sharing Date	New Sharing Date

Please provide a reason for this change, which will be sent to the Program Officers listed within this collection:

Explanation must be between 20 and 200 characters in length.
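The length rule above amounts to a simple bounds check. The sketch below is illustrative only. The name `is_valid_explanation` is hypothetical, and treating both limits as inclusive and ignoring leading and trailing whitespace are assumptions that the page does not state.

```python
MIN_LENGTH = 20   # lower bound from the page text (assumed inclusive)
MAX_LENGTH = 200  # upper bound from the page text (assumed inclusive)


def is_valid_explanation(text: str) -> bool:
    """Check that an explanation is between 20 and 200 characters long."""
    # Assumption: surrounding whitespace does not count toward the length.
    return MIN_LENGTH <= len(text.strip()) <= MAX_LENGTH


print(is_valid_explanation("Site closure delayed final data collection."))
```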

Please press Save or Cancel

Sequenced Treatment Alternatives to Relieve Depression (STAR*D) #2148

General
Experiments (0)
Shared Data
Publications (1)
Associated Studies (16)

Collection Title:	Sequenced Treatment Alternatives to Relieve Depression (STAR*D)
Collection Investigators:	A. John Rush
Collection Description:	STAR*D was a multicenter, longitudinal, NIMH-sponsored study. STAR*D was designed to determine the short- and long-term effects of different sequences of medication and/or psychotherapy for the treatment of unipolar depressions that have not responded adequately to an initial standard antidepressant trial. In particular, STAR*D assessed and compared the effectiveness of different sets of treatment options: 1) augmenting the first antidepressant with another medication or psychotherapy, 2) discontinuing the first antidepressant and switching to another antidepressant or psychotherapy.
Data Repository:	NIMH Data Archive
Permission Group:
Collection Creation Date:	09/27/2015
NIH Research Initiative:	NIMH Repository & Genomics Resource (NRGR)
Collection Phase:	Funding Completed
Collection Sub-Phase:	Close Out
Blinded Clinical Trial:	No
Subjects Shared:	4,266


Funding Sources:

Funding Source Name	Funding Source URL
NIH - Contract	None

Supporting Documentation:

File Name	File Type	Description	Audience
Published papers.pdf	Publication	Published Papers	Qualified Researchers
STAR-D Protocol.pdf	Analysis Protocol	Protocol	Qualified Researchers
Clinical Procedures Manual.pdf	Methods	Clinical Procedures Manual	Qualified Researchers
01_Introduction.pdf	Background	Introduction to CT	Qualified Researchers
02_Overview_of_Theoretical_Assumptions.pdf	Background	Theoretical Assumptions	Qualified Researchers
03_STARD_Cognitive_Therapy_Training_Program.pdf	Methods	Training	Qualified Researchers
04_Data_Collection_Procedures_for_Therapists.pdf	Methods	Collection Procedures	Qualified Researchers
05_Cognitive_Therapy_Description_(for_subjects).pdf	Methods	Cog Therapy	Qualified Researchers
Enrollment Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
FollowUp Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
Level 1 Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
Level 2 Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
Level 2a Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
Level 3 Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
Level 4 Matrix.xls	Methods	Assessment Schedules	Qualified Researchers
MedicationLog.xlsx	Other	Concomitant Medication Log	Qualified Researchers
Supporting Documentation.zip	Methods	Supporting Documentation: Clinics and Re-Entry	Qualified Researchers
NDCT Dictionary to STARD Instruments.pdf	Background	NDCT Dictionary to STARD Instruments	Qualified Researchers
IVR_calltype.xlsx	Results	Additional timepoint information	Qualified Researchers

Grant Information:

Clinical Trials:

Brief Summary	Status	Clinical Trial ID	Study ID	Principal Investigator	Start Date	End Date
STAR*D focuses on non-psychotic major depressive disorder in adults who are seen in outpatient settings. The primary purpose of this research study is to determine which treatments work best if the first treatment with medication does not produce an acceptable response. Participants will first receive citalopram, an SSRI medication; if symptoms remain after 8-12 weeks of treatment, up to four other levels of treatment will be offered, including cognitive therapy and other medications. There are no placebo treatments. Some patients may require a combination of two or more treatments to obtain full benefit. Participation could last from 15 to 27 months and involve up to 30 clinic visits. Participants will be interviewed by telephone throughout the study about their symptoms, daily functioning, treatment side effects, use of the health care system, and satisfaction with treatment. There will be a one-year follow up for participants once their depression has been successfully treated	Completed	NCT00021528	N01 MH90003	A. John Rush, MD	July 2001	September 2006


Collection - General Tab

Fields available for edit on the top portion of the page include:

Collection Title
Investigators
Collection Description
Collection Phase
Funding Source
Clinical Trials

Collection Phase: The current status of a research project submitting data to an NDA Collection, based on the timing of the award and/or the data that have been submitted.

Pre-Enrollment: The default entry made when the NDA Collection is created.
Enrolling: Data have been submitted to the NDA Collection or the NDA Data Expected initial submission date has been reached for at least one data structure category in the NDA Collection.
Data Analysis: Subject level data collection for the research project is completed and has been submitted to the NDA Collection. The NDA Collection owner or the NDA Help Desk may set this phase when they’ve confirmed data submission is complete and submitted subject counts match at least 90% of the target enrollment numbers in the NDA Data Expected. Data submission reminders will be turned off for the NDA Collection.
Funding Completed: The NIH grant award (or awards) associated with the NDA Collection has reached its end date. NDA Collections in Funding Completed phase are assigned a subphase to indicate the status of data submission.
- The Data Expected Subphase indicates that NDA expects more data will be submitted.
- The Closeout Subphase indicates that data submission is complete.
- The Sharing Not Met Subphase indicates that data submission was not completed as expected.
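
The 90% condition for the Data Analysis phase can be expressed as a small check. The snippet below is a sketch under stated assumptions, not NDA's implementation. The function names and the `data_expected` mapping (structure title to a pair of submitted and targeted subject counts) are hypothetical, and it assumes that the threshold must be met for every Data Expected item.

```python
from __future__ import annotations


def meets_enrollment_threshold(
    subjects_submitted: int,
    targeted_enrollment: int,
    threshold_percent: int = 90,
) -> bool:
    """True when submitted subjects reach the given share of the target."""
    if targeted_enrollment <= 0:
        raise ValueError("targeted_enrollment must be positive")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return subjects_submitted * 100 >= threshold_percent * targeted_enrollment


def can_enter_data_analysis(data_expected: dict[str, tuple[int, int]]) -> bool:
    """Assumption: every Data Expected item must meet the 90% threshold."""
    return bool(data_expected) and all(
        meets_enrollment_threshold(submitted, target)
        for submitted, target in data_expected.values()
    )


# Hypothetical counts for illustration only.
example = {"Demographics Form": (95, 100), "Screening Form": (40, 50)}
print(can_enter_data_analysis(example))
```

In this hypothetical example the second item sits at 80% of its target, so the phase change would be refused, which corresponds to the "Unable to change collection phase" message.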

Blinded Clinical Trial Status:

This status is set by a Collection Owner and indicates the research project is a double blinded clinical trial. When selected, the public view of Data Expected will show the Data Expected items and the Submission Dates, but the targeted enrollment and subjects submitted counts will not be displayed.
Targeted enrollment and subjects submitted counts are visible only to NDA Administrators and to the NDA Collection Owner.
When an NDA Collection that is flagged Blinded Clinical Trial reaches the maximum data sharing date for that Data Repository (see https://nda.nih.gov/nda/sharing-regimen.html), the embargo on Data Expected information is released.

Funding Source

The organization(s) responsible for providing the funding is listed here.

Supporting Documentation

Users with Submission privileges, as well as Collection Owners, Program Officers, and those with Administrator privileges, may upload and attach supporting documentation. By default, supporting documentation is shared with the general public; however, it can also be limited to qualified researchers only.

Grant Information

Identifying details are displayed about the Project from which the Collection was derived. You may click the Project Number to view a full report of the Project captured by the NIH.

Clinical Trials

Any data that is collected to support or further the research of clinical studies will be available here. Collection Owners and those with Administrator privileges may add new clinical trials.

Frequently Asked Questions

How does the NIMH Data Archive (NDA) determine which Permission Group data are submitted into?

During Collection creation, NDA staff determine the appropriate Permission Group based on the type of data to be submitted, the type of access that will be available to data access users, and the information provided by the Program Officer during grant award.
How do I know when an NDA Collection has been created?

When a Collection is created by NDA staff, an email notification will automatically be sent to the PI(s) of the grant(s) associated with the Collection to notify them.
Is a single grant number ever associated with more than one Collection?

The NDA system does not allow a single grant to be associated with more than one Collection; therefore, a single grant will not be listed in the Grant Information section of more than one Collection.
Why is there sometimes more than one grant included in a Collection?

In general, each Collection is associated with only one grant; however, multiple grants may be associated if the grant has multiple competing segments for the same grant number or if multiple different grants are all working on the same project and it makes sense to hold the data in one Collection (e.g., Cooperative Agreements).

Glossary

Administrator Privilege

A privilege provided to a user associated with an NDA Collection or NDA Study whereby that user can perform a full range of actions including providing privileges to other users.
Collection Owner

Generally, the Collection Owner is the contact PI listed on a grant. Only one NDA user is listed as the Collection owner. Most automated emails are primarily sent to the Collection Owner.
Collection Phase
The Collection Phase provides information on data submission as opposed to grant/project completion, so while the Collection Phase and grant/project phase may be closely related, they are often different. Collection users with Administrative Privileges are encouraged to edit the Collection Phase. The Program Officer as listed in eRA (for NIH funded grants) may also edit this field. Changes must be saved by clicking the Save button at the bottom of the page. This field is sortable alphabetically in ascending or descending order. Collection Phase options include:
- Pre-Enrollment: A grant/project has started, but has not yet enrolled subjects.
- Enrolling: A grant/project has begun enrolling subjects. Data submission is likely ongoing at this point.
- Data Analysis: A grant/project has completed enrolling subjects and has completed all data submissions.
- Funding Completed: A grant/project has reached the project end date.
Collection Title

An editable field with the title of the Collection, which is often the title of the grant associated with the Collection.
Grant

Provides the grant number(s) for the grant(s) associated with the Collection. The field is a hyperlink so clicking on the Grant number will direct the user to the grant information in the NIH Research Portfolio Online Reporting Tools (RePORT) page.
Supporting Documentation

Various documents and materials that enable efficient use of the data by investigators unfamiliar with the project. These may include the research protocol, questionnaires, and study manuals.
NIH Research Initiative

NDA Collections may be organized by scientific similarity into NIH Research Initiatives, to facilitate query tool user experience. NIH Research Initiatives map to one or multiple Funding Opportunity Announcements.
Permission Group

Access to shared record-level data in NDA is provisioned at the level of a Permission Group. NDA Permission Groups consist of one or multiple NDA Collections that contain data with the same subject consents.
Planned Enrollment

Number of human subject participants to be enrolled in an NIH-funded clinical research study. The data is provided in competing applications and annual progress reports.
Actual Enrollment

Number of human subjects enrolled in an NIH-funded clinical research study. The data is provided in annual progress reports.
NDA Collection

A virtual container and organization structure for data and associated documentation from one grant or one large project/consortium. It contains tools for tracking data submission and allows investigators to define a wide array of other elements that provide context for the data, including all general information regarding the data and source project, experimental parameters used to collect any event-based data contained in the Collection, methods, and other supporting documentation. They also allow investigators to link underlying data to an NDA Study, defining populations and subpopulations specific to research aims.
Data Use Limitations

Data Use Limitations (DULs) describe the appropriate secondary use of a dataset and are based on the original informed consent of a research participant. NDA only accepts consent-based data use limitations defined by the NIH Office of Science Policy.
Total Subjects Shared

The total number of unique subjects for whom data have been shared and are available for users with permission to access data.

Contact NDA Help Desk



Collection - Experiments

The number of Experiments included is displayed in parentheses next to the tab name. You may download all experiments associated with the Collection via the Download button. You may view individual experiments by clicking the Experiment Name and add them to the Filter Cart via the Add to Cart button.

Collection Owners, Program Officers, and users with Submission or Administrative Privileges for the Collection may create or edit an Experiment.

Please note: The creation of an NDA Experiment does not necessarily mean that data collected, according to the defined Experiment, has been submitted or shared.

Frequently Asked Questions

Can an Experiment be associated with more than one Collection?
Yes. See the “Copy” button in the bottom left when viewing an experiment. There are two actions that can be performed via this button:
1. Copy the experiment with intent for modifications.
2. Associate the experiment to the collection. No modifications can be made to the experiment.

Glossary

Experiment Status

An Experiment must be Approved before data using the associated Experiment_ID may be uploaded.
Experiment ID

The ID number automatically generated by NDA which must be included in the appropriate file when uploading data to link the Experiment Definition to the subject record.


Shared Data:

Title	Type	Number of Subjects
Adverse Events	Clinical Assessments	75
Clinic Visit	Clinical Assessments	3680
Cumulative Illness Rating Scale	Clinical Assessments	4041
Demographics Form	Clinical Assessments	4040
End of Study Form	Clinical Assessments	1872
Hamilton Rating Scale for Depression	Clinical Assessments	4041
IVR Form	Clinical Assessments	4041
Inventory of Depressive Symptomatology	Clinical Assessments	3890
Level Exit Form	Clinical Assessments	4041
Medication History	Clinical Assessments	4040
Patient-Rated Inventory of Side Effects	Clinical Assessments	3672
Pregnancy Outcome Form	Clinical Assessments	18
Protocol Eligibility	Clinical Assessments	4173
Protocol Violators	Clinical Assessments	3060
Psychiatric Diagnostic Screening Questionnaire	Clinical Assessments	3999
Psychiatric History	Clinical Assessments	4040
Quality of Life Enjoyment and Satisfaction Questionnaire	Clinical Assessments	3818
Quick Inventory of Depressive Symptomatology	Clinical Assessments	4039
Research Outcomes Assessors	Clinical Assessments	1857
Screening Form	Clinical Assessments	4037
Serious Adverse Events	Clinical Assessments	229
Short Form Health Survey	Clinical Assessments	3818
Side Effects	Clinical Assessments	3671
Therapist Checklist	Clinical Assessments	132
Utilization and Cost Questionnaire	Clinical Assessments	3818
Work Productivity and Activity Impairment	Clinical Assessments	3818
Work and Social Adjustment Scale Depression	Clinical Assessments	3818


Collection - Shared Data

This tab provides a quick overview of the Data Structure title, Data Type, and Number of Subjects that are currently Shared for the Collection. The information presented in this tab is automatically generated by NDA and cannot be edited. If no information is visible on this tab, this would indicate the Collection does not have shared data or the data is private.

The shared data is available to other researchers who have permission to access data in the Collection's designated Permission Group(s). Use the Download button to get all shared data from the Collection to the Filter Cart.

Frequently Asked Questions

How will I know if another researcher uses data that I shared through the NIMH Data Archive (NDA)?

To see which data submitted by your project are being used by a study, go to the Associated Studies tab of your Collection. Alternatively, you may review an NDA Study Attribution Report available on the General tab.
Can I get a supplement to share data from a completed research project?

Often it becomes more difficult to organize and format data electronically after the project has been completed and the information needed to create a GUID may not be available; however, you may still contact a program staff member at the appropriate funding institution for more information.
Can I get a supplement to share data from a research project that is still ongoing?

Unlike completed projects where researchers may not have the information needed to create a GUID and/or where the effort needed to organize and format data becomes prohibitive, ongoing projects have more of an opportunity to overcome these challenges. Please contact a program staff member at the appropriate funding institution for more information.

Glossary

Data Structure

A defined organization and group of Data Elements to represent an electronic definition of a measure, assessment, questionnaire, or collection of data points. Data structures that have been defined in the NDA Data Dictionary are available at https://nda.nih.gov/general-query.html?q=query=data-structure
Data Type

A grouping of data by similar characteristics such as Clinical Assessments, Omics, or Neurosignal data.
Shared

The term 'Shared' generally means available to others; however, there are some slightly different meanings based on what is Shared. A Shared NDA Study is viewable and searchable publicly regardless of the user's role or whether the user has an NDA account. A Shared NDA Study does not necessarily mean that data used in the NDA Study have been shared, as this is independently determined. Data are shared according to the schedule defined in a Collection's Data Expected Tab and/or in accordance with data sharing expectations in the NDA Data Sharing Terms and Conditions. Additionally, Supporting Documentation uploaded to a Collection may be shared independent of whether data are shared.


Collection Owners and those with Collection Administrator permission may edit a Collection. The following is currently available for editing on this page:

Publications

Publications relevant to NDA data are listed below. Most displayed publications have been associated with the grant within PubMed. Use the "+ New Publication" button to add new publications. Publications are categorized as relevant or not relevant to the data expected. Relevant publications are then linked to the underlying data by selecting the Create Study link. A Study provides the ability to define cohorts, assign subjects, and define outcome measures, and it lists the study type, data analysis, and results. Analyzed data and results are expected to be shared in this way.

PubMed ID	Study	Title	Journal	Authors	Date	Status
23480315	Study (417)	The clinical relevance of self-reported premenstrual worsening of depressive symptoms in the management of depressed outpatients: a STAR*D report.	Journal of women's health (2002)	Haley CL, Sung SC, Rush AJ, Trivedi MH, Wisniewski SR, Luther JF, Kornstein SG	March 2013	Relevant


Collection - Publications

The number of Publications is displayed in parentheses next to the tab name. Clicking on any of the Publication Titles will open the Publication in a new internet browsing tab.

Collection Owners, Program Officers, and users with Submission or Administrative Privileges for the Collection may mark a publication as either Relevant or Not Relevant in the Status column.

Frequently Asked Questions

How can I determine if a publication is relevant?

Publications are considered relevant to a collection when the data shared is directly related to the project or collection.
Where does the NDA get the publications?

From PubMed, an online library containing journals, articles, and medical research, sponsored by the NIH and the National Library of Medicine (NLM).

Glossary

Create Study

A link to the Create an NDA Study page that can be clicked to start creating an NDA Study with information such as the title, journal and authors automatically populated.
Not Determined Publication

Indicates that the publication has not yet been reviewed and/or marked as Relevant or Not Relevant so it has not been determined whether an NDA Study is expected.
Not Relevant Publication

A publication that is not based on data related to the aims of the grant/project associated with the Collection or not based on any data such as a review article and, therefore, an NDA Study is not expected to be created.
PubMed

PubMed provides citation information for biomedical and life sciences publications and is managed by the U.S. National Institutes of Health's National Library of Medicine.
PubMed ID

The PubMed ID is the unique ID number for the publication as recorded in the PubMed database.
Relevant Publication

A publication that is based on data related to the aims of the grant/project associated with the Collection and, therefore, an NDA Study is expected to be created.


Collection Owners and those with Collection Administrator permission may edit a Collection. The following is currently available for editing on this page:

Associated Studies

Studies that have been defined using data from a Collection are an important criterion for determining the value of the data shared. The Collection/Study Subjects column displays the number of subjects from this Collection that are included in a Study, out of the total number of subjects in that Study. The Data Usage column indicates whether the Study is a primary or a secondary analysis of the data. State indicates whether the Study is private or shared with the research community.
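
The subject counts in this table are written as two numbers separated by a slash, for example 4262/6074. As a rough illustration of how to read them, the Python sketch below splits such a cell and computes the share of a Study's subjects that come from this Collection. The function names are hypothetical and the snippet is not part of any NDA tool.

```python
from __future__ import annotations


def parse_subject_counts(cell: str) -> tuple[int, int]:
    """Split 'collection/study' counts, tolerating thousands separators."""
    collection_part, study_part = cell.strip().split("/")
    from_collection = int(collection_part.replace(",", ""))
    study_total = int(study_part.replace(",", ""))
    return from_collection, study_total


def collection_share(cell: str) -> float:
    """Fraction of a Study's subjects contributed by this Collection."""
    from_collection, study_total = parse_subject_counts(cell)
    if study_total == 0:
        raise ValueError("study total must be greater than zero")
    return from_collection / study_total


# Values taken from the Collection/Study Subjects column on this page.
for cell in ("4262/6074", "4134/5946", "4132/4800"):
    print(cell, f"{collection_share(cell):.1%}")
```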

Study Name	Abstract	Collection/Study Subjects	Data Usage	State
Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies	Importance: Major depressive disorder (MDD) is a heterogeneous condition; multiple underlying neurobiological substrates could be associated with treatment response variability. Understanding the sources of this variability and predicting outcomes has been elusive. Machine learning (ML) has shown promise in predicting treatment response in MDD, but one limitation has been the lack of clinical interpretability of machine learning models, limiting clinician confidence in model results. Objective: To develop a machine learning model to derive treatment-relevant patient profiles using clinical and demographic information. Design: We analyzed data from six clinical trials of pharmacological treatment for depression (total n = 5438) using the Differential Prototypes Neural Network (DPNN), a neural network model that derives patient prototypes which can be used to derive treatment-relevant patient clusters while learning to generate probabilities for differential treatment response. A model classifying remission and outputting individual remission probabilities for five first-line monotherapies and three combination treatments was trained using clinical and demographic data. Setting: Previously-conducted clinical trials of antidepressant medications. Participants: Patients with MDD. Main outcomes and measures: Model validity and clinical utility were measured based on area under the curve (AUC) and expected improvement in sample remission rate with model-guided treatment, respectively. Post-hoc analyses yielded clusters (subgroups) based on patient prototypes learned during training. Prototypes were evaluated for interpretability by assessing differences in feature distributions (e.g. age, sex, symptom severity) and treatment-specific outcomes. Results: A 3-prototype model achieved an AUC of 0.66 and an expected absolute improvement in population remission rate of 6.5% (relative improvement of 15.6%). We identified three treatment-relevant patient clusters. Cluster A patients tended to be younger, to have increased levels of fatigue and more severe symptoms. Cluster B patients tended to be older, female with less severe symptoms, and the highest remission rates. Cluster C patients had more severe symptoms, lower remission rates, more psychomotor agitation, more intense suicidal ideation, more somatic genital symptoms, and showed improved remission with venlafaxine. Conclusion and Relevance: It is possible to produce novel treatment-relevant patient profiles using machine learning models; doing so may improve precision medicine for depression. Note: This model is not currently the subject of any active clinical trials and is not intended for clinical use.	4262/6074	Secondary Analysis	Shared
Treatment selection using prototyping in latent-space with application to depression treatment	Machine-assisted treatment selection commonly follows one of two paradigms: a fully personalized paradigm which ignores any possible clustering of patients; or a sub-grouping paradigm which ignores personal differences within the identified groups. While both paradigms have shown promising results, each of them suffers from important limitations. In this article, we propose a novel deep learning-based treatment selection approach that is shown to strike a balance between the two paradigms using latent-space prototyping. Our approach is specifically tailored for domains in which effective prototypes and sub-groups of patients are assumed to exist, but groupings relevant to the training objective are not observable in the non-latent space. In an extensive evaluation, using both synthetic and Major Depressive Disorder (MDD) real-world clinical data describing 4754 MDD patients from clinical trials for depression treatment, we show that our approach favorably compares with state-of-the-art approaches. Specifically, the model produced an 8% absolute and 23% relative improvement over random treatment allocation. This is potentially clinically significant, given the large number of patients with MDD. Therefore, the model can bring about a much desired leap forward in the way depression is treated today.	4134/5946	Secondary Analysis	Shared
Analysis of Features Selected by a Deep Learning Model for Differential Treatment Selection in Depression	Background: Deep learning has utility in predicting differential antidepressant treatment response among patients with major depressive disorder, yet there remains a paucity of research describing how to interpret deep learning models in a clinically or etiologically meaningful way. In this paper, we describe methods for analyzing deep learning models of clinical and demographic psychiatric data, using our recent work on a deep learning model of STAR*D and CO-MED remission prediction. Methods: Our deep learning analysis with STAR*D and CO-MED yielded four models that predicted response to the four treatments used across the two datasets. Here, we use classical statistics and simple data representations to improve interpretability of the features output by our deep learning model and provide finer grained understanding of their clinical and etiological significance. Specifically, we use representations derived from our model to yield features predicting both treatment non-response and differential treatment response to four standard antidepressants, and use linear regression and t-tests to address questions about the contribution of trauma, education, and somatic symptoms to our models. Results: Traditional statistics were able to probe the input features of our deep learning models, reproducing results from previous research, while providing novel insights into depression causes and treatments. We found that specific features were predictive of treatment response, and were able to break these down by treatment and non-response categories; that specific trauma indices were differentially predictive of baseline depression severity; that somatic symptoms were significantly different between males and females, and that education and low income proved important psycho-social stressors associated with depression. 
Conclusion: Traditional statistics can augment interpretation of deep learning models. Such interpretation can lend us new hypotheses about depression and contribute to building causal models of etiology and prognosis. We discuss dataset-specific effects and ideal clinical samples for machine learning analysis aimed at improving tools to assist in optimizing treatment.	4132/4800	Secondary Analysis	Shared
Differential Treatment Benefit Prediction for Treatment Selection in Depression: A Deep Learning Analysis of STAR*D and CO-MED Data	Depression affects one in nine people, but treatment response rates remain low. There is significant potential in the use of computational modeling techniques to predict individual patient responses and thus provide more personalized treatment. Deep learning is a promising computational technique that can be used for differential treatment selection based on predicted remission probability. Using Sequenced Treatment Alternatives to Relieve Depression (STAR*D) and Combining Medications to Enhance Depression Outcomes (CO-MED) trial data, we employed deep neural networks to predict remission after feature selection. Treatments included were citalopram, escitalopram, bupropion SR plus escitalopram, and venlafaxine plus mirtazapine. Differential treatment benefit was estimated in terms of improvement of population remission rates after application of the model for treatment selection using two approaches: (1) using predictions generated directly from the model (the predicted improvement approach) and (2) using bootstrapping for sample generation and then estimating population remission rate for patients who actually received the drug predicted by the model compared to the general population (the actual improvement approach). Our deep learning model predicted remission in a pooled CO-MED/STAR*D dataset (including four treatments) with an area under the curve of 0.69 using 17 input features. Our actual improvement analysis showed a statistically significant 2.48% absolute improvement (corresponding to a 7.2% relative improvement) in population remission rate (p = 0.01, CI 2.48% ± 0.5%). 
Our model serves as proof-of-concept that deep learning approaches, with further refinement and work to address concerns about differences between studies when multiple datasets are used for training, may have utility in differential prediction of antidepressant response when selecting from a number of treatment options.	4132/4800	Secondary Analysis	Shared
Randomized Trials with Repeatedly Measured Outcomes: Handling Irregular and Potentially Informative Assessment Times	Randomized trials are often designed to collect outcomes at fixed points in time after randomization. In practice, the number and timing of outcome assessments can vary among participants (i.e., irregular). In fact, the timing of assessments may be associated with the outcome of interest (i.e., informative). For example, in a trial evaluating the effectiveness of housing services for homeless people with mental illness, not only did the timings of outcome measurements vary among participants, but more days spent homeless were associated with less frequent observation. This type of informative observation requires appropriate statistical analysis. While analytic methods have been developed, they are rarely used. The purpose of this paper is to review the methods available with a view to developing recommendations for analyzing trials with irregular and potentially informative observation times. We show how the choice of analytic approach hinges on assumptions about the relationship between the observation and outcome processes. We argue that irregular observation should be treated with the same care as missing data, and propose that trialists: adopt strategies to minimize the extent of irregularity; describe the extent of irregularity in observation times; make their assumptions about the relationships between observation times and outcomes explicit; adopt analytic techniques that are appropriate to their assumptions; rigorously assess sensitivity of trial results to their assumptions.	4262/4262	Secondary Analysis	Shared
Variable Selection in Semiparametric Regression Models for Longitudinal Data with Informative Observation Times	A common issue in longitudinal studies is that subjects' visits are irregular and may depend on observed outcome values, a setting known as longitudinal data with informative observation times (follow-up). Semiparametric regression modelling for this type of data has received much attention as it provides more flexibility in studying the association between regression factors and a longitudinal outcome. An important problem here is how to select relevant variables and estimate their coefficients in semiparametric regression models when the number of covariates at baseline is large. The current penalization procedures in semiparametric regression models for longitudinal data do not account for informative observation times. We propose a variable selection procedure that is suitable for the estimation methods based on pseudo-score functions. We investigate the asymptotic properties of penalized estimators and conduct simulation studies to illustrate the theoretical results. We also use the procedure for variable selection in a semiparametric model for the STAR*D dataset from a multistage randomized clinical trial for treating major depressive disorder.	4134/4134	Secondary Analysis	Shared
Predictors of change in suicidal ideation across treatment phases of major depressive disorder: analysis of the STAR*D data	The effects of common antidepressants on suicidal ideation (SI) are unclear. In the landmark STAR*D trial antidepressants were effective for Major Depressive Disorder (MDD) in early treatment phases, but less effective in later phases. The effects of antidepressants on SI across the entire sample of the STAR*D trial have never been investigated. We performed a secondary analysis of the STAR*D data with the primary outcome of change in score on the suicide item (item three) of the Hamilton Rating Scale for Depression (HRSD17) across all four study levels. We used descriptive statistics and logistic regression analyses. Pearson correlation was used for change in SI versus change in depression (HRSD16). Reduction in mean (SD) SI was greater in levels one: 0.29 (±0.78) (p<0.001) and two: 0.26 (±0.88) (p<0.001) than in levels three: 0.16 (±0.92) (p=0.005) and four: 0.18 (±0.93) (p=0.094). A history of past suicide attempts (OR 1.72, p=0.007), comorbid medical illness (OR 2.23, p=0.005), and a family history of drug abuse (OR 1.69, p=0.008) were correlated with worsening of SI across level one. Treatment with bupropion (OR 0.24, p<0.001) or buspirone (OR 0.24, p=0.001) was correlated with lowering of SI across level two. Improvement in SI was correlated with improvement in overall depression (HRSD16) at level one: r(3756)=0.48; level two: r(1027)=0.38; level three: r(249)=0.31; and level four: r(75)=0.42 (p<0.001 for all levels). Improvement in SI is limited with pharmacotherapy in patients with treatment-resistant depression. Treatments with known anti-suicidal effects in MDD, such as ECT, should be considered in these patients.	4130/4130	Secondary Analysis	Shared
The bias of parameters in Inverse-Intensity Weighted GEEs when excluding subjects with no follow-up visits	Longitudinal data can be used to study disease progression and often feature irregular visit times. Traditional methods such as generalized estimating equations (GEEs) and mixed effect models lead to biased estimates when visit and outcome processes are related. Inverse-intensity weighted GEEs (IIW-GEEs) account for dependency between the visit and outcome processes. A common issue is that subjects with no visits are excluded from the dataset in practice. We aim to examine the bias of regression parameters in IIW-GEEs when subjects without a visit are excluded. We show analytically that there is bias when subjects with no visits are excluded, and verify this in a simulation study. Moreover, we show that decreasing visit frequency, decreasing maximum follow-up time, and increasing proportion of subjects with no visits lead to an increase in bias when omitting subjects with no visits. We recommend that all subjects be included in the dataset for analysis, regardless of whether they have a follow-up visit.	4130/4130	Secondary Analysis	Shared
Bayesian likelihood-based regression for estimation of optimal dynamic treatment regimes	Clinicians often make sequences of treatment decisions that can be framed as dynamic treatment regimes. In this paper, we propose a Bayesian likelihood-based dynamic treatment regime model that incorporates regression specifications to yield interpretable relationships between covariates and stage-wise outcomes. We define a set of probabilistically-coherent properties for dynamic treatment regime processes and present the theoretical advantages that are consequential to these properties. We justify the likelihood-based approach by showing that it guarantees these probabilistically-coherent properties, whereas existing methods lead to process spaces that typically violate these properties and lead to modelling assumptions that are infeasible. Through a numerical study, we show that our proposed method can achieve superior performance over existing state-of-the-art methods.	4120/4120	Secondary Analysis	Shared
Predicting Antidepressant Response with the STAR*D and CAN-BIND-1 Datasets	Collection for the paper published in PLOS One as "Replication of Machine Learning Methods to Predict Treatment Outcome with Antidepressant Medications in Patients with Major Depressive Disorder from STAR*D and CAN-BIND-1" Objectives: Antidepressants are first-line treatments for major depressive disorder (MDD), but 40-60% of patients will not respond, hence, predicting response would be a major clinical advance. Machine learning algorithms hold promise to predict treatment outcomes based on clinical symptoms and episode features. We sought to independently replicate recent machine learning methodology predicting antidepressant outcomes using the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) dataset, and then externally validate these methods to train models using data from the Canadian Biomarker Integration Network in Depression (CAN-BIND-1) dataset. Methods: We replicated methodology from Nie et al (2018) using common algorithms based on linear regressions and decision trees to predict treatment-resistant depression (TRD, defined as failing to respond to 2 or more antidepressants) in the STAR*D dataset. We then trained and externally validated models using the clinical features found in both datasets to predict response (≥50% reduction on the Quick Inventory for Depressive Symptomatology, Self-Rated [QIDS-SR]) and remission (endpoint QIDS-SR score ≤5) in the CAN-BIND-1 dataset. We evaluated additional models to investigate how different outcomes and features may affect prediction performance. Results: Our replicated models predicted TRD in the STAR*D dataset with slightly better balanced accuracy than Nie et al (70%-73% versus 64%-71%, respectively). Prediction performance on our external methodology validation on the CAN-BIND-1 dataset varied depending on outcome; performance was worse for response (best balanced accuracy 65%) compared to remission (77%). 
Using the smaller set of features found in both datasets generally improved prediction performance compared to using all the STAR*D features. Conclusion: We successfully replicated prior work predicting antidepressant treatment outcomes using machine learning methods and clinical data. We found similar prediction performance using these methods on an external database, although prediction of remission was better than prediction of response. Future work is needed to improve prediction performance to be clinically useful. November 30 2021 update - Some minor bugs were found in our processing code. Please see our GitHub for more details. We have uploaded a new copy of the processed STAR*D data after these bug fixes.	4045/4045	Secondary Analysis	Shared
Summary Measures for Quantifying the Extent of Visit Irregularity in Longitudinal Data: The STAR*D Study	This chapter applies the measures of irregularity from this thesis to the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study. The STAR*D study is the largest randomized clinical trial on patients suffering from major depression. This chapter focuses on the first phase of the study which pre-specified a common set of scheduled measurement occasions at weeks 2, 4, 6, 9, 12 post-baseline where individuals had their Quick Inventory of Depression Symptomatology (QIDS) questionnaire score recorded; however there were individuals who missed scheduled visits, and had unscheduled visits. Therefore, interest lies in determining whether visits can be treated as repeated measures. This is followed by a demonstration on how to select the appropriate modelling approach for the study outcome, and how to interpret the resulting parameter estimates. The target of inference of this chapter is to evaluate the mean QIDS score over the first 12 weeks of the trial.	4036/4036	Secondary Analysis	Shared
Cross-trial prediction of treatment outcome in depression: a machine learning approach	Background: Antidepressant treatment efficacy is low, but might be improved by matching patients to interventions. At present, clinicians have no empirically validated mechanisms to assess whether a patient with depression will respond to a specific antidepressant. We aimed to develop an algorithm to assess whether patients will achieve symptomatic remission from a 12-week course of citalopram. Methods: We used patient-reported data from patients with depression (n=4041, with 1949 completers) from level 1 of the Sequenced Treatment Alternatives to Relieve Depression (STAR*D; ClinicalTrials.gov, number NCT00021528) to identify variables that were most predictive of treatment outcome, and used these variables to train a machine-learning model to predict clinical remission. We externally validated the model in the escitalopram treatment group (n=151) of an independent clinical trial (Combining Medications to Enhance Depression Outcomes [COMED]; ClinicalTrials.gov, number NCT00590863). Findings: We identified 25 variables that were most predictive of treatment outcome from 164 patient-reportable variables, and used these to train the model. The model was internally cross-validated, and predicted outcomes in the STAR*D cohort with accuracy significantly above chance (64·6% [SD 3·2]; p<0·0001). The model was externally validated in the escitalopram treatment group (N=151) of COMED (accuracy 59·6%, p=0·043). The model also performed significantly above chance in a combined escitalopram-bupropion treatment group in COMED (n=134; accuracy 59·7%, p=0·023), but not in a combined venlafaxine-mirtazapine group (n=140; accuracy 51·4%, p=0·53), suggesting specificity of the model to underlying mechanisms. Interpretation: Building statistical models by mining existing clinical trial data can enable prospective identification of patients who are likely to respond to a specific antidepressant.	
1949/1949	Secondary Analysis	Shared
Construction of the Design Matrix for Generalized Linear Mixed-Effects Models in the Context of Clinical Trials of Treatment Sequences	The estimation of carry-over effects is a difficult problem in the design and analysis of clinical trials of treatment sequences including cross-over trials. Except for simple designs, carry-over effects are usually unidentifiable and therefore nonestimable. Solutions such as imposing parameter constraints are often unjustified and produce differing carry-over estimates depending on the constraint imposed. Generalized inverses or treatment-balancing often allow estimating main treatment effects, but the problem of estimating the carry-over contribution of a treatment sequence remains open in these approaches. Moreover, washout periods are not always feasible or ethical. A common feature of designs with unidentifiable parameters is that they do not have design matrices of full rank. Thus, we propose approaches to the construction of design matrices of full rank, without imposing artificial constraints on the carry-over effects. Our approaches are applicable within the framework of generalized linear mixed-effects models. We present a new model for the design and analysis of clinical trials of treatment sequences, called Antichronic System, and introduce some special sequences called Skip Sequences. We show that carry-over effects are identifiable only if appropriate Skip Sequences are used in the design and/or data analysis of the clinical trial. We explain how Skip Sequences can be implemented in practice, and present a method of computing the appropriate Skip Sequences. We show applications to the design of a cross-over study with 3 treatments and 3 periods, and to the data analysis of the STAR*D study of sequences of treatments for depression. See the paper, which is available on this website. Reference: Diaz, F.J. (2018). 
"Construction of the Design Matrix for Generalized Linear Mixed-Effects Models in the Context of Clinical Trials of Treatment Sequences". Revista Colombiana de Estadística (Colombian Journal of Statistics). Vol. 41, 191-233.	1440/1440	Secondary Analysis	Shared
Early Remission is Associated with Lower Risk of Relapse: Analysis of Major Depressive Disorder using STAR*D	OBJECTIVES: Major depressive disorder (MDD) contributes to a significant burden in the US, where it is the third leading cause of disability. For patients with MDD who benefit from anti-depressant therapies (ADTs), time to (and in) response or remission can vary greatly. Prior studies have indicated that those who experience response or remission earlier have better long-term MDD-related outcomes. This study sought to quantify the relationship between time to acute treatment-induced remission and the risk of relapse of MDD symptoms in the STAR*D trial (NCT00021528). METHODS: The STAR*D dataset was analyzed to assess whether early remitters (i.e., patients experiencing remission ≤28 days following step start) exhibited reduced risk of a subsequent relapse during a 12-month naturalistic follow-up compared to late remitters (>28 days). A self-reported Quick Inventory of Depressive Symptomatology (QIDS-SR16) score of ≤5 sustained until the end of any treatment step and a score of ≥11 during the 12-month follow-up defined remission and relapse, respectively. A hazard ratio quantifying the relationship between remission timing and risk of subsequent MDD relapse was estimated using Cox regression modeling, adjusted for patient’s age, treatment step, QIDS-SR16 score at step start, and additional forward-selected demographic factors. RESULTS: Among 1130 patients with MDD who achieved remission (n=231 early remitters; n=899 late remitters), a significantly greater proportion of late remitters (39.3%) relapsed during the 12-month follow-up phase compared to early remitters (24.7%, P<0.0001). Late remitters had a nearly 50% higher risk of relapse than early remitters during the 12-month follow-up phase (adjusted hazard ratio=1.48, P=0.01). CONCLUSIONS: Patients in STAR*D who remitted earlier showed significantly reduced risk of relapse compared to those remitting later. 
These findings highlight the importance of quickly inducing remission, both for the immediate relief of symptoms and the improvement of long-term outcomes.	1130/1130	Secondary Analysis	Shared
The clinical relevance of self-reported premenstrual worsening of depressive symptoms in the management of depressed outpatients: a STAR*D report.	OBJECTIVE: To determine the incidence, clinical and demographic correlates, and relationship to treatment outcome of self-reported premenstrual exacerbation of depressive symptoms in premenopausal women with major depressive disorder who are receiving antidepressant medication. METHOD: This post-hoc analysis used clinical trial data from treatment-seeking, premenopausal, adult female outpatients with major depression who were not using hormonal contraceptives. For this report, citalopram was used as the first treatment step. We also used data from the second step in which one of three new medications was used (bupropion-SR [sustained release], venlafaxine-XR [extended release], or sertraline). Treatment-blinded assessors obtained baseline treatment outcomes data. We hypothesized that those with reported premenstrual depressive symptom exacerbation would have more general medical conditions, longer index depressive episodes, lower response or remission rates, and shorter times-to-relapse with citalopram, and that they would have a better outcome with sertraline than with bupropion-SR. RESULTS: At baseline, 66% (n=545/821) of women reported premenstrual exacerbation. They had more general medical conditions, more anxious features, longer index episodes, and shorter times-to-relapse (41.3 to 47.1 weeks, respectively). Response and remission rates to citalopram, however, were unrelated to reported premenstrual exacerbation. Reported premenstrual exacerbation was also unrelated to differential benefit with sertraline and bupropion-SR. CONCLUSIONS: Self-reported premenstrual exacerbation has moderate clinical utility in the management of depressed patients, although it is not predictive of overall treatment response. 
Factors that contribute to a more chronic or relapsing course may also play a role in premenstrual worsening of major depressive disorder (MDD).	1017/1017	Secondary Analysis	Shared
Measuring the individual benefit of a medical or behavioral treatment using generalized linear mixed-effects models	Statistical measures of the individual benefit of a medical or behavioral treatment and of the severity of a chronic illness are proposed and used to develop a graphical method that statisticians and clinicians can apply in the data analysis of clinical trials from the perspective of personalized medicine. The method focuses on assessing and comparing individual effects of treatments rather than average effects and can be used with continuous and discrete responses under the generalized linear mixed-effects models framework. Analyses of data from the Sequenced Treatment Alternatives to Relieve Depression clinical trial of sequences of treatments for depression and data from a clinical trial of respiratory treatments are used for illustration.	170/170	Secondary Analysis	Shared

* Data not on individual level

Collection - Associated Studies

Clicking on the Study Title will open the study details in a new internet browser tab. The Abstract is available for viewing, providing the background explanation of the study, as provided by the Collection Owner.

Primary v. Secondary Analysis: The Data Usage column shows one of these two values. Primary Analysis indicates that at least some, and potentially all, of the data used was originally collected by the creator of the NDA Study. Secondary Analysis indicates that the Study owner was not involved in collecting the data, which may have been used as supporting data.

Private v. Shared State: A private study is available only to users who are able to access the collection. A shared study is accessible to the general public.

Frequently Asked Questions

How do I associate a study to my collection?

Studies are associated to the Collection automatically when the data is defined in the Study.

Glossary

Associated Studies Tab

A tab in a Collection that lists the NDA Studies that have been created using data from that Collection, including both Primary and Secondary Analysis NDA Studies.
