

Background: The typical approach to identifying blood-derived gene expression signatures as biomarkers for Alzheimer's disease (AD) has relied on training classification models using AD subjects and healthy controls only. This may inadvertently identify markers of general illness rather than markers specific to AD. Objective: To investigate whether incorporating additional related disorders into the classification model development process can lead to the discovery of an AD-specific gene expression signature. Methods: Two types of XGBoost classification models were developed. The first used 160 AD subjects and 127 healthy controls; the second used the same 160 AD subjects with 6,318 upsampled mixed controls consisting of Parkinson's disease, multiple sclerosis, amyotrophic lateral sclerosis, bipolar disorder, schizophrenia, coronary artery disease, rheumatoid arthritis, chronic obstructive pulmonary disease, and cognitively healthy subjects. Both types of classification model were evaluated in an independent cohort consisting of 127 AD subjects and 687 mixed controls. Results: The AD versus healthy control models achieved an average of 48.7% sensitivity (95% CI = 34.7-64.6), 41.9% specificity (95% CI = 26.8-54.3), 13.6% PPV (95% CI = 9.9-18.5), and 81.1% NPV (95% CI = 73.3-87.7). In contrast, the mixed control models achieved an average of 40.8% sensitivity (95% CI = 27.5-52.0), 95.3% specificity (95% CI = 93.3-97.1), 61.4% PPV (95% CI = 53.8-69.6), and 89.7% NPV (95% CI = 87.8-91.4). Conclusions: This early work demonstrates the value of incorporating additional related disorders into the classification model development process, which can yield models with an improved ability to distinguish AD from a heterogeneous aging population. However, the sensitivity of the test still requires further improvement.
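The relationship between the reported specificity and PPV can be illustrated with a short calculation. The following Python sketch is not the authors' evaluation code: it assumes that the reported average sensitivity and specificity can be applied directly to the independent cohort of 127 AD subjects and 687 mixed controls to obtain expected confusion-matrix counts. The function name `predictive_values` is hypothetical. Because the published figures are averages over several models, the values it produces are only expected to approximate the reported PPV and NPV.

```python
def predictive_values(sensitivity, specificity, n_cases, n_controls):
    """Return (PPV, NPV) from expected confusion-matrix counts.

    Assumption: sensitivity and specificity are applied directly to the
    cohort sizes, so the counts are expected values, not observed ones.
    """
    tp = sensitivity * n_cases        # AD subjects correctly flagged
    fn = n_cases - tp                 # AD subjects missed
    tn = specificity * n_controls     # controls correctly cleared
    fp = n_controls - tn              # controls wrongly flagged as AD
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Independent evaluation cohort sizes stated in the abstract.
N_AD, N_CONTROLS = 127, 687

# Average sensitivity and specificity reported for each model type.
healthy_ppv, healthy_npv = predictive_values(0.487, 0.419, N_AD, N_CONTROLS)
mixed_ppv, mixed_npv = predictive_values(0.408, 0.953, N_AD, N_CONTROLS)

print(f"AD vs healthy controls: PPV={healthy_ppv:.3f}, NPV={healthy_npv:.3f}")
print(f"AD vs mixed controls:   PPV={mixed_ppv:.3f}, NPV={mixed_npv:.3f}")
```

Under these assumptions, the sensitivity of the two model types differs only modestly, so the large difference in PPV comes almost entirely from the number of false positives among the 687 controls, which is what the higher specificity of the mixed control models reduces.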

Original language: English
Pages (from-to): 545-561
Number of pages: 17
Journal: Journal of Alzheimer's Disease
Issue number: 2
Early online date: 24 Mar 2020
Publication status: Published - 2020


  • Age-related memory disorders
  • Alzheimer's disease
  • biomarkers
  • dementia
  • gene expression
  • human
  • machine learning
  • microarray analysis
  • neurodegenerative disorders


Title: Working Towards a Blood-Derived Gene Expression Biomarker Specific for Alzheimer's Disease
