Title: hdWGCNA and Cellular Communication Identify Active NK Cell
Subtypes in Alzheimer's Disease and Screen for Diagnostic Markers
through Machine Learning
Volume: 21
Issue: 2
Author(s): Guobin Song, Haoyang Wu, Haiqing Chen, Shengke Zhang, Qingwen Hu, Haotian Lai, Claire Fuller, Guanhu Yang*, Hao Chi*
Affiliation:
- Department of Specialty Medicine, Ohio University, Athens, OH, United States
- Clinical Medical College, Southwest Medical University, Luzhou, China
Keywords:
Alzheimer’s disease, NK cell, machine learning, diagnostic signature, hdWGCNA, cellchat, single-cell RNA-seq, immune cell subtype distribution pattern.
Abstract:
Background: Alzheimer's disease (AD) is a complex and severe neurodegenerative
disorder that poses a significant challenge to global health. Its hallmark pathological
features include the deposition of β-amyloid plaques and the formation of neurofibrillary tangles.
An early and accurate biomarker model for AD diagnosis, built with machine learning and
bioinformatics analysis, is therefore needed.
Methods: Single-cell data analysis was used to identify cellular subtypes that differed
significantly between the diseased and control groups. After NK cells were identified,
hdWGCNA and cellular communication analysis were conducted to pinpoint
the NK cell subset with the most robust communication effects. Three machine
learning algorithms (LASSO, Random Forest, and SVM-RFE) were then used to jointly screen for
modular genes of this NK cell subset that were highly associated with AD. A logistic regression diagnostic model
was built on these characteristic genes, and a protein-protein interaction
(PPI) network of the model genes was established. Unsupervised cluster analysis was then
conducted to classify AD subtypes based on the model genes, followed by analysis of immune
infiltration in the different subtypes. Finally, Spearman correlation analysis was used
to explore the correlation between the model genes and immune cells, as well as inflammatory
factors.
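The Spearman correlation step named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names `average_ranks` and `spearman` and the toy values in the usage lines are assumptions made purely for illustration, and a real analysis would more likely rely on an established statistics package. The sketch ranks both variables (tied values share the mean of their positions) and then computes the Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed inputs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, NOT data from the study: per-sample expression of
# one model gene and an immune-cell infiltration score for the same samples.
gene_expression = [2.1, 3.4, 1.8, 4.0, 2.9]
immune_score = [0.30, 0.55, 0.20, 0.70, 0.35]
rho = spearman(gene_expression, immune_score)
```

Because the coefficient depends only on rank order, it captures monotonic associations between gene expression and an immune score without assuming that the relationship is linear.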
Results: We identified three genes (RPLP2, RPSA, and RPL18A) that are
highly associated with AD. The nomogram based on these genes provides practical assistance in diagnosing
patients and predicting their outcomes. The interacting genes screened through the PPI network are closely
linked to ribosome metabolism and the COVID-19 pathway. Based on the expression of
modular genes, unsupervised cluster analysis revealed three distinct AD subtypes. Subtype C3 is particularly
noteworthy: it is characterized by high expression of the model genes, which correlates with immune cell infiltration
and elevated levels of inflammatory factors. This suggests that the establishment
of an immune environment in AD patients is closely tied to heightened expression
of the model genes.
Conclusion: This study not only established a diagnostic model for AD patients but
also examined the role of the model genes in shaping the immune environment of
individuals with AD. These findings offer insights into early AD diagnosis and patient management
strategies.