Large-scale CSF and plasma proteomics reveal dysregulation of immune system, synaptic impairment, and extracellular matrix related pathways in neurodegeneration

Introduction

This repository contains the code for bioinformatics analyses described in the article "Large-scale CSF and plasma proteomics reveal dysregulation of immune system, synaptic impairment, and extracellular matrix related pathways in neurodegeneration".

This project investigated CSF and plasma proteomics data from the SomaScan assay to identify proteins associated with Alzheimer disease (AD), Parkinson's disease (PD), dementia with Lewy bodies (DLB), and frontotemporal dementia (FTD). The identified proteins were used to characterize disease-specific and shared molecular signatures and to build disease-specific prediction models. Biological pathway and cell type enrichment analyses were performed to understand the underlying biology.

Content

The code covers the following main analysis steps:

Data pre-processing: Proteomics data preparation and principal component analysis
Differential expression analysis
Prediction model development using LASSO regression
Pathway and cell type enrichment analyses
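
For orientation, the first two steps can be illustrated on simulated data using base R only. This is a generic sketch, not the code in this repository: the simulated matrix, the covariate (age), the per-protein linear model, and the Benjamini-Hochberg correction are all assumptions chosen for illustration, and the actual scripts may use different models and covariates.

```r
# Toy illustration on simulated data; not the repository's actual pipeline.
set.seed(1)
n_samples <- 60
n_proteins <- 20

# Simulated log-scale protein abundances (rows = samples, columns = proteins).
abundance <- matrix(
  rnorm(n_samples * n_proteins),
  nrow = n_samples,
  dimnames = list(NULL, paste0("protein_", seq_len(n_proteins)))
)
status <- factor(rep(c("control", "case"), each = n_samples / 2),
                 levels = c("control", "case"))
age <- rnorm(n_samples, mean = 70, sd = 8)  # hypothetical covariate

# Spike a case-control difference into the first protein.
abundance[status == "case", 1] <- abundance[status == "case", 1] + 2

# Principal component analysis on centred, scaled abundances.
pca <- prcomp(abundance, center = TRUE, scale. = TRUE)
top_pcs <- pca$x[, 1:2]

# One linear model per protein: abundance ~ status + age.
fit_one <- function(y) {
  coefs <- summary(lm(y ~ status + age))$coefficients
  coefs["statuscase", c("Estimate", "Pr(>|t|)")]
}
results <- as.data.frame(t(apply(abundance, 2, fit_one)))
names(results) <- c("estimate", "p_value")

# Benjamini-Hochberg adjustment across proteins.
results$fdr <- p.adjust(results$p_value, method = "BH")

head(results[order(results$p_value), ])
```

In this toy example the spiked protein should rank first; with real data, the model specification should follow the article's methods.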

Data

Proteomics data analysed in this study are available from:

ADNI: http://adni.loni.usc.edu/
Knight-ADRC (CSF): https://dss.niagads.org/ (Accession: ng00130)
Knight-ADRC (Plasma): https://live-knightadrc-washu.pantheonsite.io/professionals-clinicians/request-center-resources/
FACE and Barcelona-1 cohorts: http://www.fundacioace.com/
PPMI: https://www.ppmi-info.org/
Stanford-ADRC: https://live-knightadrc-washu.pantheonsite.io/professionals-clinicians/request-center-resources/
GNPC: Members of the global research community can place a data use request via the AD Discovery Portal (https://discover.alzheimersdata.org/). Access is contingent upon adherence to the GNPC Data Use Agreement and the Publication Policies.

Requirements

The code was written in R (version 4.3.0) and relies on multiple R and Bioconductor packages, including:

caret
glmnet
nlme
pROC
ROCR
dplyr
clusterProfiler
ReactomePA
ggplot2
EnhancedVolcano
Additional packages listed at the beginning of each R script
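
Before running the scripts, it may help to check which of the packages named above are already installed. The snippet below is a minimal sketch covering only the packages listed in this README; it does not detect the additional packages loaded by individual scripts.

```r
# Packages named in this README; individual scripts may load more.
required_pkgs <- c("caret", "glmnet", "nlme", "pROC", "ROCR", "dplyr",
                   "clusterProfiler", "ReactomePA", "ggplot2",
                   "EnhancedVolcano")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

Missing CRAN packages can be installed with install.packages(), and missing Bioconductor packages (clusterProfiler, ReactomePA, EnhancedVolcano) with BiocManager::install().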

License

The code is available under the MIT License.

Instructions

The code was tested on R 4.3.0 on Linux operating systems, but should be compatible with later versions of R installed on current Linux, Mac, or Windows systems.

To run the code, set the working directory at the beginning of each R script to the folder containing the input data. Apart from that, the scripts can be run as-is.
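
Setting the working directory might look like the sketch below. The path is a hypothetical placeholder, and the existence check is an added convenience rather than something the scripts themselves are known to do.

```r
# Hypothetical path; replace with the folder that holds your input data.
data_dir <- "path/to/input_data"

if (dir.exists(data_dir)) {
  setwd(data_dir)
} else {
  message("Directory not found: ", data_dir,
          ". Edit data_dir before running the scripts.")
}
```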

The scripts should be run in the following order:

data_preparation.R

differential_expression_analysis.R

prediction_models.R

pathway_and_celltype_enrichment_analysis.R
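
The order above can also be run from a single R session, as in the sketch below. It assumes the four scripts sit in the current working directory and that sourcing them one after another in the same session is acceptable; neither is confirmed by the repository. Absolute paths are built up front because each script sets its own working directory.

```r
# Scripts in the order given in this README.
scripts <- c(
  "data_preparation.R",
  "differential_expression_analysis.R",
  "prediction_models.R",
  "pathway_and_celltype_enrichment_analysis.R"
)

# Assumption: the scripts are located in the current working directory.
script_dir <- getwd()

for (s in scripts) {
  script_path <- file.path(script_dir, s)
  if (!file.exists(script_path)) {
    message("Skipping (not found): ", script_path)
    next
  }
  message("Running ", s)
  source(script_path)
}

# Restore the starting directory, since the scripts may change it.
setwd(script_dir)
```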
