msqrob2SCP: a flexible workflow for addressing the hierarchical correlation in SCP data
Author(s): Christophe Vanderaa, Stijn Vandenbulcke, Lieven Clement
Affiliation(s): UGent
The field of mass spectrometry (MS)-based single-cell proteomics (SCP) is gaining momentum, with recent studies quantifying thousands of proteins across hundreds to thousands of cells. However, the presence of hierarchical correlations (HC) in real-life SCP experiments poses a significant challenge to reliable data analysis and biomarker identification. This challenge arises because multiple cells are typically acquired from the same sample, and multiple samples are needed to extract reproducible protein markers. Consequently, protein abundances in SCP are bound to be correlated: cells from the same sample are expected to be more alike than cells from different samples. Moreover, the technology implies additional levels of correlation, as quantification does not happen at the protein level but at the level of its constituent peptides. Again, peptide abundances from the same sample are more similar than those of peptides from different samples. Finally, labeling technologies that multiplex several single cells in each MS run induce run-level correlation. To address these challenges, we present msqrob2SCP, a flexible workflow designed to facilitate accurate differential protein abundance analysis while accommodating hierarchical correlations. This workflow integrates the `scp` Bioconductor package for SCP data manipulation and processing with the `msqrob2` Bioconductor package for data modeling using mixed models. By combining these tools, msqrob2SCP supports robust data analysis, particularly as the SCP field transitions toward applied experiments.
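As an illustration of how a mixed model can accommodate the correlation levels described above, the following is a minimal sketch and not necessarily the model that msqrob2SCP fits. All symbols are assumptions introduced here for illustration: $y_{pc}$ is the log-intensity of peptide $p$ of a given protein in cell $c$, $s(c)$ and $r(c)$ are the sample and the MS run of cell $c$, and $x_{s(c)}$ is a sample-level condition indicator.

```latex
% Hypothetical peptide-level mixed model for one protein (illustrative only)
\begin{align}
  y_{pc} &= \beta_0 + \beta_1 x_{s(c)} + \alpha_p
            + u_{s(c)} + v_{r(c)} + w_c + \varepsilon_{pc}, \\
  u_s &\sim N(0, \sigma_u^2), \quad
  v_r \sim N(0, \sigma_v^2), \quad
  w_c \sim N(0, \sigma_w^2), \quad
  \varepsilon_{pc} \sim N(0, \sigma^2),
\end{align}
% alpha_p: fixed peptide effect; u, v, w: mutually independent random effects
% for sample, run, and cell; beta_1: the differential abundance effect.
```

Under these assumptions, two measurements of the same peptide in different cells that share a sample but not a run have covariance $\sigma_u^2$, cells that share both a sample and a run have covariance $\sigma_u^2 + \sigma_v^2$, and two peptides measured in the same cell have covariance $\sigma_u^2 + \sigma_v^2 + \sigma_w^2$. Ignoring these terms would treat correlated observations as independent and would tend to understate the uncertainty of $\beta_1$.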