Poster

Zero-configuration genomic data science with bioc2u

Author(s): Vincent James Carey, Alexandru Mahmoud

Affiliation(s): Channing Division of Network Medicine, Harvard Medical School

R/Bioconductor users on Linux platforms must often handle complex configuration tasks for packages that depend on runtime libraries completely external to R. Even when packages with compiled code are self-contained, source compilation can be time-consuming. The r2u system (https://github.com/eddelbuettel/r2u) produces Debian-style packages for all packages on CRAN. Installation of self-sufficient binaries is handled for Ubuntu 20.

Unsupervised learning techniques detect clinically relevant structure in human gut microbiota

Author(s): Himmi Lindgren, Leo M Lahti, Aki Havulinna, Teemu Niiranen, Rob Knight, Guillaume Meric

Affiliation(s): Department of Computing, University of Turku, Turku, Finland

Unsupervised learning techniques can detect clinically relevant structure in population cohort data on the human gut microbiota. While gut microbiota composition is influenced by individual factors such as diet, medication, and development of the immune system during early childhood, it is proposed that individuals maintain a relatively stable microbiota ecosystem throughout adulthood.

Trajectory-based differential expression analysis for single cell proteomics data with msqrob2

Author(s): Stijn Vandenbulcke, Christophe Vanderaa, Lieven Clement

Affiliation(s): UGent

Many biological processes are dynamic, e.g. cell differentiation, tissue development, and responses to external stimuli. Traditionally they were studied with time-course experiments. With the advent of single cell technologies, however, they can also be unraveled by taking a snapshot of the transcriptome or proteome of hundreds to millions of single cells in a cell population, which are each at distinct points in a dynamic process.

The analysis of tumour microenvironment cell composition from a single-cell perspective.

Author(s): Laura Masatti, Stefania Pirrotta, Nicolò Gnoato, Matteo Marchetti, Robert Fruscio, Lorenzo Ceppi, Laura Mannarino, Chiara Romualdi, Maurizio D'Incalci, Sergio Marchini, Roberto Tozzi, Enrica Calura

Affiliation(s): Department of Biology, University of Padova, Italy

Ovarian cancer (OC) is a common form of gynecologic cancer and a major concern in women's health. Despite ongoing efforts, it remains a significant challenge to diagnose and treat effectively. Its most aggressive subtype, high-grade serous ovarian cancer (HGSOC), is one of the leading causes of mortality among women, owing to its invasiveness and tendency to metastasize.

Streamlining LC-MS/MS Data Analysis in R with Open-Source *xcms* and *RforMassSpectrometry*: An End-to-End Workflow

Author(s): Philippine Louail

Affiliation(s): Eurac Research, Biomedicine Institute

Although untargeted LC-MS/MS is a powerful approach for large-scale metabolomics analysis, the reproducible and efficient analysis of such data remains a significant challenge in the field. The power of R-based analysis workflows lies in their high customizability and adaptability to specific instrumental and experimental setups. However, while various specialized packages exist for individual analysis steps, their seamless integration and application to large cohort datasets remain elusive.

scpGUI and QFeaturesGUI: Graphical Interfaces for Single-Cell and Bulk Proteomics

Author(s): Léopold Guyot, Christophe Vanderaa, Laurent Gatto

Affiliation(s): Computational Biology and Bioinformatics, de Duve Institute, Belgium

In recent years, significant advancements have been made in the field of proteomics data analysis. However, the complexity of workflows involving programming languages such as R and Python can pose challenges for practitioners without a coding background. To address this issue, we introduce two user-friendly packages: *scpGUI* and *QFeaturesGUI*.

SAMURAI: Shallow Analysis of copy nuMber alterations Using a Reproducible And Integrated bioinformatics pipeline

Author(s): Sara Potente, Sergio Marchini, Dino Paladin, Diego Boscarino, Luca Beltrame, Chiara Romualdi

Affiliation(s): University of Padua

Introduction: Shallow Whole Genome Sequencing (sWGS) has become a cost-effective method for genomic analysis, particularly in identifying copy number alterations (CNAs). However, the lack of standardized pipelines for sWGS data analysis presents a significant challenge to the robustness and reproducibility of results. To address this gap, we have developed SAMURAI (Shallow Analysis of copy nuMber alterations Using a Reproducible And Integrated bioinformatics pipeline).

PASTA: Pattern Analysis for Spatial Omics Data

Author(s): Martin Emons, Samuel Gunz, Helena Lucia Crowell, Mark Robinson

Affiliation(s): University of Zurich

Most spatial omics approaches can be classified as either high-throughput sequencing (HTS)-based or imaging-based. In HTS-based approaches, positional information is recorded according to the predetermined array of spots measured. Imaging-based approaches, however, either target the molecules of interest with hybridising fluorescent probes, ablate regions stained with a cocktail of antibodies for readout via metal tags, or amplify and sequence target sequences in situ.

Pancancer network analysis reveals key master regulators for cancer invasiveness

Author(s): Mahesh Jethalia, Siddhi P. Jani, Michele Ceccarelli, Raghvendra Mall

Affiliation(s): ISGLOBAL Barcelona

Background: Tumor invasiveness reflects numerous biological changes, including tumorigenesis, progression, and metastasis. To decipher the role of transcriptional regulators (TR) involved in tumor invasiveness, we performed a systematic network-based pan-cancer assessment of master regulators of cancer invasiveness.

Materials and methods: We stratified patients in The Cancer Genome Atlas (TCGA) into high-invasiveness (INV-H) and low-invasiveness (INV-L) groups using consensus clustering based on an established robust 24-gene signature to determine the prognostic association of invasiveness with overall survival (OS) across 32 different cancers.

Optimizing Machine Learning Models for Enhanced Prediction of Cardiometabolic Diseases from Multiomics Data

Author(s): Eliana Ibrahimi, Nicholas Cauwenberghs, Tatiana Kouznetsova

Affiliation(s): Department of Biology, Faculty of Natural Sciences, University of Tirana

This study aims to optimize machine learning models for enhanced prediction of cardiometabolic diseases from multiomics data for an accurate and personalized risk assessment. By identifying and prioritizing key features and biomarkers from multiomics and clinical data, the study seeks to create models that offer an improved understanding of individual patient profiles, incorporating factors such as gut microbiome, metabolome, proteome, lifestyle, and clinical history.

NetworkHub: a one-stop-shop to retrieve and use protein-protein interaction network data in Bioconductor

Author(s): Lotta Wagner, Federico Marini

Affiliation(s): Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany

Proteins are crucial for regulating and maintaining cellular functions, often acting in a concerted manner in physiological and pathological contexts. Cell interactomes can be influenced by timing, spatial relationships between interaction partners, and various external factors, and have been the object of many initiatives (including IntAct, STRING, BioGRID and many more) aiming to detect, collect and curate large sets of protein-protein interactions (PPI).

mosdef: a collection of MOSt frequently used and useful Differential Expression Functions

Author(s): Leon Dammer, Federico Marini

Affiliation(s): Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany

The mosdef R package offers a comprehensive toolkit for simplifying and streamlining some common operations in the workflow of differential expression analysis. By providing a unified interface for executing various enrichment analysis tools, mosdef allows researchers to effortlessly run such functions (e.

iSEEfier: Starting to use iSEE became even easier

Author(s): Najla Abassi, Federico Marini

Affiliation(s): Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz, Germany

Effectively exploring and visualizing omics data is a paramount step for uncovering new biological insights. A variety of software solutions have been developed to this end, among them iSEE, a tool that can be used for various purposes at any stage of the data analysis pipeline.

Inferring residue resolved hydrogen deuterium exchange using Rex

Author(s): Oliver Crook

Affiliation(s): University of Oxford

Hydrogen-Deuterium Exchange Mass-Spectrometry (HDX-MS) has emerged as a powerful technique to explore the conformational dynamics of proteins and protein complexes in solution. The bottom-up approach to MS uses peptides to represent an average of residues, which reduces the resolution of deuterium exchange and complicates the interpretation of the data. Here, we introduce ReX, a method to infer residue-level uptake patterns leveraging the overlap in peptides, the temporal component of the data and the correlation along the sequence dimension.

gINTomics visualizer, a powerful shiny app for multiomics data integration visualization.

Author(s): Angelo Velle, Francesco Patanè, Stefania Pirrotta, Chiara Romualdi

Affiliation(s): University of Padova

Large datasets containing different omics are increasingly available in public databases. However, capturing all the information contained in these data is a major challenge. To address this need, we developed gINTomics, an easy-to-use R package for omics data integration. Because interpreting the integration results is a fundamental step of the analysis, we included in gINTomics a powerful shiny app for an easy and visually appealing interpretation of the statistical models.

GeDi - improving gene set distances accounting for network-based information

Author(s): Annekathrin Silvia Ludt, Federico Marini

Affiliation(s): Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany

Functional enrichment analysis, performed either via scripted analysis or with web-based tools, is one of the most frequently adopted steps in computational biology, especially when aiming to identify the systems-level mechanisms captured by high-dimensional molecular datasets.

From Cancer-Testis genes to Cancer-Testis enhancers

Author(s): Julie Devis, Axelle Loriot, Charles De Smet, Laurent Gatto

Affiliation(s): UCLouvain

Cancer-Testis (CT) genes are normally expressed only in germ cells and not in healthy somatic tissues. However, they are aberrantly activated in many tumours. Many CT genes are regulated by methylation. Their promoters are highly methylated in all healthy somatic tissues and demethylated in germ cells. They are also demethylated in tumours in which they are activated.

Facilitating multi-omic analyses in microbiome research with MultiAssayExperiment

Author(s): Tuomas Borman, Artur Sannikov, Kati Hanhineva, Leo M Lahti

Affiliation(s): University of Turku, Finland

The demand for data analytical strategies for multi-omic integration is steadily increasing in various application domains. The Bioconductor MultiAssayExperiment data container provides optimized and extensively tested tools to deal with heterogeneous multi-domain data. The availability of methods extending this container has been steadily increasing. Working with different combinations of omics data often requires additional customization, however, and developing standardized general-purpose methods remains challenging.

Enhancing Robustness in Differential Abundance Testing for Microbiome Data Analysis through Consensus-Based Approach

Author(s): Francesc Català Moll, Marc Noguera-Julian, Roger Paredes

Affiliation(s): IrsiCaixa AIDS Research Institute, Hospital Universitari Germans Trias i Pujol, Campus Can Ruti, Badalona, Spain

Introduction: The task of Differential Abundance (DA) testing in microbiome data poses significant challenges for both parametric and non-parametric statistical methods due to the data’s sparsity, high variability, and compositional nature. Microbiome-specific statistical methods often resort to classical distribution models or consider compositional specifics.

Enhancing Plasmodium Research through PlasmoRUtils: A one-stop R Package for Apicomplexan Biology

Author(s): Rohit Satyam

Affiliation(s): King Abdullah University of Science & Technology, Saudi Arabia

The major bottleneck in understanding Plasmodium biology is the relative lack of omics-driven experimental data and the presence of attention biases. Compounding this issue, half of the Plasmodium proteome consists of proteins whose functions remain unknown. This is also true for other apicomplexan parasites.

Differential Correlation Analysis and Biological Function Inference on Single Cell Proteomics

Author(s): Enes Sefa Ayar, Laurent Gatto

Affiliation(s): Computational Biology and Bioinformatics Unit, de Duve Institute, Université Catholique de Louvain, Brussels, Belgium

Proteins are the key molecules in executing biological functions within cells. They operate in cooperation with other proteins to carry out these functions as part of protein complexes or biological pathways. Thus, the correlation among these proteins implies a functional interdependence, offering insights into both biological functions and mechanisms.

Diagnostic and filtering of genomic DNA contamination in RNA-seq data with gDNAx

Author(s): Beatriz Calvo-Serra, Robert Castelo

Affiliation(s): Universitat Pompeu Fabra

Total RNA sequencing (RNA-seq) is the most unbiased approach to characterize the whole transcriptome, and often the only available choice with degraded samples of clinical or biological interest. Unfortunately, it is also prone to genomic DNA (gDNA) contamination due to the fluctuating efficiency of the gDNA digestion step (i.e., DNase treatment), or the complete lack thereof, especially with low-input samples.

Deep Immune Phenotyping of Heart Failure to identify molecular signatures of Metabolically Induced dysregulations in Peripheral Blood Mononuclear Cells

Author(s): Maximilian Nuber, Ekaterina Esenkova, Katrin Bauer, Siva Karunanithi, Pablo Llavona Juez, Vincent ten Cate, Elisa Araldi, Philipp Wild

Affiliation(s): University Medical Center Mainz

Heart failure (HF) is a major global health issue with an increasingly recognized role of metabolic dysregulation and systemic inflammation in its pathogenesis. The present study aims to elucidate the perturbations in molecular pathways induced by metabolic dysregulation in HF.

Deciphering tumour cell interactions and communications from the gene expression profiles of single cells RNA sequencing data

Author(s): Nicolò Gnoato, Laura Masatti, Stefania Pirrotta, Paolo Martini, Chiara Romualdi, Enrica Calura

Affiliation(s): University of Padova

Cancer is a complex pathological condition that originates from the accumulation of genetic mutations, which can manifest as both point mutations in single nucleotides and structural modifications of the genome, such as copy number variations (CNVs). Several lines of evidence show that copy number variations in specific genes disrupt both the expression levels of those genes and normal cellular physiological mechanisms, triggering uncontrolled growth and division.

Deciphering Organ-Specific Lipid Signatures in Zebrafish with a Spatial Lipidomics Pipeline

Author(s): Prateek Arora, Nick Kirschke, Simon Isofort, Mojgan Masoodi, Nadia Mercader

Affiliation(s): Institute of Anatomy, University of Bern, Bern, Switzerland

Spatial lipidomics analysis offers a powerful approach to understanding the lipid composition and distribution within biological tissues. In this study, we present a comprehensive pipeline utilizing Desorption Electrospray Ionization (DESI) mass spectrometry imaging to analyze the lipidomic profiles of zebrafish organs. Our methodology involves seamlessly integrating various R packages, including clustering from Cardinal, to delineate distinct organ regions and identify organ-specific lipid signals.

Comprehensive and standardised workflow for single-cell proteomics data analysis using scp and scplainer.

Author(s): Samuel Grégoire, Christophe Vanderaa, Laurent Gatto

Affiliation(s): UCLouvain

Single cell proteomics (SCP) via mass spectrometry has become achievable thanks to technological advances made by various research teams, resulting in a broad landscape of cutting-edge methodologies [1]. While this progress has enabled the measurement of thousands of proteins at single-cell resolution, it has also resulted in various complex and divergent analysis workflows.

CENTRE: A Bioconductor package for cell type specific enhancer-promoter prediction

Author(s): Sara Lopez Ruiz de Vargas, Trisevgeni Rapakoulia, Persia Akbari-Omgba, Verena Laupert, Martin Vingron

Affiliation(s): Max Planck Institute For Molecular Genetics

Identifying active enhancer-promoter pairs is a crucial step to understand gene regulation, phenotypes and diseases. To date, several computational methods have been developed to predict enhancer-gene interactions, but they require many epigenomic and transcriptomic experimental assays to generate cell-type specific predictions. Thus, inferring enhancer-gene interactions becomes a laborious and costly task, especially when looking for cell type (CT) specific contacts.

Bulk vs single-cell proteomics: is there a need for identification optimization?

Author(s): Guillaume Deflandre, Samuel Grégoire, Laurent Gatto

Affiliation(s): UCLouvain

Single-cell proteomics (SCP) has emerged as a powerful tool for elucidating cellular heterogeneity, offering opportunities beyond traditional bulk sample analysis. However, applying current peptide identification approaches designed for bulk samples may lead to false discoveries in SCP. Challenges such as reduced peak counts, lower peak intensities, and degraded signal-to-noise ratios (as identified by Boekweg et al.

Big-data scheme for inquiring the shared mobilome in livestock and human gut microbiomes

Author(s): Shivang Bhanushali, Tuomas Borman, Katariina Pärnänen, Leo M Lahti

Affiliation(s): Department of Computing, University of Turku

Background: Livestock farms serve as focal points for the emergence and dissemination of Antibiotic Resistance Genes (ARGs) due to limited space relative to the volume of livestock and extensive antibiotic use. Nevertheless, only a handful of studies have thoroughly evaluated the spread of antibiotic resistance originating from food systems, especially poultry farms, which promote zoonosis.

Benchmark of single-cell batch correction methods available in the R and Python ecosystems.

Author(s): Elena Zuin, Chiara Romualdi, Davide Risso, Gabriele Sales

Affiliation(s): Department of Biology, University of Padova, Italy

Single-cell datasets often include samples collected from multiple laboratories and conditions, leading to complex batch effects. This unwanted technical variation overlaps with biological effects of interest and confuses downstream analyses. A key challenge in the study of single-cell data is to correctly align various datasets while preserving biological variations.

atena: an R/Bioconductor package for the analysis of transposable elements

Author(s): Beatriz Calvo-Serra, Robert Castelo

Affiliation(s): Universitat Pompeu Fabra

The quantification of RNA expression of transposable elements (TEs) requires specialized software and annotations outside the standardised pipelines and data sources employed in the analysis of RNA sequencing (RNA-seq) data. This often puts a burden on the users of such software, who first need to pull and combine input annotations from heterogeneous sources and formats and, second, parse the output quantifications before they can be fed into the next tool for a downstream analysis, such as differential expression analysis.

Analysis of differential genomic interactions between experimental conditions in capture Hi-C data: Sharing data between neighbouring restriction fragments

Author(s): Marco Geigges, Charlotte Soneson, Filippo M Rijli, Michael B Stadler

Affiliation(s): Friedrich Miescher Institute for Biomedical Research and SIB Swiss Institute of Bioinformatics, Basel, Switzerland

Capture Hi-C (CHi-C) is a sequencing-based method to study three-dimensional chromosomal interactions of pre-selected genomic regions like promoters with other genomic regions. For example, it is widely used to identify interactions of promoters with regulatory elements.

Analysis and visualization of ChIP-seq and ATAC-seq data using epiwraps

Author(s): Pierre-Luc Germain, Mark Robinson

Affiliation(s): ETH and University of Zürich, Switzerland

Workflows for epigenomics data, especially ATAC/ChIP-like data, typically involve a (sometimes clunky) mix of tools, within and outside R. This can create consistency or reproducibility issues, cause difficulties when combining elements of different workflows, and complicate teaching. Although excellent R/Bioconductor-based solutions are available for many steps, some critical steps lack good R-based alternatives, and their integration often lacks the smoothness that Bioconductor has allowed us to enjoy in other subfields.

An Open Software Development-based Ecosystem of R Packages for Proteomics Data Analysis

Author(s): Laurent Gatto, RforMassSpectrometry contributors

Affiliation(s): de Duve Institute, UCLouvain, Belgium

A frequent problem with scientific research software is the lack of support, maintenance and further development. In particular, development by a single researcher can easily result in orphaned and dysfunctional software packages, especially if combined with poor documentation, missing unit tests or lack of adherence to open software development standards.

An Open Software Development-based Ecosystem of R Packages for Metabolomics Data Analysis

Author(s): Johannes Rainer, RforMassSpectrometry contributors

Affiliation(s): Institute for Biomedicine, Eurac Research, Bolzano, Italy

A frequent problem with scientific research software is the lack of support, maintenance and further development. In particular, development by a single researcher can easily result in orphaned and dysfunctional software packages, especially if combined with poor documentation, missing unit tests or lack of adherence to open software development standards.

A streamlined platform for cell type-specific prediction of TF binding

Author(s): Emanuel Sonder, Mark Robinson, Pierre-Luc Germain

Affiliation(s): Institute for Neuroscience, ETH Zurich

Transcription factors (TFs) mediate transcription by binding specific sites in the genome, i.e., transcription factor binding sites (TFBS). These vary across cell types and conditions, determining cell fate and response to stimuli. The binding of a TF is influenced by its affinity for certain DNA sequences, but also by local chromatin accessibility and the presence and activity of other TFs acting as cofactors.
