Our principal research interest is the design and application of computational biology and bioinformatics methods to organize, analyze, compare, interpret, and visualize -omics data. From a methodological standpoint, our current research lines comprise the development of methods, resources, and tools for the integrative analysis of multi-omics and phenotype data, epigenomics and the 3D genome, computational systems biology, and single-cell genomics.

From an applied perspective, our research supports wet-lab biologists in investigating the genomic bases of complex biological systems, with particular emphasis on onco-genomics, immunogenomics, and neurosciences. With some of these groups, we operate as one extended laboratory, in which we provide key support for the bioinformatics analysis of -omics data.

Integrative analysis of multi-omics and phenotype data. We developed pioneering methods to merge genomics data obtained from different technologies and to integrate multi-omics data with phenotypic characteristics, clinical information, outcomes, and drug responses. Briefly, we implemented integrative approaches to complement gene expression data with genome annotations1, copy number and chromosomal localization, non-coding RNAs4,5, genome-wide binding sites6,7, and genomic interactions8,9. We designed tools to retrieve, locally organize, and re-annotate genomics data and the related sample meta-information available in public repositories. This resulted in in-silico databases that have been used to stratify cancer samples into molecularly distinct subtypes and to identify, test, and validate prognostic and predictive signatures10–12.
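As a purely illustrative sketch of what complementing gene expression data with copy number can look like in its simplest form, the snippet below computes a per-gene Pearson correlation between expression and copy number across shared samples. It is not the approach of the cited works: the function names (`pearson`, `dosage_correlation`), the nested-dictionary layout, and the toy values are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant profile has no defined correlation.
        return float("nan")
    return sxy / sqrt(sxx * syy)


def dosage_correlation(expression, copy_number):
    """Correlate expression with copy number for every gene in both tables.

    Both inputs are hypothetical {gene: {sample: value}} dictionaries;
    only samples measured on both platforms are used for each gene.
    """
    result = {}
    for gene in expression.keys() & copy_number.keys():
        samples = sorted(expression[gene].keys() & copy_number[gene].keys())
        if len(samples) < 2:
            continue  # too few shared samples to estimate a correlation
        expr = [expression[gene][s] for s in samples]
        cn = [copy_number[gene][s] for s in samples]
        result[gene] = pearson(expr, cn)
    return result


# Toy data (invented values, for illustration only).
expression = {
    "GENE_A": {"s1": 1.0, "s2": 2.0, "s3": 3.0},
    "GENE_B": {"s1": 5.0, "s2": 5.5, "s3": 4.0},
}
copy_number = {
    "GENE_A": {"s1": 2, "s2": 3, "s3": 4},
    "GENE_B": {"s1": 2, "s2": 2, "s3": 2},
}
r = dosage_correlation(expression, copy_number)
```

In this toy case, `GENE_A` expression follows its copy number exactly, whereas `GENE_B` has a constant copy number, so its correlation is undefined and reported as NaN. Real analyses would typically also account for noise, sample size, and multiple testing.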

Development of methods, resources, and tools. We designed and deployed bioinformatics packages for marker identification from gene expression data (SIMCA13), automatic genotype calling (BCGA14), identification of genomic imbalanced regions in cancer (SODEGIR2), and detection of regional variations in genomics data (PREDA1,15). We also created user-friendly web applications to handle and analyze large volumes of -omics data (A-MADMAN16, UCbase 2.017, APTANI18, WoPPER19, and GDA12).

Bioinformatics for epigenomics and 3D genome. We developed and applied computational methods for the analysis of linear epigenomic marks and regulatory elements and for their integration with transcriptional profiles in different physiological and pathological cellular systems6,7,20–22. We also introduced algorithmic approaches to superimpose chromosome conformation data onto genome-wide maps of expression levels, epigenomic marks, regulatory elements, and transcription factor binding sites8,9. We quantitatively compared the performance of Hi-C data analysis methods for the identification of multi-scale chromatin structures23,24 and highlighted some crucial limitations of existing methods (e.g., their inability to capture subtle interaction patterns and changes in the chromatin architecture). Currently, in line with projects studying the 3D organization of the genome in the nucleus, we are working on novel algorithms to analyze Hi-C data and to study the dynamics of epigenetic landscapes.
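For readers unfamiliar with Hi-C contact matrices, the sketch below illustrates one commonly used preprocessing step, the observed-over-expected transformation, in which each contact count is divided by the average count observed at the same genomic distance. It is a generic example and not one of the algorithms referred to above; the function name `observed_over_expected`, the list-of-lists matrix layout, the assumption of a symmetric matrix, and the toy counts are introduced here for illustration only.

```python
def observed_over_expected(contacts):
    """Distance-normalize a square, symmetric contact matrix.

    Each entry is divided by the mean contact count of all bin pairs
    separated by the same number of bins (the mean of its diagonal).
    """
    n = len(contacts)
    if any(len(row) != n for row in contacts):
        raise ValueError("contact matrix must be square")

    # expected[d] = mean count between bins that are d bins apart,
    # estimated from the d-th upper diagonal.
    expected = []
    for d in range(n):
        diagonal = [contacts[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))

    oe = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            e = expected[abs(i - j)]
            # Leave entries at 0.0 where no contacts exist at that distance.
            oe[i][j] = contacts[i][j] / e if e > 0 else 0.0
    return oe


# Toy 3-bin contact matrix (invented counts, for illustration only).
toy = [
    [10, 4, 1],
    [4, 10, 2],
    [1, 2, 10],
]
oe = observed_over_expected(toy)
```

Values above 1 mark bin pairs that interact more often than expected for their separation, and values below 1 mark depleted pairs. Production pipelines usually combine this step with matrix balancing and work on sparse, genome-scale matrices.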

Computational systems biology. We have been active in developing bioinformatics and computational biology approaches for reverse engineering and reconstruction of cell regulatory networks25–28. Methodologically, we introduced the concept of the critical analysis of network components to inspect the transcriptional and post-transcriptional regulatory networks reconstructed from mRNA and microRNA expression data in pathological samples4,28.

Methods for single-cell genomics. Recent advances in single-cell techniques are providing exciting opportunities for dissecting cell heterogeneity and investigating cell identity, fate, and function. However, the analysis and modeling of single-cell -omics data pose substantial computational challenges and require entirely new bioinformatics techniques and methods. We are currently working on the development of i) new multi-dimensional approaches to extract, from the background noise, the higher-order information embedded in the 3D spatial, architectural, and mutual organization of cells; ii) novel multi-scale algorithms to identify the molecular connections among cell regulatory circuits, dynamics, and functional output; and iii) visualization tools to display and navigate cell atlases. Specifically, we are using machine learning, deconvolution, and projection methods to associate variations in single-cell gene expression profiles with specific regulatory mechanisms, define transcriptional fingerprints associated with tissues and phenotypes, and assess the spatial distribution of gene expression signatures within cellular subpopulations29.