
LASIGE researcher published in Briefings in Bioinformatics

Date: 16/07/2024

The paper “Biclustering data analysis: a comprehensive survey”, authored by LASIGE PhD student Eduardo N. Castanho and integrated researchers Sara C. Madeira and Helena Aidos, has been published in Briefings in Bioinformatics, a top-ranked journal (SCIMAGO Q1).

Biclustering, the simultaneous clustering of rows and columns of a data matrix, has evolved from a key technique in gene expression data analysis into one of the most widely used approaches for pattern discovery and identification of biological modules, in both descriptive and predictive learning tasks. This survey presents a comprehensive overview of biclustering. It proposes an updated taxonomy for its fundamental components (bicluster, biclustering solution, biclustering algorithms, and evaluation measures) and applications. The researchers unify concepts scattered across the literature with new definitions that accommodate the diversity of data types (such as tabular, network, and time series data) and the specificities of biological and biomedical data domains. They further propose a pipeline for biclustering data analysis and discuss practical aspects of incorporating biclustering in real-world applications. The researchers also highlight prominent application domains, particularly in bioinformatics, and identify typical biclusters to illustrate the analysis output.
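To make the idea of a bicluster concrete, here is a minimal Python sketch. It is not taken from the paper: the toy matrix, the function names `submatrix` and `mean_squared_residue`, and the choice of the mean squared residue (a coherence score commonly attributed to Cheng and Church) are illustrative assumptions. The sketch treats a bicluster as a subset of rows together with a subset of columns and scores how coherent the selected values are, with lower scores meaning more coherent.

```python
# Illustrative sketch only; names and data are assumptions, not from the paper.

def submatrix(matrix, rows, cols):
    """Return the values selected by a bicluster (a row subset and a column subset)."""
    return [[matrix[i][j] for j in cols] for i in rows]


def mean_squared_residue(sub):
    """Mean squared residue of a submatrix: 0 for a perfectly additive pattern."""
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy data: rows could be genes, columns could be experimental conditions.
data = [
    [1.0, 2.0, 9.0, 4.0],
    [2.0, 3.0, 0.0, 5.0],
    [7.0, 1.0, 3.0, 8.0],
    [4.0, 5.0, 6.0, 7.0],
]

# Hypothetical bicluster: rows 0, 1, 3 restricted to columns 0, 1, 3.
bicluster = submatrix(data, rows=[0, 1, 3], cols=[0, 1, 3])

print("bicluster score:  ", mean_squared_residue(bicluster))
print("full matrix score:", mean_squared_residue(data))
```

In this toy example the selected rows differ from one another only by a constant shift over the selected columns, so the bicluster scores far lower than the full matrix. A real biclustering algorithm searches for such row and column subsets rather than being handed them.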

Moreover, the paper discusses important aspects to consider when choosing, applying, and evaluating a biclustering algorithm. It also relates biclustering to other data mining tasks (clustering, pattern mining, classification, triclustering, N-way clustering, and graph mining). Thus, it provides theoretical and practical guidance on biclustering data analysis, demonstrating its potential to uncover actionable insights from complex datasets.

The paper is available here.