Max-variance Clustering and Biclustering of Microarray Data

C. Cano, S. Blanco, F. García, A. Blanco

Microarray technology makes it possible to measure the expression of thousands of genes simultaneously under tens of specific conditions. Clustering and biclustering are the main tools for analyzing gene expression data, since they reveal genes that behave similarly across samples. In this paper we present three novel approaches to clustering and biclustering based on Estimation of Distribution Algorithms (EDAs) and Principal Component Analysis (PCA). The goal is to find nonexclusive (potentially overlapping) groups of genes with similar behavior and maximum between-sample variance. We tested the proposed methods on two real datasets, where they outperformed previous results in terms of the quality and size of the revealed patterns.
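
As a rough illustration of the stated goal only, the sketch below scores a candidate group of genes on two quantities: how closely the genes follow a shared profile, and how much that profile varies across samples. This is not the scoring used in the paper; the function name `cluster_scores`, the genes-by-samples matrix layout, and both measures are assumptions chosen for the example.

```python
import numpy as np

def cluster_scores(expr, gene_idx):
    """Return (coherence_error, between_sample_variance) for a gene group.

    expr     : 2-D array, rows = genes, columns = samples (assumed layout)
    gene_idx : indices of the genes in the candidate group
    Both measures are illustrative assumptions, not the paper's definitions.
    """
    sub = np.asarray(expr, dtype=float)[gene_idx, :]
    # Centre each gene so that only the shape of its profile matters.
    centred = sub - sub.mean(axis=1, keepdims=True)
    mean_profile = centred.mean(axis=0)
    # How far individual genes stray from the shared profile (lower is better).
    coherence_error = float(np.mean((centred - mean_profile) ** 2))
    # How much the shared profile changes across samples (higher is better).
    between_sample_variance = float(np.var(mean_profile))
    return coherence_error, between_sample_variance

# Toy data: genes 0 and 1 share a strong pattern, gene 2 is flat.
expr = np.array([
    [1.0, 5.0, 1.0, 5.0],
    [2.0, 6.0, 2.0, 6.0],
    [3.0, 3.0, 3.0, 3.0],
])
err_good, var_good = cluster_scores(expr, [0, 1])
err_mixed, var_mixed = cluster_scores(expr, [0, 2])
```

Under these assumed measures, the group of genes 0 and 1 has a lower coherence error and a higher between-sample variance than the group of genes 0 and 2, which is the kind of group the abstract describes as desirable. Because a gene may appear in several index lists, overlapping groups are naturally allowed.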
