Abstract
The availability of large microarray data sets has led to a growing interest in biclustering methods over the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclustering problem for gene expression data sets, in which each row can be a member of only a single bicluster while columns can participate in multiple ones. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to carry only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). We present a novel method to identify these exclusive row biclusters through a combination of existing biclustering algorithms and combinatorial auction techniques. We devise an approach for tuning the threshold of our algorithm based on comparison to a null model, in the spirit of the Gap statistic approach. We demonstrate our approach on both synthetic and real-world gene expression data and show its power in identifying large-span submatrices with non-overlapping rows, while considering their unique nature. The Gap statistic approach succeeds in identifying appropriate thresholds in all our examples.
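
To make the exclusive row constraint concrete, the selection step can be viewed as a small winner-determination problem: given candidate biclusters produced by an existing algorithm, choose a subset whose row sets are pairwise disjoint and whose total value is maximal, while columns remain free to overlap. The sketch below is illustrative only. The function name `select_row_exclusive`, the use of bicluster area as the bid value, and the exhaustive search are assumptions made for this example, not the formulation used in the paper.

```python
from typing import FrozenSet, List, Tuple

# A candidate bicluster: (row indices, column indices, bid value).
# The bid value is a hypothetical score; here we assume area = |rows| * |cols|.
Bicluster = Tuple[FrozenSet[int], FrozenSet[int], float]


def select_row_exclusive(candidates: List[Bicluster]) -> Tuple[List[int], float]:
    """Return indices of a maximum-value set of candidates with disjoint rows.

    Exhaustive take/skip search, so only suitable for small candidate lists.
    Columns are never checked: they may be shared between chosen biclusters.
    """
    best_pick: List[int] = []
    best_value = 0.0

    def search(i: int, used_rows: FrozenSet[int], picked: List[int], value: float) -> None:
        nonlocal best_pick, best_value
        if i == len(candidates):
            if value > best_value:
                best_pick, best_value = list(picked), value
            return
        rows, _cols, bid = candidates[i]
        # Take candidate i only if none of its rows is already assigned.
        if not (rows & used_rows):
            picked.append(i)
            search(i + 1, used_rows | rows, picked, value + bid)
            picked.pop()
        # Skip candidate i.
        search(i + 1, used_rows, picked, value)

    search(0, frozenset(), [], 0.0)
    return best_pick, best_value


def make(rows, cols) -> Bicluster:
    """Build a candidate whose bid is its area (an assumption of this sketch)."""
    return frozenset(rows), frozenset(cols), float(len(rows) * len(cols))


candidates = [
    make({0, 1, 2}, {0, 1}),     # index 0, value 6
    make({2, 3}, {1, 2, 3}),     # index 1, value 6, shares row 2 with index 0
    make({3, 4}, {0, 1}),        # index 2, value 4, shares columns with index 0
    make({0, 1, 2, 3}, {0}),     # index 3, value 4
]
picked, total = select_row_exclusive(candidates)
print(picked, total)
```

In this toy input, candidates 0 and 2 use the same columns but different rows, so both can be selected, whereas candidates 0 and 1 compete for row 2 and cannot be. For realistic numbers of candidates the exhaustive search would have to give way to an integer programming or heuristic auction solver.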