PROCEEDINGS IPMU '08

Discovery of factors in binary data triangular decomposition of matrices

Radim Belohlavek.

We present new methods of decomposition
of an *n x m* binary matrix
I into a product *A * B* of an *n x k*
binary matrix A and a *k x m* binary
matrix B. These decompositions are
alternative to the usual one which
is sought in Boolean factor analysis
(BFA), where * is a Boolean product
of matrices. In the new decompositions,
are the left and the right
triangular products of Boolean matrices.
In BFA, *I* is interpreted as object x attribute matrix, A and B are interpreted as object x factor and factor x attribute matrices, and the aim is to find a decomposition with the number *k* of factors as small
as possible. The new decompositions
have different semantics from
the one with Boolean matrix product.
The presented methods are optimal
in that they provide the smallest
number k possible. We provide a
geometric insight showing that the
factors correspond to I-beam- and
H-beam-shaped patterns in the input
matrix. We present an approximation
algorithm for computing the
optimal decomposition. The algorithm
is based on a greedy approximation
algorithm for set covering
optimization problem. We demonstrate
our results by illustrative examples.

PDF full paper |