PROCEEDINGS IPMU '08
Discovery of factors in binary data triangular decomposition of matrices
Radim Belohlavek.
We present new methods of decomposition
of an n x m binary matrix
I into a product A * B of an n x k
binary matrix A and a k x m binary
matrix B. These decompositions are
alternative to the usual one which
is sought in Boolean factor analysis
(BFA), where * is a Boolean product
of matrices. In the new decompositions,
are the left and the right
triangular products of Boolean matrices.
In BFA, I is interpreted as object x attribute matrix, A and B are interpreted as object x factor and factor x attribute matrices, and the aim is to find a decomposition with the number k of factors as small
as possible. The new decompositions
have different semantics from
the one with Boolean matrix product.
The presented methods are optimal
in that they provide the smallest
number k possible. We provide a
geometric insight showing that the
factors correspond to I-beam- and
H-beam-shaped patterns in the input
matrix. We present an approximation
algorithm for computing the
optimal decomposition. The algorithm
is based on a greedy approximation
algorithm for set covering
optimization problem. We demonstrate
our results by illustrative examples.
PDF full paper |