Abstract. Addressing the interpretability problem of NMF on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we continuously relax BMF using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that our algorithm converges quickly, recovers the ground truth precisely, and estimates the simulated rank robustly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.
Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent. In: Proceedings of Neural Information Processing Systems (NeurIPS), 2022. (25.7% acceptance rate)
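For intuition, the following is a minimal NumPy sketch of the kind of proximal gradient loop described in the abstract above. The specific penalty in `prox_elastic_binary` (an elastic-net-style pull toward the nearest of {0, 1}), the use of the ordinary matrix product in place of the Boolean product, the step sizes, and the final rounding at 0.5 are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def prox_elastic_binary(X, kappa, lam):
    # Proximal step for an assumed elastic-binary penalty:
    # kappa*|x - t| + (lam/2)*(x - t)^2, where t is the nearest of {0, 1}.
    # (Illustrative form; the paper's exact regularizer may differ.)
    T = np.rint(np.clip(X, 0.0, 1.0))                        # nearest binary target per entry
    W = X - T
    W = np.sign(W) * np.maximum(np.abs(W) - kappa, 0.0)      # soft-threshold (L1 part)
    return np.clip(T + W / (1.0 + lam), 0.0, 1.0)            # shrink (L2 part), stay in [0, 1]

def relaxed_bmf(A, rank, steps=500, kappa=1e-2, lam=1e-2, seed=0):
    # Alternating proximal gradient descent on a continuous relaxation of A ~ U V,
    # using the ordinary matrix product as a stand-in for the Boolean product.
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = rng.uniform(size=(n, rank))
    V = rng.uniform(size=(rank, m))
    for _ in range(steps):
        R = U @ V - A                                        # gradient of 0.5*||A - UV||_F^2
        step_u = 1.0 / (np.linalg.norm(V @ V.T, 2) + 1e-9)   # Lipschitz-based step size
        U = prox_elastic_binary(U - step_u * (R @ V.T), kappa, lam)
        R = U @ V - A
        step_v = 1.0 / (np.linalg.norm(U.T @ U, 2) + 1e-9)
        V = prox_elastic_binary(V - step_v * (U.T @ R), kappa, lam)
    return (U > 0.5).astype(int), (V > 0.5).astype(int)      # round to Boolean factors

# Example: recover a planted rank-3 Boolean structure.
rng = np.random.default_rng(1)
U0 = (rng.random((50, 3)) < 0.3).astype(int)
V0 = (rng.random((3, 40)) < 0.3).astype(int)
A = np.clip(U0 @ V0, 0, 1)                                   # Boolean product of the planted factors
U, V = relaxed_bmf(A, rank=3)
print("reconstruction errors:", np.sum(np.clip(U @ V, 0, 1) != A))
```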
MDL4BMF: Minimal Description Length for Boolean Matrix Factorization. Transactions on Knowledge Discovery from Data vol. 8(4), pp 1-30, ACM, 2014. (IF 1.68)
Model Order Selection for Boolean Matrix Factorization. In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 51-59, ACM, 2011. |