The Relaxed Maximum Entropy Distribution and its Application to Pattern Discovery

Abstract. The maximum entropy principle uniquely identifies the distribution that models our knowledge about the data, but is otherwise maximally unbiased. As soon as we include non-trivial observations in our model, however, exact inference quickly becomes intractable. We propose a relaxation that permits efficient inference by dynamically factorizing the joint distribution into factors. In particular, we show that these factors are learnable from data and that the relaxation is consistent with the standard maximum entropy distribution. Through an extensive set of experiments we show that the relaxation is scalable, approximates the vanilla distribution closely, allows for classification that is as good, and results in a concise set of patterns.
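To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the algorithm from the paper or the code in the replication package. It assumes binary attributes and itemset-frequency constraints, and fits the standard maximum entropy distribution by plain iterative scaling over all 2^n states, which is exactly what stops being feasible as n grows. It then illustrates the general factorization idea under a strong simplifying assumption: when the constraints fall into disjoint attribute groups, each group can be fit as a small factor on its own, and the product of the factors matches the full joint. The function names `maxent` and `frequency` and all constraint values are hypothetical and chosen only for illustration.

```python
from itertools import product


def maxent(n, constraints, sweeps=200):
    """Fit a maximum entropy distribution over {0,1}^n by iterative scaling.

    constraints maps an itemset (tuple of attribute indices) to a target
    frequency strictly between 0 and 1. All 2^n states are enumerated,
    so this brute-force version is only usable for very small n.
    """
    states = list(product((0, 1), repeat=n))
    p = {s: 1.0 / len(states) for s in states}  # uniform: no knowledge yet
    for _ in range(sweeps):
        for itemset, target in constraints.items():
            # Current probability that all attributes of the itemset are 1.
            current = sum(q for s, q in p.items() if all(s[i] for i in itemset))
            for s in states:
                if all(s[i] for i in itemset):
                    p[s] *= target / current
                else:
                    p[s] *= (1.0 - target) / (1.0 - current)
    return p


def frequency(p, itemset):
    """Probability under p that every attribute in the itemset equals 1."""
    return sum(q for s, q in p.items() if all(s[i] for i in itemset))


# Hypothetical constraints over attributes {0, 1} and over attributes {2, 3}.
joint_constraints = {
    (0,): 0.6, (1,): 0.5, (0, 1): 0.4,
    (2,): 0.3, (3,): 0.7, (2, 3): 0.25,
}
full = maxent(4, joint_constraints)  # one model over 2^4 = 16 states

# The same knowledge, split into two factors of 2^2 = 4 states each.
factor_a = maxent(2, {(0,): 0.6, (1,): 0.5, (0, 1): 0.4})
factor_b = maxent(2, {(0,): 0.3, (1,): 0.7, (0, 1): 0.25})

# Largest difference between the full joint and the product of the factors.
max_gap = max(abs(full[s] - factor_a[s[:2]] * factor_b[s[2:]]) for s in full)
print("max gap between joint and factor product:", max_gap)
print("modelled frequency of itemset (0, 1):", frequency(full, (0, 1)))
```

In this toy setting the split is fixed by hand and no constraint spans both groups. The relaxation described in the abstract instead chooses the factorization dynamically, which this sketch does not attempt to reproduce.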

Implementation

The replication package, including code and data, for Dalleiger & Vreeken (ICDM 2020)

The C++, Python and R source code (June 2020) by Sebastian Dalleiger

The datasets used in the paper

Related Publications

Dalleiger, S. & Vreeken, J. The Relaxed Maximum Entropy Distribution and its Application to Pattern Discovery. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'20), IEEE, 2020. (19.7% acceptance rate)