Jilles is Faculty (W3) at the CISPA Helmholtz Center for Information Security, where he leads the Exploratory Data Analysis group. He is Honorary Professor of the Department of Computer Science of Saarland University, as well as ELLIS Fellow of the ELLIS Unit Saarbrücken on Artificial Intelligence and Machine Learning.
My research is mainly concerned with causality and unsupervised learning. In particular, I enjoy developing theory and algorithms for answering exploratory questions, such as 'what is going on in my data?' or 'what is going on in my model?', without having to make unnecessary or unjustified assumptions. To identify what is worth knowing, I often employ well-founded statistical methods based on information theory, and then proceed to develop efficient algorithms for extracting useful, interpretable results. I like all data types equally.
Currently I'm investigating techniques for identifying informative and ideally causal structures in large collections of complex data; how to efficiently mine easily interpretable summaries from data; how to determine and discover causal dependencies from observational data; the theoretical and practical foundations of interactive exploration of very large data, including discovering things by serendipity; how to mine large relational databases; how to mine very large graphs, including characterising influence propagation in social networks; and well-founded approaches for meaningfully comparing and validating exploratory results.
2026 | |
Causal Discovery from Interval-Based Event Sequences. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2026. (oral presentation, 5% acceptance rate; 17.6% overall) |
|
Seqret: Mining Rule Sets from Event Sequences. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2026. (oral presentation, 5% acceptance rate; 17.6% overall) |
|
When Flatness Does (Not) Guarantee Adversarial Robustness. In: Proceedings of the International Conference on Learning Representations (ICLR), OpenReview, 2026. (28.2% acceptance rate) |
|
Explainable Mixture Models through Differentiable Rule Learning. In: Proceedings of the International Conference on Learning Representations (ICLR), OpenReview, 2026. (28.2% acceptance rate) |
|
Learning and Naming Subgroups with Exceptional Survival Characteristics. Technical Report 2602.22179, arXiv, 2026. |
|
2025 | |
Federated Binary Matrix Factorization using Proximal Optimization. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 16144-16152, AAAI, 2025. (23.4% acceptance rate) |
|
Accurately Estimating Unreported Infections using Information Theory. In: SIAM International Conference on Data Mining (SDM), pp 457-466, SIAM, 2025. (26.7% acceptance rate) |
|
From Your Block to Our Block: How to Find Shared Structure between Stochastic Block Models over Multiple Graphs. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 11987-11994, AAAI, 2025. (23.4% acceptance rate) |
|
SpaceTime: Causal Discovery from Non-Stationary Time Series. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 19405-19413, AAAI, 2025. (23.4% acceptance rate) |
|
Causal Mixture Models: Characterization and Discovery. In: Proceedings of Neural Information Processing Systems (NeurIPS), PMLR, 2025. (24.5% acceptance rate) |
|
Succinct Interaction-Aware Explanations. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 1715-1726, ACM, 2025. (19% acceptance rate) |
|
Information-Theoretic Causal Discovery in Topological Order. In: Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), pp 2008-2016, PMLR, 2025. (31.3% acceptance rate) |
|
Neural Rule Lists: Learning Discretizations, Rules, and Order in One Go. In: Proceedings of Neural Information Processing Systems (NeurIPS), PMLR, 2025. (24.5% acceptance rate) |
|
Snor: Simpler Descriptions through Overlapping Patterns. In: van Leeuwen, M & Vreeken, J (eds) Challenges and Algorithms for Knowledge Discovery from Data, pp 56-74, Springer, 2025. |
|
Efficient Greedy Equivalence Search for Non-Score-Equivalent Criteria using Sampling. In: Uncovering Causality in Science (CauScien@NeurIPS), 2025. |
|
Efficient Greedy Equivalence Search for Non-Score-Equivalent Criteria using Sampling. In: Causality for Impact (CI@EurIPS), 2025. |
|
Hidden in Plain Sight - Class Competition Focuses Attribution Maps. Technical Report 2503.07346, arXiv, 2025. |
|
Proceedings of the IEEE International Conference on Data Mining (ICDM). IEEE, 2025. |
|
Challenges and Algorithms for Knowledge Discovery from Data. Springer, 2025. |
|
2024 | |
All the World's a (Hyper)Graph: A Data Drama. Digital Scholarship in the Humanities vol.39(1), pp 74-96, Oxford University Press, 2024. (IF 0.8) |
|
Discovering Sequential Patterns with Predictable Inter-Event Delays. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 8346-8353, AAAI, 2024. (23.8% acceptance rate) |
|
Causal Discovery from Event Sequences by Local Cause-Effect Attribution. In: Proceedings of Neural Information Processing Systems (NeurIPS), PMLR, 2024. (25.8% acceptance rate) |
|
Identifying Confounding from Causal Mechanism Shifts. In: Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), pp 4897-4905, PMLR, 2024. (27.6% acceptance rate) |
|
Learning Causal Networks from Episodic Data. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 2224-2235, ACM, 2024. (20% acceptance rate) |
|
Data is Moody: Discovering Data Modification Rules from Process Event Logs. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Data (ECMLPKDD), pp 285-302, Springer, 2024. (24.0% acceptance rate) |
|
Finding Interpretable Class-Specific Patterns through Efficient Neural Search. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 9062-9070, AAAI, 2024. (23.8% acceptance rate) |
|
What are the Rules? Discovering Constraints from Data. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 8182-8190, AAAI, 2024. (oral presentation, 2.3% acceptance rate; 23.8% overall) |
|
Learning Exceptional Subgroups by End-to-End Maximizing KL-divergence. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2024. (spotlight, 3.5% acceptance rate; 27.5% overall) |
|
The Uncanny Valley: Exploring Adversarial Robustness from a Flatness Perspective. Technical Report 2405.16918, arXiv, 2024. |
|
2023 | |
Below the Surface: Summarizing Event Sequences with Generalized Sequential Patterns. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 348-357, ACM, 2023. (22.1% acceptance rate) |
|
Causal Discovery with Hidden Confounders using the Algorithmic Markov Condition. In: Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI), pp 1016-1026, AUAI, 2023. (31.2% acceptance rate) |
|
Nonlinear Causal Discovery with Latent Confounders. In: Proceedings of the International Conference on Machine Learning (ICML), pp 15639-15654, PMLR, 2023. (27.9% acceptance rate) |
|
Identifying Selection Bias from Observational Data. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 8177-8185, AAAI, 2023. (oral presentation, 10.8% acceptance rate; 19.6% overall) |
|
Federated Learning from Small Datasets. In: Proceedings of the International Conference on Learning Representations (ICLR), OpenReview, 2023. (31.8% acceptance rate) |
|
Learning Causal Models under Independent Changes. In: Proceedings of Neural Information Processing Systems (NeurIPS), PMLR, 2023. (26.1% acceptance rate) |
|
Nothing but Regrets — Privacy-Preserving Federated Causal Discovery. In: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), pp 8263-8278, PMLR, 2023. (29% acceptance rate) |
|
Information-Theoretic Causal Discovery and Intervention Detection over Multiple Environments. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp 9171-9179, AAAI, 2023. (19.6% acceptance rate) |
|
Towards Concept-Aware Large Language Models. In: Findings of the Association for Computational Linguistics (EMNLP Findings), pp 13158-13170, ACL, 2023. |
|
Why Are We Waiting? Discovering Interpretable Models for Predicting Sojourn and Waiting Times. In: SIAM International Conference on Data Mining (SDM), pp 352-360, SIAM, 2023. (27.4% acceptance rate) |
|
2022 | |
Omen: Discovering Sequential Patterns with Reliable Prediction Delays. Knowledge and Information Systems vol.64(4), pp 1013-1045, Springer, 2022. (IF 2.822) |
|
Differentially Describing Groups of Graphs. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2022. (oral presentation 5.5% acceptance rate; overall 15.0%) |
|
Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent. In: Proceedings of Neural Information Processing Systems (NeurIPS), PMLR, 2022. (25.7% acceptance rate) |
|
Discovering Significant Patterns under Sequential False Discovery Control. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 263-272, ACM, 2022. (15.0% acceptance rate) |
|
Label-Descriptive Patterns and their Application to Characterizing Classification Errors. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2022. (21.9% acceptance rate) |
|
Naming the most anomalous cluster in Hilbert Space for structures with attribute information. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2022. (15.0% acceptance rate) |
|
Discovering Invariant and Changing Mechanisms from Data. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 1242-1252, ACM, 2022. (15.0% acceptance rate) |
|
Mining Interpretable Data-to-Sequence Generators. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2022. (15.0% acceptance rate) |
|
Inferring Cause and Effect in the Presence of Heteroscedastic Noise. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2022. (21.9% acceptance rate) |
|
Formally Justifying MDL-based Inference of Cause and Effect. In: Proceedings of the AAAI Workshop on Information Theoretic Causal Inference and Discovery (ITCI'22), 2022. |
|
Causal Inference with Heteroscedastic Noise Models. In: Proceedings of the AAAI Workshop on Information Theoretic Causal Inference and Discovery (ITCI'22), 2022. |
|
2021 | |
Data-driven Equation for Drug-Membrane Permeability across Drugs and Membranes. Journal of Chemical Physics vol.154(24), AIP, 2021. (IF 2.991) |
|
Integrative Analysis of Epigenetics Data Identifies Gene-Specific Regulatory Elements. Nucleic Acids Research, Oxford University Press, 2021. (IF 16.97) |
|
Discovering Reliable Causal Rules. In: Proceedings of the SIAM International Conference on Data Mining (SDM), SIAM, 2021. (21.2% acceptance rate) |
|
Graph Similarity Description: How Are These Graphs Similar? In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 185-195, ACM, 2021. (15.4% acceptance rate) |
|
What's in the Box? Explaining Neural Networks with Robust Rules. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2021. (21.4% acceptance rate) |
|
Differentiable Pattern Set Mining. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 383-392, ACM, 2021. (15.4% acceptance rate) |
|
SUSAN: The Structural Similarity Random Walk Kernel. In: Proceedings of the SIAM International Conference on Data Mining (SDM), SIAM, 2021. (21.2% acceptance rate) |
|
Discovering Fully Oriented Causal Networks. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2021. (21.3% acceptance rate) |
|
Mining Easily Understandable Models from Complex Event Data. In: SIAM International Conference on Data Mining (SDM), SIAM, 2021. (21.2% acceptance rate) |
|
2020 | |
Discovering Dependencies with Reliable Mutual Information. Knowledge and Information Systems vol.62, pp 4223-4253, Springer, 2020. (IF 2.936) |
|
Identifying Domains of Applicability of Machine Learning Models for Materials Science. Nature Communications vol.11(4428), pp 1-9, Nature Research, 2020. (IF 12.12) |
|
What is Normal, What is Strange, and What is Missing in a Knowledge Graph. In: Proceedings of the Web Conference (WWW), ACM, 2020. (oral presentation; overall acceptance rate 19.2%) |
|
Just Wait For It... Mining Sequential Patterns with Reliable Prediction Delays. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'20), IEEE, 2020. (full paper, 9.8% acceptance rate; overall 19.7%) (invited for the KAIS Special Issue on the Best of IEEE ICDM 2020) |
|
The Relaxed Maximum Entropy Distribution and its Application to Pattern Discovery. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'20), IEEE, 2020. (19.7% acceptance rate) |
|
Explainable Data Decompositions. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'20), AAAI, 2020. (oral presentation 4.5% acceptance rate; overall 20.6%) |
|
Discovering Succinct Pattern Sets Expressing Co-Occurrence and Mutual Exclusivity. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2020. (16.8% acceptance rate) |
|
Discovering Functional Dependencies from Mixed-Type Data. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2020. (16.8% acceptance rate) |
|
Discovering Approximate Functional Dependencies using Smoothed Mutual Information. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2020. (16.8% acceptance rate) |
|
Towards Plausible Graph Anonymization. In: Proceedings of the Network and Distributed System Security Symposium (NDSS), The Internet Society, 2020. (17.4% acceptance rate) |
|
2019 | |
Telling Cause from Effect by Local and Global Regression. Knowledge and Information Systems vol.60(3), pp 1277-1305, Springer, 2019. (IF 2.397) |
|
Sets of Robust Rules, and How to Find Them. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Data (ECMLPKDD), Springer, 2019. (17.7% acceptance rate) |
|
Discovering Robustly Connected Subgraphs with Simple Descriptions. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), IEEE, 2019. (18.5% acceptance rate) |
|
We Are Not Your Real Parents: Telling Causal From Confounded by MDL. In: SIAM International Conference on Data Mining (SDM), SIAM, 2019. (22.9% acceptance rate) |
|
Discovering Reliable Correlations in Categorical Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'19), IEEE, 2019. (18.5% acceptance rate) |
|
Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms (Extended Abstract). In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), IJCAI, 2019. (Invited contribution to the IJCAI Sister Conference Best Paper Track) |
|
Identifiability of Cause and Effect using Regularized Regression. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2019. (oral presentation 9.2% acceptance rate; overall 14.2%) |
|
Testing Conditional Independence on Discrete Data using Stochastic Complexity. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, 2019. (31% acceptance rate) |
|
Discovering Robustly Connected Subgraphs with Simple Descriptions. In: Proceedings of the ECMLPKDD Workshop on Graph Embedding and Mining (GEM), 2019. (oral presentation, 21% acceptance rate) |
|
Discovering Robustly Connected Subgraphs with Simple Descriptions. In: Proceedings of the ACM SIGKDD Workshop on Mining and Learning from Graphs (MLG), 2019. |
|
Approximating Algorithmic Conditional Independence for Discrete Data. In: Proceedings of the First AAAI Spring Symposium Beyond Curve Fitting: Causation, Counterfactuals, and Imagination-based AI, AAAI, 2019. |
|
Summarizing Dynamic Graphs using MDL. In: Proceedings of the ECMLPKDD Workshop on Graph Embedding and Mining (GEM), 2019. (oral presentation, 21% acceptance rate) |
|
Proceedings of the ACM SIGKDD Workshop on Learning and Mining for Cybersecurity (LEMINCS). 2019. |
|
2018 | |
Origo: Causal Inference by Compression. Knowledge and Information Systems vol.56(2), pp 285-307, Springer, 2018. (IF 2.247) |
|
JAMI — Fast computation of Conditional Mutual Information for ceRNA network analysis. Bioinformatics vol.34(17), pp 3050-3051, Oxford University Press, 2018. (IF 7.307) |
|
Generating Realistic Synthetic Population Datasets. Transactions on Knowledge Discovery from Data vol.12(4), pp 1-45, ACM, 2018. (IF 1.68) |
|
Accurate Causal Inference on Discrete Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'18), IEEE, 2018. (19.9% acceptance rate) |
|
Causal Inference on Event Sequences. In: Proceedings of the SIAM Conference on Data Mining (SDM), pp 55-63, SIAM, 2018. (23.2% acceptance rate) |
|
Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'18), IEEE, 2018. (full paper, 8.9% acceptance rate; overall 19.9%) (Best Paper Award) |
|
Causal Inference on Multivariate and Mixed Type Data. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Data (ECMLPKDD), Springer, 2018. (25% acceptance rate) |
|
Rule Discovery for Exploratory Causal Reasoning. In: Proceedings of the NeurIPS 2018 workshop on Causal Learning, pp 1-14, 2018. |
|
Stochastic Complexity for Testing Conditional Independence on Discrete Data. In: Proceedings of the NeurIPS 2018 workshop on Causal Learning, pp 1-12, 2018. |
|
2017 | |
Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery. Data Mining and Knowledge Discovery vol.31(5), pp 1391-1418, Springer, 2017. (IF 3.160) (ECML PKDD'17 Journal Track) |
|
Beyond Pairwise Similarity: Quantifying and Characterizing Linguistic Similarity between Groups of Languages by MDL. Computación y Sistemas vol.21(4), 2017. (Special Issue for the 18th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing'17) |
|
Uncovering Structure-Property Relationships of Materials by Subgroup Discovery. New Journal of Physics vol.19, IOP Publishing Ltd and Deutsche Physikalische Gesellschaft, 2017. (IF 3.57) (Included in the NJP Highlights of 2017) |
|
Efficiently Discovering Unexpected Pattern-Co-Occurrences. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 126-134, SIAM, 2017. (25% acceptance rate) |
|
Efficiently Summarising Event Sequences with Rich Interleaving Patterns. In: Proceedings of the SIAM Conference on Data Mining (SDM), pp 795-803, SIAM, 2017. (selected in the top 10 papers of SDM'17, 2.7% acceptance rate; overall 25%) |
|
MDL for Causal Inference on Discrete Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'17), pp 751-756, IEEE, 2017. (19.9% acceptance rate) |
|
Correlation by Compression. In: Proceedings of the SIAM Conference on Data Mining (SDM), SIAM, 2017. (25% acceptance rate) |
|
Efficiently Discovering Locally Exceptional yet Globally Representative Subgroups. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'17), IEEE, 2017. (full paper, 9.3% acceptance rate; overall 19.9%) |
|
Discovering Reliable Approximate Functional Dependencies. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp 355-363, ACM, 2017. (oral presentation, 8.6% acceptance rate; overall 17.5%) |
|
Telling Cause from Effect by MDL-based Local and Global Regression. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'17), pp 307-316, IEEE, 2017. (full paper, 9.3% acceptance rate; overall 19.9%) (invited for the KAIS Special Issue on the Best of IEEE ICDM 2017) |
|
Adaptive Local Exploration of Large Graphs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 597-605, SIAM, 2017. (25% acceptance rate) |
|
Summarising Event Sequences using Serial Episodes and an Ontology. In: Proceedings of the 4th Workshop on Interactions between Data Mining and Natural Language Processing (DMNLP'17), pp 33-48, CEUR Workshop Proceedings, 2017. |
|
Characterising the Difference and the Norm between Sequence Databases. In: Proceedings of the 4th Workshop on Interactions between Data Mining and Natural Language Processing (DMNLP'17), pp 49-64, CEUR Workshop Proceedings, 2017. |
|
2016 | |
Is Exploratory Search Different? A Comparison of Information Search Behavior for Exploratory and Lookup Tasks. Journal of the Association for Information Science and Technology (JASIST) vol.67(11), pp 2635-2651, Wiley, 2016. (IF 2.26) |
|
Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'16), pp 735-744, ACM, 2016. (oral presentation, 8.9% acceptance rate; overall 18.1%) |
|
Causal Inference by Compression. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'16), IEEE, 2016. (full paper, 8.5% acceptance rate; overall 19.6%) (invited for the KAIS Special Issue on the Best of IEEE ICDM 2016) |
|
Universal Dependency Analysis. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 792-800, SIAM, 2016. (overall 25% acceptance rate) |
|
Flexibly Mining Better Subgroups. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 585-593, SIAM, 2016. (overall 25% acceptance rate) |
|
Linear-time Detection of Non-Linear Changes in Massively High Dimensional Time Series. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 828-836, SIAM, 2016. (overall 25% acceptance rate) |
|
Reconstructing an Epidemic over Time. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp 1835-1844, ACM, 2016. (18.1% acceptance rate) |
|
Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Data (ECMLPKDD). Springer, 2016. (Part I) |
|
Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Data (ECMLPKDD). Springer, 2016. (Part II) |
|
2015 | |
Summarizing and Understanding Large Graphs. Statistical Analysis and Data Mining vol.8(3), pp 183-202, Wiley, 2015. |
|
The Blind Men and the Elephant: About Meeting the Problem of Multiple Truths in Data from Clustering and Pattern Mining Perspectives. Machine Learning vol.98(1), pp 121-155, Springer, 2015. (IF 1.587) |
|
The Difference and the Norm – Characterising Similarities and Differences between Databases. In: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 206-223, Springer, 2015. |
|
Getting to Know the Unknown Unknowns: Destructive-Noise Resistant Boolean Matrix Factorization. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 325-333, SIAM, 2015. |
|
Non-Parametric Jensen-Shannon Divergence. In: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 173-189, Springer, 2015. |
|
AdaptiveNav: Adaptive Discovery of Interesting and Surprising Nodes in Large Graphs. In: Proceedings of the IEEE Conference on Visualization (VIS), IEEE, 2015. |
|
Hidden Hazards: Finding Missing Nodes in Large Graph Epidemics. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 415-423, SIAM, 2015. |
|
Causal Inference by Direction of Information. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 909-917, SIAM, 2015. |
|
2014 | |
mdl4bmf: Minimum Description Length for Boolean Matrix Factorization. Transactions on Knowledge Discovery from Data vol.8(4), pp 1-30, ACM, 2014. (IF 1.68) |
|
Unsupervised Interaction-Preserving Discretization of Multivariate Data. Data Mining and Knowledge Discovery vol.28(5), pp 1366-1397, Springer, 2014. (IF 2.877) (ECML PKDD'14 Journal Track) |
|
Efficiently Spotting the Starting Points of an Epidemic in a Large Graph. Knowledge and Information Systems vol.38(1), pp 35-59, Springer, 2014. (IF 2.225) |
|
Efficient Discovery of the Most Interesting Associations. Transactions on Knowledge Discovery from Data vol.8(3), pp 1-31, ACM, 2014. (IF 1.68) |
|
Uncovering the Plot: Detecting Surprising Coalitions of Entities in Multi-Relational Schemas. Data Mining and Knowledge Discovery vol.28(5), pp 1398-1428, Springer, 2014. (IF 2.877) (ECML PKDD'14 Journal Track) |
|
Narrow or Broad? Estimating Subjective Specificity in Exploratory Search. In: Proceedings of ACM Conference on Information and Knowledge Management (CIKM), pp 819-828, ACM, 2014. (IR track full paper, overall 21% acceptance rate) |
|
VoG: Summarizing and Understanding Large Graphs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 91-99, SIAM, 2014. (fast track journal invitation, as one of the best of SDM'14; full paper with presentation, 15.4% acceptance rate) |
|
A Fresh Look on Knowledge Bases: Distilling Named Events from News. In: Proceedings of ACM Conference on Information and Knowledge Management (CIKM), pp 1689-1698, ACM, 2014. (KM track full paper, overall 21% acceptance rate) |
|
Multivariate Maximal Correlation Analysis. In: Proceedings of the International Conference on Machine Learning (ICML), pp 775-783, JMLR: W&CP vol.32, 2014. (25.0% acceptance rate) |
|
Interesting Patterns. In: Aggarwal, CC & Han, J (eds) Frequent Pattern Mining, pp 105-134, Springer, 2014. |
|
Frequent Pattern Mining Algorithms for Data Clustering. In: Aggarwal, CC & Han, J (eds) Frequent Pattern Mining, pp 403-424, Springer, 2014. |
|
Mining and Using Sets of Patterns through Compression. In: Aggarwal, CC & Han, J (eds) Frequent Pattern Mining, pp 165-198, Springer, 2014. |
|
Supporting Exploratory Search Through User Modeling. In: Proceedings of the UMAP Joint Workshop on Personalized Information Access (PIA), pp 1-6, 2014. |
|
Interaction Model to Predict Subjective-Specificity of Search Results. In: Proceedings of the 22nd Conference on User Modeling, Adaptation and Personalization — Late-Breaking Results (UMAP), pp 1-6, 2014. |
|
Slimmer, outsmarting Slim. In: the 13th International Symposium on Intelligent Data Analysis (IDA), Springer, 2014. |
|
2013 | |
Summarizing Categorical Data by Clustering Attributes. Data Mining and Knowledge Discovery vol.26(1), pp 130-173, Springer, 2013. (IF 2.877) |
|
Mining Connection Pathways for Marked Nodes in Large Graphs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 37-45, SIAM, 2013. (oral presentation, 14.4% acceptance rate; overall 25%) |
|
Cartification: A Neighborhood Preserving Transformation for Mining High Dimensional Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp 937-942, IEEE, 2013. (19.6% acceptance rate) |
|
Maximum Entropy Models for Iteratively Identifying Subjectively Interesting Structure in Real-Valued Data. In: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 256-271, Springer, 2013. |
|
CMI: An Information-Theoretic Contrast Measure for Enhancing Subspace Cluster and Outlier Detection. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 198-206, SIAM, 2013. (oral presentation, 14.4% acceptance rate; overall 25%) |
|
Detecting Bicliques in GF[q]. In: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 509-524, Springer, 2013. |
|
Islands and Bridges: Making Sense of Marked Nodes in Large Graphs. Technical Report CMU-CS-12-124R, Carnegie Mellon University, 2013. |
|
2012 | |
Summarizing Data Succinctly with the Most Informative Itemsets. Transactions on Knowledge Discovery from Data vol.6(4), pp 1-44, ACM, 2012. (IF 1.68) |
|
Comparing Apples and Oranges – Measuring Differences between Exploratory Data Mining Results. Data Mining and Knowledge Discovery vol.25(2), pp 173-207, Springer, 2012. (IF 1.545) (ECMLPKDD'11 Special Issue) |
|
Fast and Reliable Anomaly Detection in Categoric Data. In: Proceedings of ACM Conference on Information and Knowledge Management (CIKM), pp 415-424, ACM, 2012. (full paper, 13.4% acceptance rate; 27% overall) |
|
Spotting Culprits in Epidemics: How many and Which ones? In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp 11-20, IEEE, 2012. (full paper, 10.7% acceptance rate; overall 20%) |
|
Slim: Directly Mining Descriptive Patterns. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 236-247, SIAM, 2012. (oral presentation, 14.6% acceptance rate) |
|
Discovering Descriptive Tile Trees by Fast Mining of Optimal Geometric Subtiles. In: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 9-24, Springer, 2012. |
|
The Long and the Short of It: Summarising Event Sequences with Serial Episodes. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 462-470, ACM, 2012. (17.6% acceptance rate) |
|
Mining and Visualizing Connection Pathways in Large Information Networks. In: Proceedings of the Workshop on Information in Networks (WIN), pp 1-3, 2012. |
|
Summarising Event Sequences with Serial Episodes. In: Proceedings of the 5th Workshop on Information Theoretic Methods in Science and Engineering (WITMSE), pp 82-85, 2012. (invited contribution, extended abstract of our KDD'12 paper) |
|
Where Do I Start? Algorithmic Strategies to Guide Intelligence Analysts. In: Proceedings of the ACM SIGKDD Workshop on Intelligence and Security Informatics (ISI-KDD), pp 1-8, ACM, 2012. |
|
Interactively and Visually Exploring Tours of Marked Nodes in Large Graphs. Demo at, and included in: Proceedings of IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), ACM, 2012. |
|
TourViz: Interactive Visualization of Connection Pathways in Large Graphs. Demo at, and included in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 1516-1519, ACM, 2012. |
|
mdl4bmf: Minimum Description Length for Boolean Matrix Factorization. Technical Report MPI-I-2012-5-001, Max-Planck-Institut für Informatik, 2012. |
|
Proceedings of the 12th IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 2012. |
|
Proceedings of the ECML PKDD Workshop on Instant Interactive Data Mining (IID). 2012. |
|
2011 | |
Unraveling Tobacco BY-2 Protein Complexes with BN PAGE/LC-MS/MS and Clustering Methods. Journal of Proteomics vol.74(8), pp 1201-1217, Elsevier, 2011. (IF 5.074) |
|
Krimp: Mining Itemsets that Compress. Data Mining and Knowledge Discovery vol.23(1), pp 169-214, Springer, 2011. (IF 2.950) |
|
Maximum Entropy Modelling for Assessing Results on Real-Valued Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp 350-359, IEEE, 2011. (oral presentation, 12.3% acceptance rate; overall 18%) |
|
Data Summarization with Informative Itemsets. In: Proceedings of the 23rd Benelux Conference on Artificial Intelligence (BNAIC), ISSN 1568-7805, 2011. (extended abstract of our KDD'11 paper) |
|
Tell Me What I Need To Know: Succinctly Summarizing Data with Itemsets. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 573-581, ACM, 2011. (Best Student Paper Award; oral presentation, 7.8% acceptance rate; overall 17.5%) |
|
Model Order Selection for Boolean Matrix Factorization. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 51-59, ACM, 2011. (oral presentation, 7.8% acceptance rate; overall 17.5%) |
|
Identifying and Characterising Anomalies in Transaction Data. In: Proceedings of the 23rd Benelux Conference on Artificial Intelligence (BNAIC), ISSN 1568-7805, 2011. (extended abstract of our SDM'11 paper) |
|
The Odd One Out: Identifying and Characterising Anomalies. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 804-815, SIAM, 2011. (25% acceptance rate) |
|
Comparing Apples and Oranges – Measuring Differences between Data Mining Results. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 398-413, Springer, 2011. (invited for extension for best-of special issue, 3% acceptance rate; overall 20%) |
|
When Pattern Met Subspace Cluster - A Relationship Story. In: Proceedings of the 2nd Workshop on Discovering, Summarizing and Using Multiple Clusterings (MultiClust), pp 7-18, 2011. |
|
mime: A Framework for Interactive Visual Pattern Mining. Demo at, and included in: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 634-637, Springer, 2011. |
|
mime: A Framework for Interactive Visual Pattern Mining. Demo at, and included in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 757-760, ACM, 2011. |
|
Comparing Apples and Oranges – Measuring Differences between Data Mining Results. Technical Report UA-CS-2011-03, Universiteit Antwerpen, 2011. |
|
2010 | |
Useful Patterns (UP'10) ACM SIGKDD Workshop Report. ACM SIGKDD Explorations vol.12(2), pp 56-58, ACM Press, 2010. |
|
Summarising Data by Clustering Items. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 321-336, Springer, 2010. (18% acceptance rate) |
|
Proceedings of the ACM SIGKDD Workshop on Useful Patterns (UP). ACM Press, 2010. |
|
2009 | |
Identifying the Components. Data Mining and Knowledge Discovery vol.19(2), pp 176-193, Springer, 2009. (IF 2.950) (ECMLPKDD'09 Special Issue) (Best Student Paper) |
|
Low-Entropy Set Selection. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 569-579, SIAM, 2009. (25% acceptance rate) |
|
Identifying the Components. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 32-32, Springer, 2009. (ECMLPKDD'09 Best Student Paper) |
|
2008 | |
Finding Good Itemsets by Packing Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp 588-597, IEEE, 2008. (9.8% acceptance rate) |
|
Filling in the Blanks – Krimp Minimisation for Missing Data. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp 1067-1072, IEEE, 2008. (19% acceptance rate) |
|
Krimp Minimisation for Missing Data Estimation. Technical Report UU-CS-2008-034, Universiteit Utrecht, 2008. |
|
2007 | |
MDL for Pattern Mining. In: Proceedings of the International Conference on Statistics for Data Mining, Learning and Knowledge Extraction Models (IASC), 2007. |
|
Preserving Privacy through Data Generation. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp 685-690, IEEE, 2007. (19% acceptance rate) |
|
Characterising the Difference. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp 765-774, ACM, 2007. (19% acceptance rate) |
|
2006 | |
Item Sets That Compress. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 393-404, SIAM, 2006. (16% acceptance rate) |
|
Compression Picks the Item Sets that Matter. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 585-592, Springer, 2006. (18% acceptance rate) |
|
Compression Picks the Significant Item Sets. Technical Report UU-CS-2006-050, Universiteit Utrecht, 2006. |
|
2004 | |
Simulation and Optimization of Traffic in a City. In: Proceedings of the IEEE Intelligent Vehicles Symposium (IV), pp 453-458, IEEE, 2004. |
|
On real-world temporal pattern recognition using Liquid State Machines. M.Sc. Thesis, Universiteit Utrecht, 2004. |
|
2003 | |
Exploring Temporal Memory of LSTM and Spiking Circuits. In: Workshop on the Future of Neural Networks (FUNN), 2003. |
|
2002 | |
Spiking neural networks, an introduction. Technical Report UU-CS-2003-008, Universiteit Utrecht, 2002. |
|