Linear-time Detection of Non-linear Changes
in Massively High Dimensional Time Series

Abstract. Change detection on multivariate time series has applications in various areas, e.g., health care and network monitoring. A common approach is to compare the divergence between the data distributions of a reference window and a test window. When the number of dimensions is very large, this approach has both efficiency and quality issues. For instance, the window size needs to be large to ensure robustness, which further increases runtime and causes alarms to be missed. In this paper, we tackle these issues and propose Light. In short, Light scales to very high dimensionality while providing flexibility in choosing the window size to fit the level of detail required. It works in three steps: 1) scalable PCA for dimension reduction, 2) scalable factorization of the joint distribution in the PCA space into lower-dimensional distributions, and 3) scalable divergence computation using these lower-dimensional distributions. Experiments on both synthetic and real-world data show that Light outperforms the state of the art, providing up to 100% improvement in both quality and efficiency.
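
To make the window-comparison idea in the abstract concrete, here is a minimal Java sketch of the generic approach: score a test window against a reference window with a divergence between their empirical distributions. It is not Light and not the released source code. It skips the PCA and factorization steps entirely and, as a stand-in divergence, uses the two-sample Kolmogorov-Smirnov statistic per dimension, taking the maximum over dimensions. The class name WindowChangeSketch and the methods ksStatistic, column, and score are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only: NOT the Light algorithm from the paper.
// Assumption: each window is a double[n][d] array (n time points, d dimensions).
public class WindowChangeSketch {

    // Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    // the empirical CDFs of samples a and b. Used here as a stand-in divergence.
    static double ksStatistic(double[] a, double[] b) {
        double[] x = a.clone();
        double[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        int i = 0, j = 0;
        double maxGap = 0.0;
        while (i < x.length && j < y.length) {
            double v = Math.min(x[i], y[j]);
            // Advance both pointers past every value <= v so ties are handled together.
            while (i < x.length && x[i] <= v) i++;
            while (j < y.length && y[j] <= v) j++;
            double gap = Math.abs((double) i / x.length - (double) j / y.length);
            maxGap = Math.max(maxGap, gap);
        }
        return maxGap;
    }

    // Extracts dimension k of a window as a one-dimensional sample.
    static double[] column(double[][] window, int k) {
        double[] out = new double[window.length];
        for (int t = 0; t < window.length; t++) {
            out[t] = window[t][k];
        }
        return out;
    }

    // Change score in [0, 1]: the largest per-dimension KS statistic
    // between the reference window and the test window.
    static double score(double[][] reference, double[][] test) {
        int dims = reference[0].length;
        double worst = 0.0;
        for (int k = 0; k < dims; k++) {
            worst = Math.max(worst, ksStatistic(column(reference, k), column(test, k)));
        }
        return worst;
    }
}
```

A caller would slide the test window over the stream and raise an alarm whenever score exceeds a chosen threshold. Comparing dimensions independently, as this sketch does, costs time proportional to the number of dimensions and ignores dependencies between them, which is the kind of efficiency and quality limitation the abstract says Light is designed to address.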

Implementation

The Java source code (October 2015) by Hoang Vu Nguyen.

Related Publications

Nguyen, H-V & Vreeken, J. Linear-time Detection of Non-Linear Changes in Massively High Dimensional Time Series. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 828-836, SIAM, 2016. (overall 25% acceptance rate)
Nguyen, H-V & Vreeken, J. Linear-time Detection of Non-Linear Changes in Massively High Dimensional Time Series. Technical Report 1510.08385, arXiv, 2015.