Sequentially observed data often exhibit occasional changes in their structure, such as network anomalies in banking systems, distributional changes in teletraffic models, sudden changes of volatility in stock markets, variations in an electroencephalogram (EEG) signal caused by mode changes in the brain, or environmental changes in various ecosystems.

How can we mathematically quantify and evaluate “changes” that have occurred in the past?

Change point analysis aims to identify both the number and the locations of change points. One of the main challenges is to model dependent data appropriately and to identify potential structural changes in a computationally tractable way. We tackle these challenges by modeling the data-generating process as a segment-wise autoregression. Because classical search-based change detection algorithms are often computationally cumbersome for models with a dependence structure, we propose a multi-window method that is both effective and efficient for discovering structural changes. The approach is motivated by transforming a segment-wise autoregression into a multivariate time series that is asymptotically segment-wise independent and identically distributed.
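To make the windowing idea concrete, the sketch below (not the paper's implementation) simulates a segment-wise AR(1) series and stacks non-overlapping windows into a multivariate series; the AR(1) model, the window length w, and the function names are illustrative assumptions.

import numpy as np

def segmentwise_ar1(phis, seg_lengths, noise_std=1.0, seed=0):
    """Simulate a univariate series whose AR(1) coefficient changes across segments."""
    rng = np.random.default_rng(seed)
    x, prev = [], 0.0
    for phi, n in zip(phis, seg_lengths):
        for _ in range(n):
            prev = phi * prev + noise_std * rng.standard_normal()
            x.append(prev)
    return np.asarray(x)

def to_windowed_series(x, w):
    """Stack non-overlapping length-w windows as rows of a multivariate series.

    When w is large relative to the autoregressive memory, the rows within a
    segment are approximately independent and identically distributed, so
    change point methods for (nearly) i.i.d. multivariate data can be applied.
    """
    n = (len(x) // w) * w
    return x[:n].reshape(-1, w)

# Example: the AR coefficient changes at time 500; use windows of length 20.
x = segmentwise_ar1(phis=[0.2, 0.8], seg_lengths=[500, 500])
Y = to_windowed_series(x, w=20)  # shape (50, 20); the change falls near row 25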

We also derive theoretical guarantees for selecting, almost surely, the true number of change points of a segment-wise multivariate time series. Specifically, under mild assumptions, we show that a Bayesian Information Criterion (BIC)-like criterion yields a strongly consistent selection of the number of change points.
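For intuition (the exact criterion and penalty in the paper may differ), a BIC-like selection rule can be written as

\hat{m} = \arg\min_{0 \le m \le M} \left\{ -2 \log \hat{L}_m + c \, m \log n \right\},

where \hat{L}_m is the maximized segment-wise likelihood with m change points, n is the sample size, M is an upper bound on the number of change points, and c > 0 is a constant. Strong consistency means that \hat{m} equals the true number of change points almost surely as n \to \infty.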

The proposed methodology is demonstrated by experiments on both synthetic and real-world data, including Eastern US temperature data and El Niño data from 1854 to 2015. The experiments lead to some interesting discoveries about the temporal variability of summer-time temperature over the Eastern US, and about the dominant factor of ocean influence on climate.


Jie Ding, Yu Xiang, Lu Shen, and Vahid Tarokh, “Multiple Change Point Analysis: Fast Implementation and Strong Consistency.” pdf