Research

Below, I describe some pieces of my past research.

 

Information Criterion
In the era of “big data”, data analysts have the freedom to apply/propose numerous parametric models to various datasets of interest. We propose a general principle that selects the most appropriate model or procedure for a given dataset. Our principle was originally motivated by autoregressive order selection for a linear process. [More]

Change Point Analysis
Sequentially observed data usually exhibit occasional changes in their structure, such as network anomalies in banking systems, distributional changes in teletraffic models, sudden changes of volatility in stock markets, variations of an electroencephalogram (EEG) signal caused by mode changes in the brain, or environmental changes in various ecosystems. In this direction, we study an offline methodology to mathematically quantify and detect “changes” from massive data with dependency structure. [More]

Variable Selection
Variable selection, also known as feature selection, attribute selection, or subset selection, is the process of selecting a subset of relevant variables (features, predictors) for constructing a regression model. We apply the above principle to variable selection, and propose a procedure that identifies a model approaching the smallest possible loss and risk. Our framework includes the usual variable selection in linear regression, and the selection of basis functions such as polynomials, splines, or wavelets in function estimation. We also extended our methodology to high-dimensional settings where there are far fewer observations than variables. [More]

Robust Data Prediction
In time series data prediction, when there exist potential structural changes, a routine procedure is to keep detecting changes while conducting inference. However, classical analysis cannot be easily applied to transient or small changes. We explore the idea of “prediction without (reliable) inference,” and propose new algorithms for predicting time-varying densities from either nonparametric or parametric classes. The key idea is that an insufficient justification of inference may not be a hindrance if an analyst is only interested in prediction. [More]

State Space Models
Sequential inference and diagnosis of nonlinear non-Gaussian state space models (which are analytically intractable) have drawn much attention in fields such as finance and biology. For diagnosis in Bayesian settings, we consider an alternative measure to the commonly used Bayes factors, in order to alleviate the sensitivity to vague priors. Motivated by SMC^2, a Monte Carlo method aiming at simultaneous parameter inference and state filtering, we propose a method to sequentially conduct inference and diagnosis of state space models. [More]

Sequential Nonlinear Modeling of Time Series Data
Classical decision-tree-based nonlinear regression analysis cannot be easily adapted to fast online implementation with provable guarantees. Given observations in the form of a (high-dimensional) vector time series, we designed a fast online algorithm that efficiently explores the underlying functional relationships across different dimensions and time lags. [More]

Multi-Regime Analysis
The author Mark Twain once mused that “History never repeats itself, but it rhymes.” Time series data collected from various domains usually exhibit recurring patterns in addition to change points, e.g., activity data collected from wearable devices, U.S. business cycles, and electroencephalogram signals. Proper exploration and utilization of such recurring patterns can usually offer non-negligible predictive power. [More]

Data Security
We have witnessed a growing number of applications based on wireless sensor networks (WSNs), which has spurred research activity in both academia and industry. Since most of the target WSN applications are very sensitive, security is one of the major challenges in WSN deployment. One of the important building blocks in securing a WSN is key management. Since WSNs are limited in resources (e.g., memory, computation, and energy), we were motivated to study new key management solutions that are more efficient and secure than traditional ones. [More]

Coded Aperture Imaging
Imaging using high-energy radiation with a spectrum ranging from x-ray to γ-ray has found many applications, including high-energy astronomy and medical imaging. At these wavelengths, imaging using lenses is generally not possible since the rays cannot be readily refracted or reflected, and hence cannot be focused. We propose a broad range of combinatorial designs that extend the classical pseudorandom design, and further propose a new architecture for high-energy imaging. [More]

Some other miscellaneous works are summarized here.