The growing dependence on machine learning in real-world applications underscores the importance of understanding and ensuring its safety. Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious consequences. Such attacks embed a trigger within a learning model with the intention of causing malicious behavior when the trigger is present while maintaining regular functionality without it. This paper evaluates the effectiveness of any backdoor attack that incorporates a constant trigger by establishing tight lower and upper bounds on the performance of the compromised model on both clean and backdoor test data. The developed theory answers a series of fundamental but previously unsolved problems, including (1) what factors determine a backdoor attack’s success, (2) what is the most effective backdoor attack, and (3) when will a human-imperceptible trigger succeed. The experimental outcomes corroborate the established theory.
Adversarial learning
Backdoor attack
Statistical analysis
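As a concrete illustration of the setting analyzed above, the sketch below shows how a constant (input-agnostic) trigger is typically stamped onto a fraction of training images, with those samples relabeled to an attacker-chosen target class. The patch value, size, location, and poisoning rate are hypothetical choices for illustration, not quantities specified by the paper.

```python
import numpy as np

def apply_constant_trigger(image, patch_value=1.0, patch_size=3):
    """Stamp a fixed (constant) patch onto the bottom-right corner of an image.

    `image` is assumed to be an (H, W, C) float array in [0, 1]; the patch
    value, size, and location are illustrative choices.
    """
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:, :] = patch_value
    return poisoned

def poison_dataset(images, labels, target_label, poison_rate=0.05, seed=0):
    """Poison a random fraction of the training set with the constant trigger
    and relabel those samples to the attacker-chosen target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = apply_constant_trigger(images[i])
        labels[i] = target_label
    return images, labels
```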
Continual learning (CL) is an emerging research area aiming to emulate human learning throughout a lifetime. Most existing CL approaches primarily focus on mitigating catastrophic forgetting, a phenomenon where performance on old tasks declines while learning new ones. However, human learning involves not only re-learning knowledge but also quickly recognizing the current environment, recalling related knowledge, and refining it for improved performance. In this work, we introduce a new problem setting, Adaptive CL, which captures these aspects in an online, recurring task environment without explicit task boundaries or identities. We propose the LEARN algorithm to efficiently explore, recall, and refine knowledge in such environments. We provide theoretical guarantees from two perspectives: online prediction with tight regret bounds and asymptotic consistency of knowledge. Additionally, we present a scalable implementation that requires only first-order gradients for training deep learning models. Our experiments demonstrate that the LEARN algorithm is highly effective in exploring, recalling, and refining knowledge in adaptive CL environments, resulting in superior performance compared to competing methods.
Continual learning
Online streaming data
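The abstract above describes an explore-recall-refine loop over a pool of knowledge in an online, recurring-task stream without task identities. The following schematic sketch illustrates one plausible reading of that loop; the model pool, the loss-based selection rule, the exploration threshold, and the single-step update are hypothetical placeholders and are not the actual LEARN algorithm.

```python
import copy
import torch

def adaptive_cl_loop(stream, make_model, loss_fn, explore_threshold=2.0, lr=1e-3):
    """Schematic explore-recall-refine loop for an online task stream without
    task identities. `stream` yields (x, y) batches; `make_model` builds a
    fresh model. All hyperparameters are illustrative."""
    pool = [make_model()]                     # knowledge pool of candidate models
    for x, y in stream:
        # Recall: pick the stored model that best explains the current batch.
        with torch.no_grad():
            losses = [loss_fn(m(x), y).item() for m in pool]
        best = min(range(len(pool)), key=lambda i: losses[i])
        # Explore: if nothing in the pool fits well, branch off a new model.
        if losses[best] > explore_threshold:
            pool.append(copy.deepcopy(pool[best]))
            best = len(pool) - 1
        # Refine: a single first-order gradient step on the selected model.
        model = pool[best]
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return pool
```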
Backdoor attacks involve inserting poisoned samples during training, resulting in a model containing a hidden backdoor that can trigger specific behaviors without impacting performance on normal samples. These attacks are challenging to detect, as the backdoored model appears normal until activated by the backdoor trigger, rendering them particularly stealthy. In this study, we devise a unified inference-stage detection framework to defend against backdoor attacks. We first rigorously formulate the inference-stage backdoor detection problem, encompassing various existing methods, and discuss several challenges and limitations. We then propose a framework with provable guarantees on the false positive rate, namely the probability of misclassifying a clean sample. Further, we derive the most powerful detection rule, the one that maximizes the detection power (the rate of accurately identifying backdoor samples) at a given false positive rate, under classical learning scenarios. Based on the theoretically optimal detection rule, we suggest a practical and effective approach for real-world applications based on the latent representations of backdoored deep nets. We extensively evaluate our method on 12 different backdoor attacks using Computer Vision (CV) and Natural Language Processing (NLP) benchmark datasets. The experimental findings align with our theoretical results. Our method significantly surpasses state-of-the-art defenses, achieving, for example, up to a 300% improvement in detection power, as evaluated by AUCROC, against advanced adaptive backdoor attacks.
Backdoor defense
Data poisoning
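To make the false-positive-rate guarantee concrete, the sketch below calibrates a score threshold on held-out clean latent representations so that roughly a target fraction alpha of clean inputs is flagged, and then declares a test input backdoored when its score exceeds that threshold. The Mahalanobis-style score is only one illustrative choice and is not necessarily the detection rule derived in the paper.

```python
import numpy as np

def fit_clean_statistics(clean_latents):
    """Estimate the mean and (regularized) inverse covariance of clean latents."""
    mu = clean_latents.mean(axis=0)
    cov = np.cov(clean_latents, rowvar=False) + 1e-6 * np.eye(clean_latents.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis_score(latents, mu, cov_inv):
    """Distance of each latent vector from the clean distribution."""
    diff = latents - mu
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

def calibrate_threshold(clean_scores, alpha=0.05):
    """Choose the (1 - alpha) empirical quantile of clean scores so that roughly
    an alpha fraction of clean samples is flagged (the false positive rate)."""
    return np.quantile(clean_scores, 1.0 - alpha)

def detect(latents, mu, cov_inv, threshold):
    """Flag inputs whose latent score exceeds the calibrated threshold."""
    return mahalanobis_score(latents, mu, cov_inv) > threshold
```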
Collaborations among various entities, such as companies, research labs, AI agents, and edge devices, have become increasingly crucial for achieving machine learning tasks that cannot be accomplished by a single entity alone, often because of security constraints, privacy concerns, and limited computational resources. As a result, collaborative learning (CL) research has been gaining momentum. However, a significant challenge in practical applications of CL is how to effectively incentivize multiple entities to collaborate before any collaboration occurs. In this study, we propose ICL, a general framework for incentivized collaborative learning, and provide insights into the critical issue of when and why incentives can improve collaboration performance. Furthermore, we show the broad applicability of ICL to specific cases in federated learning, assisted learning, and multi-armed bandits, with both theoretical and experimental results.
Collaborative learning
Incentives
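The abstract does not spell out a particular mechanism, so the toy sketch below only illustrates the basic tension the framework addresses: an entity joins a collaboration when its expected gain, including any incentive payment, exceeds its participation cost. The decision rule and all numbers are hypothetical.

```python
def willing_to_collaborate(expected_gain, participation_cost, incentive=0.0):
    """Toy participation rule: join if the gain plus incentive covers the cost."""
    return expected_gain + incentive > participation_cost

# Example: without an incentive the entity opts out; a small payment flips the decision.
print(willing_to_collaborate(expected_gain=1.0, participation_cost=1.5))                 # False
print(willing_to_collaborate(expected_gain=1.0, participation_cost=1.5, incentive=0.6))  # True
```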
The Whittaker 2d growth model is a continuous-time Markov diffusion process on a triangular array that appears in many scientific contexts. Establishing a large deviation principle for this 2d process under a scaling factor has been a theoretically intriguing problem. The main challenge lies in the spatiotemporal interactions and in dynamics that may depend on potential sample-path intersections. We develop such a principle with a novel rate function. Our approach is based on Schilder's theorem, the contraction principle, and a special treatment of intersecting sample paths.
Large deviation principle
Markov diffusion process
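For readers unfamiliar with the two classical ingredients mentioned above, the display below recalls Schilder's theorem and the contraction principle in their standard forms; the notation is generic and not tied to the paper's specific rate function.

```latex
% Schilder's theorem: for a standard Brownian motion B on [0,1], the scaled paths
% \sqrt{\varepsilon}\,B satisfy a large deviation principle as \varepsilon \to 0
% with good rate function
\[
  I(\phi) \;=\;
  \begin{cases}
    \dfrac{1}{2}\displaystyle\int_0^1 |\dot{\phi}(t)|^2 \, dt,
      & \phi \text{ absolutely continuous},\ \phi(0)=0,\\[4pt]
    +\infty, & \text{otherwise}.
  \end{cases}
\]
% Contraction principle: if (X_\varepsilon) satisfies an LDP with rate function I
% and F is a continuous map, then (F(X_\varepsilon)) satisfies an LDP with
\[
  J(y) \;=\; \inf\{\, I(x) : F(x) = y \,\}.
\]
```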
This paper proves that a class of scaled Whittaker 2d growth models converges in distribution to the Dyson Brownian motion. A Whittaker 2d growth model is a continuous-time Markov diffusion process embedded on a spatial triangular array. Our result is interesting because each particle in a Whittaker 2d growth model interacts only with its neighboring particles. In contrast, each particle in the Dyson Brownian motion interacts with all the other particles. We provide two different proofs of the main result.
Stochastic differential equations
Dyson Brownian motion
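To make the contrast between local and global interactions concrete, the display below recalls the standard system of stochastic differential equations defining the Dyson Brownian motion (written here for the beta = 2 case): each coordinate is driven by its own Brownian motion but feels a repulsion from every other coordinate.

```latex
% Dyson Brownian motion (\beta = 2): for independent standard Brownian motions
% B_1, \dots, B_n, the ordered particles \lambda_1 < \cdots < \lambda_n evolve as
\[
  d\lambda_i(t) \;=\; dB_i(t) \;+\; \sum_{j \neq i} \frac{dt}{\lambda_i(t) - \lambda_j(t)},
  \qquad i = 1, \dots, n .
\]
```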
We introduce a new criterion for variable selection in regression models and show its optimality in terms of both loss and risk under appropriate assumptions. The key idea is to impose a penalty that is nonlinear in the model dimension. In contrast to state-of-the-art model selection criteria such as the Cp method, delete-1 or delete-k cross-validation, the Akaike information criterion, and the Bayesian information criterion, the proposed method achieves asymptotic loss and risk efficiency in both parametric and nonparametric regression settings, offering new insights into the reconciliation of two types of classical criteria with different asymptotic behaviors. The adaptivity and wide applicability of the new criterion are demonstrated by several numerical experiments. Unless the signal-to-noise ratio is very low, it performs better than some popular methods in our experimental study. An R package, ‘bc’, is released as a supplement to this work.
Regression
Subset selection
Feature selection
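The abstract contrasts the proposed criterion with classical penalties that are linear in the model dimension; the display below recalls those classical forms for a model with k parameters, n observations, and estimated noise variance. The exact nonlinear penalty of the new criterion is not given in the abstract, so only a generic form is shown.

```latex
% Classical criteria penalize the fitted log-likelihood (or residual sum of squares)
% by a term that is linear in the model dimension k:
\[
  \mathrm{AIC}: \ -2\log\hat{L} + 2k, \qquad
  \mathrm{BIC}: \ -2\log\hat{L} + k\log n, \qquad
  C_p: \ \frac{\mathrm{RSS}_k}{\hat{\sigma}^2} - n + 2k .
\]
% The proposed criterion instead uses a penalty that is nonlinear in k,
\[
  \frac{\mathrm{RSS}_k}{n} + \lambda_n(k), \qquad \lambda_n(\cdot)\ \text{nonlinear in } k,
\]
% where the specific form of \lambda_n(k) is developed in the paper.
```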