Understanding Backdoor Attacks through the Adaptability Hypothesis

AI Safety, Conference paper
Xun Xian, Ganghua Wang, Jayanth Srinivasa, Ashish Kundu, Xuan Bi, Mingyi Hong, Jie Ding
International Conference on Machine Learning (ICML)
Publication year: 2023

Abstract:

A poisoning backdoor attack is a rising security concern for deep learning. This type of attack can result in the backdoored model functioning normally most of the time but exhibiting abnormal behavior when presented with inputs containing the backdoor trigger, making it difficult to detect and prevent. In this work, we propose the adaptability hypothesis to understand when and why a backdoor attack works for general learning models, including deep neural networks, based on the theoretical investigation of classical kernel-based learning models. The adaptability hypothesis postulates that for an effective attack, the effect of incorporating a new dataset on the predictions of the original data points will be small, provided that the original data points are distant from the new dataset. Experiments on benchmark image datasets and state-of-the-art backdoor attacks for deep neural networks are conducted to corroborate the hypothesis. Our finding provides insight into the factors that affect the attack’s effectiveness and has implications for the design of future attacks and defenses.
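
As a rough illustration of the hypothesis, the sketch below (assuming a kernel ridge regression model with an RBF kernel and synthetic one-dimensional data, none of which are taken from the paper) shows that appending a small set of points far from the clean data barely shifts the predictions at the clean points, whereas appending the same points nearby shifts them substantially.

    # Minimal sketch of the adaptability hypothesis using kernel ridge regression.
    # Assumptions (not from the paper): RBF kernel, synthetic 1-D data, sklearn API.
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    rng = np.random.default_rng(0)
    X_clean = rng.uniform(-1, 1, size=(50, 1))
    y_clean = np.sin(3 * X_clean[:, 0]) + 0.1 * rng.normal(size=50)

    def prediction_shift(x_new, y_new):
        """Change in predictions at the clean points after adding (x_new, y_new)."""
        base = KernelRidge(kernel="rbf", gamma=2.0, alpha=0.1).fit(X_clean, y_clean)
        X_aug = np.vstack([X_clean, x_new])
        y_aug = np.concatenate([y_clean, y_new])
        poisoned = KernelRidge(kernel="rbf", gamma=2.0, alpha=0.1).fit(X_aug, y_aug)
        return np.max(np.abs(poisoned.predict(X_clean) - base.predict(X_clean)))

    # Backdoor points far from the clean data: predictions on clean data barely move.
    far = prediction_shift(np.full((5, 1), 10.0), np.full(5, 5.0))
    # The same points placed inside the clean data's range: predictions move a lot.
    near = prediction_shift(np.full((5, 1), 0.0), np.full(5, 5.0))
    print(f"max shift, distant trigger: {far:.4f}; nearby trigger: {near:.4f}")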

Keywords:

Adversarial deep learning

Backdoor attack

Data poisoning

Personalized Federated Recommender Systems with Private and Partially Federated AutoEncoders

AI Safety, Conference paper
Qi Le, Enmao Diao, Xinran Wang, Ali Anwar, Vahid Tarokh, Jie Ding
Asilomar Conference on Signals, Systems, and Computers (Asilomar)
Publication year: 2023

Abstract:

Recommender Systems (RSs) have become increasingly important in many application domains, such as digital marketing. Conventional RSs often need to collect users’ data, centralize them on the server side, and form a global model to generate reliable recommendations. However, they suffer from two critical limitations: the personalization problem, in that conventionally trained RSs may not be customized for individual users, and the privacy problem, in that directly sharing user data is discouraged. We propose Personalized Federated Recommender Systems (PersonalFR), which combines a personalized autoencoder-based recommendation model with Federated Learning (FL) to address these challenges. PersonalFR guarantees that each user can learn a personal model from the local dataset and other participating users’ data without sharing local data, data embeddings, or models. PersonalFR consists of three main components: AutoEncoder-based RSs (ARSs) that learn the user-item interactions, Partially Federated Learning (PFL) that updates the encoder locally and aggregates the decoder on the server side, and Partial Compression (PC) that computes and transmits only the active model parameters. Extensive experiments on two real-world datasets demonstrate that PersonalFR can achieve private and personalized performance comparable to that of a model trained by centralizing all users’ data. Moreover, PersonalFR requires significantly less computation and communication overhead than standard FL baselines.
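
A minimal sketch of the Partially Federated Learning step described above, assuming hypothetical PyTorch encoder/decoder modules and sizes: each client keeps its encoder private, and only the decoder parameters are averaged on the server side.

    # Minimal sketch of Partially Federated Learning (PFL): encoders stay local,
    # only decoder parameters are averaged on the server. Module names and sizes
    # are hypothetical; this is not the paper's reference implementation.
    import copy
    import torch
    import torch.nn as nn

    class ClientAE(nn.Module):
        def __init__(self, n_items=100, dim=16):
            super().__init__()
            self.encoder = nn.Linear(n_items, dim)   # personalized, never shared
            self.decoder = nn.Linear(dim, n_items)   # shared via federated averaging

        def forward(self, x):
            return self.decoder(torch.relu(self.encoder(x)))

    def aggregate_decoders(clients):
        """Average decoder parameters across clients and broadcast the result."""
        avg = copy.deepcopy(clients[0].decoder.state_dict())
        for name in avg:
            avg[name] = torch.stack([c.decoder.state_dict()[name] for c in clients]).mean(0)
        for c in clients:
            c.decoder.load_state_dict(avg)

    clients = [ClientAE() for _ in range(3)]
    # ... local training on each client's user-item interactions would go here ...
    aggregate_decoders(clients)  # server-side step: only decoders are exchanged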

Keywords:

Recommender systems

Data and model privacy

Model Privacy: A Unified Framework to Understand Model Stealing Attack and Defense

AI Safety, Manuscript
Ganghua Wang, Yuhong Yang, Jie Ding
Manuscript under review
Publication year: 2023

Abstract:

The security of machine learning models against adversarial attacks has become an increasingly important problem in modern application scenarios such as machine-learning-as-a-service and collaborative learning. The model stealing attack is a particular threat that aims to reverse-engineer a general learned model (e.g., a server-based API, an information exchange protocol, an on-chip AI architecture) from only a tiny number of query-response interactions. Consequently, the attack may steal a proprietary model in a manner that is much more cost-effective than the model owner’s original training. Many model stealing attack and defense strategies have been proposed with good empirical success. However, most existing works are heuristic, limited in evaluation metrics, and imprecise in characterizing loss and gain. This work presents a unified conceptual framework called Model Privacy for understanding and quantifying model stealing attacks and defenses. Model privacy encapsulates the foundational tradeoffs between the usability and the vulnerability of a learned model’s functionality. Based on the developed concepts, we establish fundamental limits on privacy-utility tradeoffs and their implications in various machine learning problems (e.g., those based on linear functions, polynomials, reproducing kernels, and neural networks). The new problems studied here are also interesting from a theoretical perspective, as a model owner may maneuver multiple query responses jointly to maximally enhance model privacy, violating the data independence assumption that plays a critical role in classical learning theory. For example, we show that by breaking independence, a model owner can simultaneously attain a slight utility loss and a much larger privacy gain, a desirable property not achievable in independent data regimes.
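
The following sketch illustrates the kind of privacy-utility tradeoff the framework formalizes, assuming a linear owner model, a least-squares attacker, and independent Gaussian response noise as the defense (all illustrative choices, not the manuscript's constructions):

    # Sketch of the utility/vulnerability tradeoff for a defended linear model.
    # The owner model, attacker, and Gaussian-noise defense are illustrative
    # assumptions, not the constructions analyzed in the manuscript.
    import numpy as np

    rng = np.random.default_rng(1)
    d, n_queries = 5, 200
    w_true = rng.normal(size=d)                       # proprietary model parameters

    X = rng.normal(size=(n_queries, d))               # attacker's queries
    for noise_scale in [0.0, 0.5, 2.0]:
        responses = X @ w_true + noise_scale * rng.normal(size=n_queries)
        w_stolen, *_ = np.linalg.lstsq(X, responses, rcond=None)
        utility_loss = noise_scale                    # benign user sees noisier answers
        privacy_gain = np.linalg.norm(w_stolen - w_true)  # attacker's estimation error
        print(f"noise {noise_scale}: utility loss ~{utility_loss:.2f}, "
              f"attacker error {privacy_gain:.3f}")

The point about breaking independence is that jointly maneuvered responses can improve on this independent-noise baseline, attaining a much larger attacker error for a comparable utility loss.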

Keywords:

Model-stealing attack and defense
Privacy

Demystifying Poisoning Backdoor Attacks from a Statistical Perspective

AI Safety, Manuscript
Xun Xian, Ganghua Wang, Jayanth Srinivasa, Ashish Kundu, Xuan Bi, Mingyi Hong, Jie Ding
Manuscript under review
Publication year: 2023

Abstract:

The growing dependence on machine learning in real-world applications emphasizes the importance of understanding and ensuring its safety. Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious consequences. Such attacks involve embedding triggers within a learning model with the intention of causing malicious behavior when an active trigger is present while maintaining regular functionality without it. This paper evaluates the effectiveness of any backdoor attack that incorporates a constant trigger, by establishing tight lower and upper bounds on the performance of the compromised model on both clean and backdoor test data. The developed theory answers a series of fundamental but previously unsolved problems, including (1) what the determining factors for a backdoor attack’s success are, (2) what the most effective backdoor attack is, and (3) when a human-imperceptible trigger will succeed. The experimental outcomes corroborate the established theory.
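
For concreteness, a minimal sketch of a constant-trigger poisoning step is given below; the patch location, size, value, and target label are illustrative assumptions rather than choices made in the paper.

    # Sketch of poisoning with a constant trigger: a fixed patch is stamped onto
    # a fraction of training images and their labels are flipped to a target class.
    # Patch location, size, and target label are illustrative assumptions.
    import numpy as np

    def poison(images, labels, rate=0.05, target_label=0, patch_value=1.0, seed=0):
        """Return a copy of (images, labels) with a constant 3x3 corner trigger."""
        images, labels = images.copy(), labels.copy()
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
        images[idx, -3:, -3:] = patch_value      # constant trigger in the corner
        labels[idx] = target_label               # malicious target behavior
        return images, labels, idx

    # Example with random stand-in data shaped like 28x28 grayscale images.
    X = np.random.rand(1000, 28, 28).astype(np.float32)
    y = np.random.randint(0, 10, size=1000)
    X_poisoned, y_poisoned, poisoned_idx = poison(X, y)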

Keywords:

Adversarial learning

Backdoor attack

Statistical analysis

A Unified Framework for Inference-Stage Backdoor Defenses

AI Safety, Manuscript
Xun Xian, Ganghua Wang, Jayanth Srinivasa, Ashish Kundu, Xuan Bi, Mingyi Hong, Jie Ding
Manuscript under review
Publication year: 2023

Abstract:

Backdoor attacks involve inserting poisoned samples during training, resulting in a model containing a hidden backdoor that can trigger specific behaviors without impacting performance on normal samples. These attacks are challenging to detect, as the backdoored model appears normal until activated by the backdoor trigger, rendering them particularly stealthy. In this study, we devise a unified inference-stage detection framework to defend against backdoor attacks. We first rigorously formulate the inference-stage backdoor detection problem, encompassing various existing methods, and discuss several challenges and limitations. We then propose a framework with provable guarantees on the false positive rate, i.e., the probability of misclassifying a clean sample. Further, we derive the most powerful detection rule, the one that maximizes the detection power (the rate of accurately identifying a backdoor sample) at a given false positive rate, under classical learning scenarios. Based on the theoretically optimal detection rule, we suggest a practical and effective approach for real-world applications based on the latent representations of backdoored deep nets. We extensively evaluate our method on 12 different backdoor attacks using Computer Vision (CV) and Natural Language Processing (NLP) benchmark datasets. The experimental findings align with our theoretical results. Our method significantly surpasses state-of-the-art defenses, e.g., with up to a 300% improvement in detection power, as evaluated by AUCROC, against advanced adaptive backdoor attacks.
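
A minimal sketch of an inference-stage detector in this spirit: a score computed on latent representations is compared against a threshold calibrated on held-out clean samples so that the false positive rate is controlled. The Mahalanobis-style score and the calibration split are illustrative assumptions, not the paper's optimal rule.

    # Sketch of inference-stage backdoor detection with a controlled false positive
    # rate: score each input's latent representation against clean statistics and
    # flag it if the score exceeds a quantile calibrated on clean validation data.
    # The score function itself is an illustrative assumption.
    import numpy as np

    def calibrate_threshold(clean_scores, target_fpr=0.05):
        """Threshold so that ~target_fpr of clean samples are (wrongly) flagged."""
        return np.quantile(clean_scores, 1.0 - target_fpr)

    def detection_scores(latents, clean_mean, clean_cov_inv):
        """Mahalanobis-style distance of each latent vector to the clean center."""
        diff = latents - clean_mean
        return np.einsum("ij,jk,ik->i", diff, clean_cov_inv, diff)

    # Stand-in latent representations (e.g., penultimate-layer features).
    rng = np.random.default_rng(0)
    clean_val = rng.normal(size=(500, 32))
    incoming = rng.normal(size=(10, 32))

    mean = clean_val.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(clean_val, rowvar=False) + 1e-6 * np.eye(32))
    tau = calibrate_threshold(detection_scores(clean_val, mean, cov_inv), target_fpr=0.05)
    flagged = detection_scores(incoming, mean, cov_inv) > tau   # True = likely backdoor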

Keywords:

Backdoor defense

Data poisoning

Understanding Model Extraction Games

AI Safety, Conference paper
Xun Xian, Mingyi Hong, Jie Ding
2022 IEEE 4th International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications (TPS-ISA)
Publication year: 2022

Abstract:

The privacy of machine learning models has become a significant concern in many emerging Machine-Learning-as-a-Service applications, where prediction services based on well-trained models are offered to users via the pay-per-query scheme. However, the lack of a defense mechanism can impose a high risk on the privacy of the server’s model, since an adversary could efficiently steal the model by querying only a few ‘good’ data points. The game between a server’s defense and an adversary’s attack inevitably leads to an arms race dilemma, as commonly seen in Adversarial Machine Learning. To study the fundamental tradeoffs between model utility from a benign user’s view and privacy from an adversary’s view, we develop new metrics to quantify such tradeoffs, analyze their theoretical properties, and develop an optimization problem to understand the optimal adversarial attack and defense strategies. The developed concepts and theory match the empirical findings on the ‘equilibrium’ between privacy and utility. In terms of optimization, the key ingredient that enables our results is a unified representation of the attack-defense problem as a min-max bi-level problem. The developed results are demonstrated by examples and empirical experiments.
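
The "few good queries" point can be made concrete with a toy example: if the served model is linear in d features (an illustrative assumption), an adversary recovers it exactly from d + 1 well-chosen queries.

    # Sketch of model extraction against an undefended pay-per-query API.
    # If the served model is linear in d features (an illustrative assumption),
    # d + 1 well-chosen queries recover it exactly.
    import numpy as np

    d = 4
    w_secret, b_secret = np.array([0.5, -1.2, 2.0, 0.3]), 0.7

    def api(x):
        """Server-side prediction service (unknown to the adversary)."""
        return x @ w_secret + b_secret

    queries = np.vstack([np.zeros(d), np.eye(d)])     # origin plus unit vectors
    answers = np.array([api(q) for q in queries])

    b_stolen = answers[0]
    w_stolen = answers[1:] - b_stolen                 # f(e_i) - f(0) = w_i
    assert np.allclose(w_stolen, w_secret) and np.isclose(b_stolen, b_secret)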

Regression with Set-Valued Categorical Predictors

AI Safety, Journal paper
Ganghua Wang, Jie Ding, Yuhong Yang
Statistica Sinica
Publication year: 2022

Abstract:

We address the regression problem with a new form of data that arises from data privacy applications. Instead of point values, the observed explanatory variables are subsets containing each individual’s original value. Classical regression analyses such as least squares are not applicable, since the set-valued predictors carry only partial information about the original values. We propose a computationally efficient subset least squares method to perform regression on such data. We establish upper bounds on the prediction loss and risk in terms of the subset structure, the model structure, and the data dimension. The error rates are shown to be optimal under some common situations. Furthermore, we develop a model selection method to identify the most appropriate model for prediction. Experimental results on both simulated and real-world datasets demonstrate the promising performance of the proposed method.
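
The data format can be illustrated as follows; the averaged-indicator encoding and ordinary least squares fit below are a naive baseline for exposition only, not the paper's subset least squares estimator.

    # Illustration of the set-valued data format: each categorical predictor is
    # observed only as a subset containing the true level. The averaged-indicator
    # encoding below is a naive baseline for exposition, not the paper's subset
    # least squares estimator.
    import numpy as np

    levels = ["A", "B", "C", "D"]

    def encode_subset(subset):
        """Average of the one-hot indicators of the levels in the reported subset."""
        v = np.array([1.0 if lv in subset else 0.0 for lv in levels])
        return v / v.sum()

    # Each respondent reports a subset that contains the true level (coarsened for privacy).
    reported = [{"A", "B"}, {"C"}, {"B", "C", "D"}]
    y = np.array([1.3, 0.2, 0.8])

    X = np.vstack([encode_subset(s) for s in reported])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)      # baseline fit on encoded subsets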

Keywords:

Model selection

Regression

Set-valued data

Mismatched Supervised Learning

AI Safety, Conference paper
Xun Xian, Mingyi Hong, Jie Ding
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publication year: 2022

Abstract:

Supervised learning scenarios where labels and features are possibly mismatched have been an emerging concern in machine learning applications. For example, in socioeconomic studies, researchers often need to align heterogeneous data from multiple sources to the same entities without a unique identifier. Such a mismatch problem can significantly affect learning performance if it is not appropriately addressed. Due to the combinatorial nature of the mismatch problem, existing methods are often designed for small datasets and simple linear models and are not scalable to large-scale datasets and complex models. In this paper, we first present a new formulation of the mismatch problem that admits continuous optimization and allows for gradient-based methods. Moreover, we develop a computation- and memory-efficient method to handle complex data and models. Empirical studies on synthetic and real-world data show significantly better performance of the proposed algorithms than state-of-the-art methods.
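
One way to picture a continuous, gradient-friendly formulation is to relax the unknown label-to-feature assignment into a row-stochastic matrix optimized jointly with the model; the softmax relaxation below is an illustrative assumption, not necessarily the paper's formulation.

    # Sketch of a continuous relaxation of label-feature mismatch: learn a row-
    # stochastic (softmax-relaxed) assignment of labels to features jointly with
    # a linear model by gradient descent. The relaxation is an illustrative
    # assumption, not necessarily the paper's exact formulation.
    import torch

    torch.manual_seed(0)
    n, d = 30, 3
    X = torch.randn(n, d)
    w_true = torch.randn(d)
    perm = torch.randperm(n)
    y_shuffled = (X @ w_true)[perm]                  # labels observed in the wrong order

    w = torch.zeros(d, requires_grad=True)
    logits = torch.zeros(n, n, requires_grad=True)   # relaxed assignment parameters
    opt = torch.optim.Adam([w, logits], lr=0.05)

    for _ in range(500):
        opt.zero_grad()
        P = torch.softmax(logits, dim=1)             # row-stochastic soft assignment
        loss = ((P @ y_shuffled - X @ w) ** 2).mean()
        loss.backward()
        opt.step()
    # After training, w approximates w_true up to the quality of the learned assignment.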

Keywords:

Label mismatch

Supervised learning

Interval Privacy: A Framework for Privacy-Preserving Data Collection

AI Foundations, AI Safety, Journal paper
Jie Ding, Bangjun Ding
IEEE Transactions on Signal Processing
Publication year: 2022

Abstract:

The emerging public awareness and government regulations of data privacy motivate new paradigms of collecting and analyzing data that are transparent and acceptable to data owners. We present a new concept of privacy and corresponding data formats, mechanisms, and theories for privatizing data during data collection. The proposed notion, named Interval Privacy, enforces the conditional distribution of the raw data given the privatized data to be the same as its unconditional distribution over a nontrivial support set. Correspondingly, the proposed privacy mechanism records each data value as a random interval (or, more generally, a range) containing it. The proposed interval privacy mechanisms can be easily deployed through survey-based data collection interfaces, e.g., by asking a respondent whether its data value is within a randomly generated range. Another unique feature of interval mechanisms is that they obfuscate the truth but do not perturb it. Using a narrowed range to convey information is complementary to the popular paradigm of perturbing data. Also, the interval mechanisms can generate progressively refined information at the discretion of individuals, naturally leading to privacy-adaptive data collection. We develop different aspects of the theory, such as composition, robustness, distribution estimation, and regression learning from interval-valued data. Interval privacy provides a new perspective on human-centric data privacy in which individuals have a perceptible, transparent, and simple way of sharing sensitive data.
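
A minimal sketch of an interval mechanism of this kind: the respondent is shown a randomly generated threshold and only reveals on which side of it the private value lies, so the report is an interval containing, but not perturbing, the truth. The uniform threshold design is an illustrative choice.

    # Sketch of an interval mechanism: a respondent with a private value in
    # [low, high] is shown a random threshold and only answers whether the value
    # is below or above it; the report is an interval containing the true value.
    # The uniform threshold design is an illustrative choice.
    import random

    def interval_report(true_value, low, high, rng=random):
        """Return a randomly generated interval that contains true_value."""
        threshold = rng.uniform(low, high)
        if true_value <= threshold:
            return (low, threshold)      # "my value is at most the threshold"
        return (threshold, high)         # "my value is above the threshold"

    random.seed(0)
    reports = [interval_report(income, 0, 200_000) for income in (35_000, 90_000, 150_000)]
    for (a, b) in reports:
        print(f"reported interval: [{a:,.0f}, {b:,.0f}]")   # truth obfuscated, not perturbed

Splitting a reported interval again with follow-up questions would progressively refine the information, matching the privacy-adaptive collection described above.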

Keywords:

Data collection
Data privacy
Interval mechanism
Local privacy

GAL: Gradient Assisted Learning for Decentralized Multi-Organization Collaborations

AI Safety, AI Scalability, Conference paper, Decentralized AI
Enmao Diao, Jie Ding, Vahid Tarokh
36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Publication year: 2022

Abstract:

Collaborations among multiple organizations, such as financial institutions, medical centers, and retail markets, in decentralized settings are crucial to providing improved service and performance. However, the underlying organizations may have little interest in sharing their local data, models, and objective functions. These requirements have created new challenges for multi-organization collaboration. In this work, we propose Gradient Assisted Learning (GAL), a new method for multiple organizations to assist each other in supervised learning tasks without sharing local data, models, and objective functions. In this framework, all participants collaboratively optimize the aggregate of local loss functions, and each participant autonomously builds its own model by iteratively fitting the gradients of the overarching objective function. We also provide asymptotic convergence analysis and practical case studies of GAL. Experimental studies demonstrate that GAL can achieve performance close to centralized learning when all data, models, and objective functions are fully disclosed.
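
A sketch of the gradient-fitting loop in the spirit described above, for a squared loss: in each assistance round, every organization fits a local learner to the pseudo-residuals (the negative gradient of the shared loss with respect to the current aggregate prediction) computed from its own features, and local data and models never leave the organizations. The sklearn learners, loss, and step size are illustrative assumptions.

    # Sketch of gradient assisted learning with a squared loss: in each round every
    # organization fits a local learner to the pseudo-residuals (negative gradient
    # of the shared loss w.r.t. the current aggregate prediction) computed on its
    # own features. Learners, loss, and step size are illustrative assumptions.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    n = 300
    # Each organization holds different features describing the same entities.
    X_org = [rng.normal(size=(n, 2)) for _ in range(3)]
    y = X_org[0][:, 0] + 0.5 * X_org[1][:, 1] ** 2 + rng.normal(scale=0.1, size=n)

    prediction = np.zeros(n)
    learned = []                                 # each org keeps its own models locally
    for _ in range(20):                          # assistance rounds
        residual = y - prediction                # negative gradient of squared loss
        round_models = [DecisionTreeRegressor(max_depth=3).fit(X_k, residual)
                        for X_k in X_org]
        # Aggregate the organizations' gradient fits with a small step size.
        prediction += 0.3 * np.mean(
            [m.predict(X_k) for m, X_k in zip(round_models, X_org)], axis=0)
        learned.append(round_models)

    print("training MSE:", np.mean((y - prediction) ** 2))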

Keywords:

Assisted learning

Privacy

Information Laundering for Model Privacy

AI Safety, Conference paper
Xinran Wang, Yu Xiang, Jun Gao, Jie Ding
International Conference on Learning Representations (ICLR), spotlight
Publication year: 2021

Abstract:

In this work, we propose information laundering, a novel framework for enhancing model privacy. Unlike data privacy, which concerns the protection of raw data, model privacy aims to protect an already-learned model that is to be deployed for public use. The private model can be obtained from general learning methods, and its deployment means that it will return a deterministic or random response for a given input query. An information-laundered model consists of probabilistic components that deliberately maneuver the intended input and output for queries to the model, so that the model’s adversarial acquisition is less likely. Under the proposed framework, we develop an information-theoretic principle to quantify the fundamental tradeoffs between model utility and privacy leakage and derive the optimal design.
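
A small sketch of the idea of wrapping a deployed model with probabilistic input and output components: queries and responses pass through randomization kernels, so the released input-output relation differs from the private model's. The Gaussian input kernel and Dirichlet output kernel below are illustrative choices; the paper derives the optimal kernels information-theoretically.

    # Sketch of an information-laundered classifier: the deployed API applies a
    # randomization kernel to the incoming query and another to the model's output
    # probabilities before releasing a response. The Gaussian/Dirichlet kernels
    # are illustrative, not the optimal design derived in the paper.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 3))                        # private softmax classifier

    def private_model(x):
        z = x @ W
        e = np.exp(z - z.max())
        return e / e.sum()

    def laundered_api(x, input_scale=0.1, output_conc=50.0):
        x_noisy = x + rng.normal(scale=input_scale, size=x.shape)   # input kernel
        p = private_model(x_noisy)
        return rng.dirichlet(output_conc * p)                       # output kernel

    query = rng.normal(size=4)
    print("private :", np.round(private_model(query), 3))
    print("released:", np.round(laundered_api(query), 3))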

Keywords:

Information theory
Model privacy
Optimal privacy-utility tradeoff