A curated list of Machine Learning Security & Privacy papers published in the top-4 security conferences (IEEE S&P, ACM CCS, USENIX Security, and NDSS).
- Awesome-ML-Security-and-Privacy-Papers
- Contents:
- 1. Security Papers
- 2. Privacy Papers
- Contributing
- Licenses
-
Hybrid Batch Attacks: Finding Black-box Adversarial Examples with Limited Queries. USENIX Security 2020.
Transferability + Query. Black-box Attack[pdf] [code] -
Adversarial Preprocessing: Understanding and Preventing Image-Scaling Attacks in Machine Learning. USENIX Security 2020.
Defense of Image Scaling Attack[pdf] [code] -
HopSkipJumpAttack: A Query-Efficient Decision-Based Attack. IEEE S&P 2020.
Query-based Black-box Attack[pdf] [code] -
PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking. USENIX Security 2021.
Adversarial Patch Defense[pdf] [code] -
Gotta Catch'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks. ACM CCS 2020.
Build a trap in the model to induce specific adversarial perturbations[pdf] [code] -
A Tale of Evil Twins: Adversarial Inputs versus Poisoned Models. ACM CCS 2020.
Perturb both input and model[pdf] [code] -
Feature-Indistinguishable Attack to Circumvent Trapdoor-Enabled Defense. ACM CCS 2021.
A new attack method can break TeD defense mechanism[pdf] [code] -
DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks. ACM CCS 2021.
Provable robustness for patch hiding in object detection[pdf] [code] -
It's Not What It Looks Like: Manipulating Perceptual Hashing based Applications. ACM CCS 2021.
Adversarial Attack against PHash[pdf] [code] -
RamBoAttack: A Robust and Query Efficient Deep Neural Network Decision Exploit. NDSS 2022.
Query-based black box attack[pdf] [code] -
What You See is Not What the Network Infers: Detecting Adversarial Examples Based on Semantic Contradiction. NDSS 2022.
Generative-based AE detection[pdf] [code] -
AutoDA: Automated Decision-based Iterative Adversarial Attacks. USENIX Security 2022.
Program Synthesis for Adversarial Attack[pdf] -
Blacklight: Scalable Defense for Neural Networks against Query-Based Black-Box Attacks. USENIX Security 2022.
AE Detection using probabilistic fingerprints based on hash of input similarity[pdf] [code] -
Physical Hijacking Attacks against Object Trackers. ACM CCS 2022.
Adversarial Attacks on Object Trackers[pdf] [code] -
Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models. ACM CCS 2022.
Protecting leaked DNN models against white-box adversarial examples[pdf] -
Squint Hard Enough: Attacking Perceptual Hashing with Adversarial Machine Learning. USENIX Security 2023.
Adversarial Attacks against PhotoDNA and PDQ[pdf] -
The Space of Adversarial Strategies. USENIX Security 2023.
Decompose the Adversarial Attack Components and combine them together[pdf] -
Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks. ACM CCS 2023.
Attack strategy to enhance the query-based attack against the stateful defense[pdf] [code] -
BounceAttack: A Query-Efficient Decision-based Adversarial Attack by Bouncing into the Wild. IEEE S&P 2024.
Query-based hard-label attack (see the sketch below)[pdf] -
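Query-based black-box attacks such as HopSkipJumpAttack, RamBoAttack, and BounceAttack above craft adversarial examples from query feedback alone. As a hedged illustration, here is a minimal SimBA-style *score-based* sketch; hard-label attacks like BounceAttack follow the same loop but only observe the predicted label, so they instead walk along the decision boundary. `model`, tensor shapes, and parameters are illustrative assumptions.

```python
import torch

@torch.no_grad()
def simba_attack(model, x, label, eps=0.2, max_steps=1000):
    # SimBA-style black-box attack (sketch): try +/- eps along one
    # random pixel direction per step and keep the change whenever the
    # true-class probability drops. Assumes x has shape (1, C, H, W)
    # and model(x) returns logits of shape (1, num_classes).
    x_adv = x.clone()
    best = model(x_adv).softmax(-1)[0, label]
    order = torch.randperm(x_adv.numel())
    for i in range(min(max_steps, x_adv.numel())):
        delta = torch.zeros_like(x_adv).view(-1)
        delta[order[i]] = eps
        delta = delta.view_as(x_adv)
        for step in (delta, -delta):
            cand = (x_adv + step).clamp(0, 1)
            p = model(cand).softmax(-1)[0, label]
            if p < best:                 # query feedback: probability drop
                x_adv, best = cand, p
                break
    return x_adv
```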
Sabre: Cutting through Adversarial Noise with Adaptive Spectral Filtering and Input Reconstruction. IEEE S&P 2024.
Filter-based adversarial perturbation defense[pdf] [code] -
Sabre: Cutting through Adversarial Noise with Adaptive Spectral Filtering and Input Reconstruction. IEEE S&P 2024.
Adversarial attack against face recognition system[pdf] [code] -
Why Does Little Robustness Help? A Further Step Towards Understanding Adversarial Transferability. IEEE S&P 2024.
Exploring the transferability of adversarial examples (see the FGSM sketch below)[pdf] [code] -
Group-based Robustness: A General Framework for Customized Robustness in the Real World. NDSS 2024.
New metrics to measure adversarial examples[pdf] -
DorPatch: Distributed and Occlusion-Robust Adversarial Patch to Evade Certifiable Defenses. NDSS 2024.
Adversarial path against certified robustness[pdf] [code] -
UniID: Spoofing Face Authentication System by Universal Identity. NDSS 2024.
Face spoofing attack[pdf] -
Enhance Stealthiness and Transferability of Adversarial Attacks with Class Activation Mapping Ensemble Attack. NDSS 2024.
Enhancing transferability of adversarial examples[pdf] [code] -
Neural Invisibility Cloak: Concealing Adversary in Images via Compromised AI-driven Image Signal Processing. USENIX Security 2025. [pdf]
-
Self-interpreting Adversarial Images. USENIX Security 2025. [pdf]
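The image-domain attacks and defenses above all revolve around small, loss-maximizing input perturbations. As a generic reference point (not the method of any specific paper here), a minimal white-box FGSM sketch in PyTorch, assuming a classifier `model` that returns logits and image inputs in [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    # Fast Gradient Sign Method (sketch): one signed-gradient step that
    # maximizes the classification loss under an L-infinity budget eps.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()    # keep pixels in the valid range
```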
-
TextShield: Robust Text Classification Based on Multimodal Embedding and Neural Machine Translation. USENIX Security 2020.
Defense in preprocessing[pdf] -
Bad Characters: Imperceptible NLP Attacks. IEEE S&P 2022.
Use unicode to conduct human imperceptible attack[pdf] [code] -
Order-Disorder: Imitation Adversarial Attacks for Black-box Neural Ranking Models. ACM CCS 2022.
Attack Neural Ranking Models[pdf] -
No more Reviewer #2: Subverting Automatic Paper-Reviewer Assignment using Adversarial Learning. USENIX Security 2023.
Adversarial Attack on Paper Assignment[pdf]
-
WaveGuard: Understanding and Mitigating Audio Adversarial Examples. USENIX Security 2021.
Defense in preprocessing[pdf] [code] -
Dompteur: Taming Audio Adversarial Examples. USENIX Security 2021.
Defense in preprocessing. Preprocesses the audio to make adversarial noise noticeable to humans[pdf] [code] -
EarArray: Defending against DolphinAttack via Acoustic Attenuation. NDSS 2021.
Defense[pdf] -
Who is Real Bob? Adversarial Attacks on Speaker Recognition Systems. IEEE S&P 2021.
Attack[pdf] [code] -
Hear "No Evil", See "Kenansville": Efficient and Transferable Black-Box Attacks on Speech Recognition and Voice Identification Systems. IEEE S&P 2021.
Black-box Attack[pdf] -
SoK: The Faults in our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems. IEEE S&P 2021.
Survey[pdf] -
AdvPulse: Universal, Synchronization-free, and Targeted Audio Adversarial Attacks via Subsecond Perturbations. ACM CCS 2020.
Attack[pdf] -
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information. ACM CCS 2021.
Black-box Attack. Physical World[pdf] -
Perception-Aware Attack: Creating Adversarial Music via Reverse-Engineering Human Perception. ACM CCS 2022.
Adversarial Audio with human-aware noise[pdf] -
SpecPatch: Human-in-the-Loop Adversarial Audio Spectrogram Patch Attack on Speech Recognition. ACM CCS 2022.
Adversarial Patch for audio[pdf] -
Learning Normality is Enough: A Software-based Mitigation against Inaudible Voice Attacks. USENIX Security 2023.
Unsupervised learning-based defense[pdf] -
Understanding and Benchmarking the Commonality of Adversarial Examples. IEEE S&P 2024.
Common features of adversarial audio examples[pdf] -
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features. IEEE S&P 2024.
Black-box adversarial audio attack[pdf] [code] -
Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models. NDSS 2024.
Black-box adversarial audio attack using parrot-trained examples[pdf] -
When Translators Refuse to Translate: A Novel Attack to Speech Translation Systems. USENIX Security 2025. [pdf]
-
Universal 3-Dimensional Perturbations for Black-Box Attacks on Video Recognition Systems. IEEE S&P 2022.
Adversarial attack in video recognition[pdf] -
StyleFool: Fooling Video Classification Systems via Style Transfer. IEEE S&P 2023.
Style Transfer to conduct adversarial attack[pdf] [code]
- A Hard Label Black-box Adversarial Attack Against Graph Neural Networks. ACM CCS 2021.
Graph Classification[pdf]
-
Evading Classifiers by Morphing in the Dark. ACM CCS 2017.
Morpher and search to generate adversarial PDF[pdf] -
Misleading Authorship Attribution of Source Code using Adversarial Learning. USENIX Security 2019.
Adversarial attack in source code, MCST[pdf] [code] -
Intriguing Properties of Adversarial ML Attacks in the Problem Space. IEEE S&P 2020.
Attack Malware Classification[pdf] -
Structural Attack against Graph Based Android Malware Detection. IEEE S&P 2020.
Perturbed function call graph[pdf] -
URET: Universal Robustness Evaluation Toolkit (for Evasion). USENIX Security 2023.
General toolbox to select predefined perturbations[pdf] [code] -
Adversarial Training for Raw-Binary Malware Classifiers. USENIX Security 2023.
Adversarial Training for Windows PE malware[pdf] -
PELICAN: Exploiting Backdoors of Naturally Trained Deep Learning Models In Binary Code Analysis. USENIX Security 2023.
Reverse engineering natural backdoor in transformer-based x86 binary code analysis task[pdf] -
Black-box Adversarial Example Attack towards FCG Based Android Malware Detection under Incomplete Feature Information. USENIX Security 2023.
Black-box Android Adversarial Malware against the FCG-based ML classifier[pdf] -
Efficient Query-Based Attack against ML-Based Android Malware Detection under Zero Knowledge Setting. ACM CCS 2023.
Semantically similar perturbations are more likely to have similar evasion effectiveness[pdf] [code] -
Make a Feint to the East While Attacking in the West: Blinding LLM-Based Code Auditors with Flashboom Attacks. IEEE S&P 2025. [pdf]
-
ATTRITION: Attacking Static Hardware Trojan Detection Techniques Using Reinforcement Learning. ACM CCS 2022.
Attack Hardware Trojan Detection[pdf] -
DeepShuffle: A Lightweight Defense Framework against Adversarial Fault Injection Attacks on Deep Neural Networks in Multi-Tenant Cloud-FPGA. IEEE S&P 2024.
Adversarial defense against adversarial fault injection[pdf]
-
Interpretable Deep Learning under Fire. USENIX Security 2020.
Attack both image classification and interpret method[pdf] -
“Is your explanation stable?”: A Robustness Evaluation Framework for Feature Attribution. ACM CCS 2022.
Hypothesis testing to increase the robustness of explanation methods[pdf] -
AIRS: Explanation for Deep Reinforcement Learning based Security Applications. USENIX Security 2023.
DRL interpretation method to pinpoint the most influential step[pdf] [code] -
SoK: Explainable Machine Learning in Adversarial Environments. IEEE S&P 2024.
Adversarial explanation SoK[pdf]
-
SLAP: Improving Physical Adversarial Examples with Short-Lived Adversarial Perturbations. USENIX Security 2021.
Projector light causes misclassification[pdf] [code] -
Understanding Real-world Threats to Deep Learning Models in Android Apps. ACM CCS 2022.
Adversarial Attack in real-world models[pdf] -
X-Adv: Physical Adversarial Object Attacks against X-ray Prohibited Item Detection. USENIX Security 2023.
Adversarial Attack on X-ray Images[pdf] [code] -
That Person Moves Like A Car: Misclassification Attack Detection for Autonomous Systems Using Spatiotemporal Consistency. USENIX Security 2023.
Robust object detection in autonomous systems using spatiotemporal information[pdf] -
You Can't See Me: Physical Removal Attacks on LiDAR-based Autonomous Vehicles Driving Frameworks. USENIX Security 2023.
Adversarial attack against autonomous vehicles using lasers[pdf] [demo] -
CAPatch: Physical Adversarial Patch against Image Captioning Systems. USENIX Security 2023.
Physical Adversarial Patch against the image caption system[pdf] [code] -
Exorcising "Wraith": Protecting LiDAR-based Object Detector in Automated Driving System from Appearing Attacks. USENIX Security 2023.
Defend the appearing attack in autonomous system using local objectness predictor[pdf] [code] -
Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception. NDSS 2024.
Adversarial attacks on autonomous vehicles using infrared laser reflections[pdf] -
Avara: A Uniform Evaluation System for Perceptibility Analysis Against Adversarial Object Evasion Attacks. CCS 2024.
Adversarial Object Evasion attack evaluation system[pdf] [code] -
VisionGuard: Secure and Robust Visual Perception of Autonomous Vehicles in Practice. CCS 2024.
Adversarial Patch detection[pdf] [demo] -
Invisible but Detected: Physical Adversarial Shadow Attack and Defense on LiDAR Object Detection. USENIX Security 2025. [pdf]
-
From Threat to Trust: Exploiting Attention Mechanisms for Attacks and Defenses in Cooperative Perception. USENIX Security 2025. [pdf]
-
Adversarial Policy Training against Deep Reinforcement Learning. USENIX Security 2021.
Adversarial policy triggers abnormal actions in the opponent agent. Two-agent competitive game[pdf] [code] -
SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems. CCS 2024.
Adversarial policy against the reinforcement learning system[pdf] [code] -
CAMP in the Odyssey: Provably Robust Reinforcement Learning with Certified Radius Maximization. USENIX Security 2025. [pdf]
-
Cost-Aware Robust Tree Ensembles for Security Applications. USENIX Security 2021.
Proposes feature-manipulation cost to certify model robustness[pdf] [code] -
CADE: Detecting and Explaining Concept Drift Samples for Security Applications. USENIX Security 2021.
Detect concept drift samples[pdf] [code] -
Learning Security Classifiers with Verified Global Robustness Properties. ACM CCS 2021.
Train a classifier with global robustness[pdf] [code] -
On the Robustness of Domain Constraints. ACM CCS 2021.
Domain constraints. Input space robustness[pdf] -
Cert-RNN: Towards Certifying the Robustness of Recurrent Neural Networks. ACM CCS 2021.
Certify robustness in RNN[pdf] -
TSS: Transformation-Specific Smoothing for Robustness Certification. ACM CCS 2021.
Certify robustness against transformations[pdf] [code] -
Transcend: Detecting Concept Drift in Malware Classification Models. USENIX Security 2017.
Conformal evaluators[pdf] [code] -
Transcending Transcend: Revisiting Malware Classification in the Presence of Concept Drift. IEEE S&P 2022.
New conformal evaluators[pdf][code] -
Transferring Adversarial Robustness Through Robust Representation Matching. USENIX Security 2022.
Robust Transfer Learning[pdf] -
DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing. USENIX Security 2023.
Diffusion model improves certified robustness (see the smoothing sketch below)[pdf] -
Anomaly Detection in the Open World: Normality Shift Detection, Explanation, and Adaptation. NDSS 2023.
Concept drift detection using an unsupervised approach[pdf] [code] -
BARS: Local Robustness Certification for Deep Learning based Traffic Analysis Systems. NDSS 2023.
Certified Robustness for Traffic Analysis Systems[pdf] [code] -
REaaS: Enabling Adversarially Robust Downstream Classifiers via Robust Encoder as a Service. NDSS 2023.
Build a certifiable encoder-as-a-service (EaaS) model[pdf] -
Continuous Learning for Android Malware Detection. USENIX Security 2023.
New continual learning paradigm for malware detection[pdf] [code] -
ObjectSeeker: Certifiably Robust Object Detection against Patch Hiding Attacks via Patch-agnostic Masking. IEEE S&P 2023.
Certified robustness of object detection[pdf] [code] -
On The Empirical Effectiveness of Unrealistic Adversarial Hardening Against Realistic Adversarial Attacks. IEEE S&P 2023.
Adversarial attacks on feature space may enhance the robustness in problem space[pdf] [code] -
Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks. IEEE S&P 2024.
Certified robustness on adversarial text[pdf] [code] -
It's Simplex! Disaggregating Measures to Improve Certified Robustness. IEEE S&P 2024.
Disaggregating measures to improve certified robustness[pdf] [code] -
SoK: Efficiency Robustness of Dynamic Deep Learning Systems. USENIX Security 2025. [pdf]
-
AGNNCert: Defending Graph Neural Networks against Arbitrary Perturbations with Deterministic Certification. USENIX Security 2025. [pdf]
-
Robustifying ML-powered Network Classifiers with PANTS. USENIX Security 2025. [pdf]
-
CertTA: Certified Robustness Made Practical for Learning-Based Traffic Analysis. USENIX Security 2025. [pdf]
-
Sylva: Tailoring Personalized Adversarial Defense in Pre-trained Models via Collaborative Fine-tuning. ACM CCS 2025. [pdf]
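Several certification entries above (TSS, DiffSmooth, Text-CRS, CertTA) build on randomized smoothing. A minimal prediction sketch, assuming a classifier `model` that returns logits; real certification additionally lower-bounds the top-class probability statistically (e.g., via a Clopper-Pearson interval) to derive a certified L2 radius:

```python
import torch

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n=100):
    # Randomized smoothing (sketch): classify n Gaussian-noised copies
    # of x (shape (1, C, H, W)) and return the majority vote. The vote
    # margin is what certification turns into a certified L2 radius.
    noisy = x + sigma * torch.randn(n, *x.shape[1:])
    votes = model(noisy).argmax(dim=-1)
    return votes.mode().values.item()
```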
-
Defeating DNN-Based Traffic Analysis Systems in Real-Time With Blind Adversarial Perturbations. USENIX Security 2021.
Adversarial attack to defeat DNN-based traffic analysis[pdf] [code] -
Pryde: A Modular Generalizable Workflow for Uncovering Evasion Attacks Against Stateful Firewall Deployments. IEEE S&P 2024.
Evasion attack against Firewalls[pdf] -
Multi-Instance Adversarial Attack on GNN-Based Malicious Domain Detection. IEEE S&P 2024.
Adversarial attack on GNN-based malicious domain detection[pdf] [code] -
Swallow: A Transfer-Robust Website Fingerprinting Attack via Consistent Feature Learning. ACM CCS 2025. [pdf]
- Robust Adversarial Attacks Against DNN-Based Wireless Communication Systems. ACM CCS 2021.
Attack[pdf]
- Adversarial Robustness for Tabular Data through Cost and Utility Awareness. NDSS 2023.
Adversarial Attack & Defense on tabular data[pdf]
-
Local Model Poisoning Attacks to Byzantine-Robust Federated Learning. USENIX Security 2020.
Poisoning Attack[pdf] -
Manipulating the Byzantine: Optimizing Model Poisoning Attacks and Defenses for Federated Learning. NDSS 2021.
Poisoning Attack[pdf] -
DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection. NDSS 2022.
Backdoor defense[pdf] -
FLAME: Taming Backdoors in Federated Learning. USENIX Security 2022.
Backdoor defense[pdf] -
EIFFeL: Ensuring Integrity for Federated Learning. ACM CCS 2022.
New FL protocol to guarantee integrity[pdf] -
Eluding Secure Aggregation in Federated Learning via Model Inconsistency. ACM CCS 2022.
Model inconsistency to break the secure aggregation[pdf] -
FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information. IEEE S&P 2023.
Poisoned Model Recovery Algorithm[pdf] -
Every Vote Counts: Ranking-Based Training of Federated Learning to Resist Poisoning Attacks. USENIX Security 2023.
Discretize the model updates and prune the model to defend against the poisoning attack[pdf] [code] -
Securing Federated Sensitive Topic Classification against Poisoning Attacks. NDSS 2023.
Robust Aggregation against the poisoning attack[pdf] -
BayBFed: Bayesian Backdoor Defense for Federated Learning. IEEE S&P 2023.
Purify the model updates using Bayesian inference[pdf] -
ADI: Adversarial Dominating Inputs in Vertical Federated Learning Systems. IEEE S&P 2023.
Poisoning the vertical federated learning system[pdf] [code] -
3DFed: Adaptive and Extensible Framework for Covert Backdoor Attack in Federated Learning. IEEE S&P 2023.
Adapting conventional backdoor attacks to the federated learning scenario[pdf] -
FLShield: A Validation Based Federated Learning Framework to Defend Against Poisoning Attacks. IEEE S&P 2023.
Data poisoning defense[pdf] -
BadVFL: Backdoor Attacks in Vertical Federated Learning. IEEE S&P 2023.
Backdoor attacks against vertical federated learning[pdf] -
CrowdGuard: Federated Backdoor Detection in Federated Learning. NDSS 2024.
Backdoor detection in federated learning leveraging hidden layer outputs[pdf] [code] -
Automatic Adversarial Adaption for Stealthy Poisoning Attacks in Federated Learning. NDSS 2024.
Adaptive poisoning attacks in FL[pdf] -
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning. NDSS 2024.
Mitigate poisoning attack in FL using frequency analysis techniques[pdf] -
Dealing Doubt: Unveiling Threat Models in Gradient Inversion Attacks under Federated Learning – A Survey and Taxonomy. CCS 2024.
Survey and taxonomy of gradient inversion attacks in federated learning[pdf] -
Byzantine-Robust Decentralized Federated Learning. CCS 2024.
Byzantine-robust federated learning (see the aggregation sketch below)[pdf] -
PoiSAFL: Scalable Poisoning Attack Framework to Byzantine-resilient Semi-asynchronous Federated Learning. USENIX Security 2025. [pdf]
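Many of the poisoning defenses above replace plain federated averaging with a Byzantine-robust aggregator. A minimal sketch of two standard baselines (coordinate-wise median and trimmed mean), not the specific protocol of any listed paper; `updates` is an assumed list of equally-shaped client update tensors:

```python
import torch

def median_aggregate(updates):
    # Coordinate-wise median of client updates: a classic baseline that
    # tolerates a minority of arbitrarily poisoned updates.
    return torch.stack(updates).median(dim=0).values

def trimmed_mean_aggregate(updates, trim=1):
    # Trimmed mean: drop the `trim` largest and smallest values per
    # coordinate before averaging (assumes len(updates) > 2 * trim).
    stacked, _ = torch.stack(updates).sort(dim=0)
    return stacked[trim:len(updates) - trim].mean(dim=0)
```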
-
DeBackdoor: A Deductive Framework for Detecting Backdoor Attacks on Deep Models with Limited Data. USENIX Security 2025. [pdf]
- Justinian's GAAvernor: Robust Distributed Learning with Gradient Aggregation Agent. USENIX Security 2020.
Defense in Gradient Aggregation. Reinforcement learning[pdf]
- Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning. IEEE S&P 2020.
Hijack Word Embedding[pdf]
-
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion. USENIX Security 2021.
Hijack Code Autocomplete[pdf] -
TROJANPUZZLE: Covertly Poisoning Code-Suggestion Models. IEEE S&P 2024.
Hijack Code Autocomplete[pdf] [code]
- Poisoning the Unlabeled Dataset of Semi-Supervised Learning. USENIX Security 2021.
Poisoning semi-supervised learning[pdf]
-
Data Poisoning Attacks to Deep Learning Based Recommender Systems. NDSS 2021.
The attacker-chosen items are recommended as often as possible[pdf] -
Reverse Attack: Black-box Attacks on Collaborative Recommendation. ACM CCS 2021.
Black-box setting. Surrogate model. Collaborative Filtering. Demoting and Promoting[pdf]
-
Subpopulation Data Poisoning Attacks. ACM CCS 2021.
Poisoning to flip a group of data samples[pdf] -
Get a Model! Model Hijacking Attack Against Machine Learning Models. NDSS 2022.
Fusing datasets to hijack the model[pdf] [code]
-
PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning. USENIX Security 2022.
Poisoning attack in contrastive learning[pdf] -
Preference Poisoning Attacks on Reward Model Learning. IEEE S&P 2025.
Poison attack in reward model learning[pdf]
- Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets. ACM CCS 2022.
Poison attack to reveal sensitive information[pdf]
- Test-Time Poisoning Attacks Against Test-Time Adaptation Models. IEEE S&P 2024.
Poisoning attack at test time[pdf] [code]
-
Poison Forensics: Traceback of Data Poisoning Attacks in Neural Networks. USENIX Security 2022.
Identify the poisoned subset by clustering and pruning the benign set[pdf] -
Understanding Implosion in Text-to-Image Generative Models. CCS 2024.
Analytic framework for the poisoning attack against T2I model[pdf]
-
Demon in the Variant: Statistical Analysis of DNNs for Robust Backdoor Contamination Detection. USENIX Security 2021.
Class-specific Backdoor. Defense by decomposition[pdf] -
Double-Cross Attacks: Subverting Active Learning Systems. USENIX Security 2021.
Active Learning System. Backdoor Attack[pdf] -
Detecting AI Trojans Using Meta Neural Analysis. IEEE S&P 2021.
Meta Neural Classifier[pdf] [code] -
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning. IEEE S&P 2022.
Backdoor attack in image-text pretrained model[pdf] [code] -
Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features. ACM CCS 2020.
Composite backdoor. Image & text tasks[pdf] [code] -
AI-Lancet: Locating Error-inducing Neurons to Optimize Neural Networks. ACM CCS 2021.
Locate error-inducing neurons and fine-tune them[pdf] -
LoneNeuron: a Highly-Effective Feature-Domain Neural Trojan Using Invisible and Polymorphic Watermarks. ACM CCS 2022.
Backdoor attack by modifying neurons[pdf] -
ATTEQ-NN: Attention-based QoE-aware Evasive Backdoor Attacks. NDSS 2022.
Backdoor attack by attention techniques[pdf] -
RAB: Provable Robustness Against Backdoor Attacks. IEEE S&P 2023.
Backdoor certification[pdf] -
A Data-free Backdoor Injection Approach in Neural Networks. USENIX Security 2023.
Data free backdoor injection[pdf] [code] -
Backdoor Attacks Against Dataset Distillation. NDSS 2023.
Backdoor attack against dataset distillation[pdf] [code] -
BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense. NDSS 2023.
Backdoor Forensics[pdf] [code] -
Disguising Attacks with Explanation-Aware Backdoors. IEEE S&P 2023.
Backdoor to mislead the explanation method[pdf] -
Selective Amnesia: On Efficient, High-Fidelity and Blind Suppression of Backdoor Effects in Trojaned Machine Learning Models. IEEE S&P 2023.
Finetuning to remove backdoor[pdf] -
AI-Guardian: Defeating Adversarial Attacks using Backdoors. IEEE S&P 2023.
Using backdoors with an all-to-all mapping, and reversing the mapping, to detect adversarial examples[pdf] -
REDEEM MYSELF: Purifying Backdoors in Deep Learning Models using Self Attention Distillation. IEEE S&P 2023.
Purifying backdoor using model distillation[pdf] -
NARCISSUS: A Practical Clean-Label Backdoor Attack with Limited Information. ACM CCS 2023.
Clean-label backdoor attack (contrast with the dirty-label sketch below)[pdf] [code] -
ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms. USENIX Security 2023.
Backdoor Defense works in Different Learning Paradigms[pdf] [code] -
ODSCAN: Backdoor Scanning for Object Detection Models. IEEE S&P 2024.
Backdoor defense by model dynamics[pdf] [github] -
MM-BD: Post-Training Detection of Backdoor Attacks with Arbitrary Backdoor Pattern Types Using a Maximum Margin Statistic. IEEE S&P 2024.
Backdoor defense using maximum margin statistic in classification layer[pdf] [github] -
Distribution Preserving Backdoor Attack in Self-supervised Learning. IEEE S&P 2024.
Backdoor attack in contrastive learning by improving the distribution[pdf] [github] -
Backdooring Bias (B^2) into Stable Diffusion Models. USENIX Security 2025.
Backdoor attack in stable diffusion model[pdf] -
Watch the Watchers! On the Security Risks of Robustness-Enhancing Diffusion Models. USENIX Security 2025. [pdf]
-
Pretender: Universal Active Defense against Diffusion Finetuning Attacks. USENIX Security 2025. [pdf]
-
Rowhammer-Based Trojan Injection: One Bit Flip Is Sufficient for Backdooring DNNs. USENIX Security 2025. [pdf]
-
From Purity to Peril: Backdooring Merged Models From "Harmless" Benign Components. USENIX Security 2025. [pdf]
-
Revisiting Training-Inference Trigger Intensity in Backdoor Attacks. USENIX Security 2025. [pdf]
-
Persistent Backdoor Attacks in Continual Learning. USENIX Security 2025. [pdf]
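Most data-poisoning backdoors above follow the BadNets recipe: stamp a fixed trigger onto a few training inputs and relabel them to the target class, so that any test input carrying the trigger is misclassified. A minimal dirty-label sketch, assuming image tensors in [0, 1]; clean-label attacks such as NARCISSUS above keep the original label instead:

```python
import torch

def poison_sample(x, target_class, patch=1.0, size=3):
    # BadNets-style dirty-label poisoning (sketch): stamp a small solid
    # patch in the bottom-right corner and relabel to the target class.
    x = x.clone()
    x[..., -size:, -size:] = patch
    return x, target_class
```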
-
T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification. USENIX Security 2021.
Backdoor Defense. GAN to recover trigger[pdf] [code] -
Hidden Backdoors in Human-Centric Language Models. ACM CCS 2021.
Novel trigger[pdf] [code] -
Backdoor Pre-trained Models Can Transfer to All. ACM CCS 2021.
Backdoor in pre-trained models to poison downstream tasks[pdf] [code] -
Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation. USENIX Security 2022.
Backdoor via linguistic style manipulation[pdf] -
TextGuard: Provable Defense against Backdoor Attacks on Text Classification. NDSS 2024.
Provable backdoor defense by splitting the sentence and ensemble learning[pdf] [code]
-
Graph Backdoor. USENIX Security 2021.
Classification[pdf] [code] -
Distributed Backdoor Attacks on Federated Graph Learning and Certified Defenses. CCS 2024.
Distributed Backdoor attacks on federated graph learning[pdf] [code]
- Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers. USENIX Security 2021.
Explanation Method. Evade Classification[pdf] [code]
-
TrojanModel: A Practical Trojan Attack against Automatic Speech Recognition Systems. IEEE S&P 2023.
Backdoor attack in speech recognition systems[pdf] -
MagBackdoor: Beware of Your Loudspeaker as Backdoor of Magnetic Attack for Malicious Command Injection. IEEE S&P 2023.
Backdoor attack in audio using a magnetic trigger[pdf]
- Sneaky Spikes: Uncovering Stealthy Backdoor Attacks in Spiking Neural Networks with Neuromorphic Data. NDSS 2024.
Backdoor attack in neuromorphic data[pdf] [code]
-
Blind Backdoors in Deep Learning Models. USENIX Security 2021.
Loss Manipulation. Backdoor[pdf] [code] -
IvySyn: Automated Vulnerability Discovery in Deep Learning Frameworks. USENIX Security 2023.
Automatic Bug Discovery in ML libraries[pdf]
-
Towards Understanding and Detecting Cyberbullying in Real-world Images. NDSS 2021.
Detect cyberbullying in images[pdf] -
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content. IEEE S&P 2024.
Using LLM for toxic content detection[pdf] [code] -
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns. USENIX Security 2025. [pdf] [code]
-
FARE: Enabling Fine-grained Attack Categorization under Low-quality Labeled Data. NDSS 2021.
Clustering method to complete dataset labels[pdf] [code] -
From Grim Reality to Practical Solution: Malware Classification in Real-World Noise. IEEE S&P 2023.
Label-noise learning method for malware classification[pdf] [code] -
Decoding the Secrets of Machine Learning in Windows Malware Classification: A Deep Dive into Datasets, Features, and Model Performance. ACM CCS 2023.
Static features outperform dynamic features in Windows PE malware detection[pdf] -
KAIROS: Practical Intrusion Detection and Investigation using Whole-system Provenance. IEEE S&P 2024.
GNN-based intrusion detection method[pdf] [code] -
FLASH: A Comprehensive Approach to Intrusion Detection via Provenance Graph Representation Learning. IEEE S&P 2024.
GNN-based intrusion detection method[pdf] [code] -
Understanding and Bridging the Gap Between Unsupervised Network Representation Learning and Security Analytics. IEEE S&P 2024.
Unsupervised graph learning for graph-based security applications[pdf] [code] -
FP-Fed: Privacy-Preserving Federated Detection of Browser Fingerprinting. NDSS 2024.
Federated learning for browser fingerprinting[pdf] -
GNNIC: Finding Long-Lost Sibling Functions with Abstract Similarity. NDSS 2024.
GNN for static analysis[pdf] -
Experimental Analyses of the Physical Surveillance Risks in Client-Side Content Scanning. NDSS 2024.
Attack client scanning systems[pdf] -
Attributions for ML-based ICS Anomaly Detection: From Theory to Practice. NDSS 2024.
Evaluating attribution methods for industrial control systems[pdf] [code] -
DRAINCLoG: Detecting Rogue Accounts with Illegally-obtained NFTs using Classifiers Learned on Graphs. NDSS 2024.
Detecting rogue accounts in NFTs using GNN[pdf] -
Low-Quality Training Data Only? A Robust Framework for Detecting Encrypted Malicious Network Traffic. NDSS 2024.
Training ML-based traffic detection using low-quality data[pdf] [code] -
SafeEar: Content Privacy-Preserving Audio Deepfake Detection. ACM CCS 2024.
Speech content privacy-preserving deepfake detection[pdf] [website] [code] [dataset] -
USD: NSFW Content Detection for Text-to-Image Models via Scene Graph. USENIX Security 2025.
NSFW image detection[pdf] -
On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts. USENIX Security 2025.
Defense against NSFW image generation[pdf] -
VoiceWukong: Benchmarking Deepfake Voice Detection. USENIX Security 2025. [pdf]
-
SafeSpeech: Robust and Universal Voice Protection Against Malicious Speech Synthesis. USENIX Security 2025. [pdf]
-
Slot: Provenance-Driven APT Detection through Graph Reinforcement Learning. ACM CCS 2025. [pdf]
-
Combating Concept Drift with Explanatory Detection and Adaptation for Android Malware Classification. ACM CCS 2025. [pdf]
-
MM4flow: A Pre-trained Multi-modal Model for Versatile Network Traffic Analysis. ACM CCS 2025. [pdf]
-
Analyzing PDFs like Binaries: Adversarially Robust PDF Malware Analysis via Intermediate Representation and Language Model. ACM CCS 2025. [pdf]
- WtaGraph: Web Tracking and Advertising Detection using Graph Neural Networks. IEEE S&P 2022.
GNN[pdf]
-
Text Captcha Is Dead? A Large Scale Deployment and Empirical Study. ACM CCS 2020.
Adversarial CAPTCHA[pdf] -
Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems. NDSS 2023.
Adversarial Audio CAPTCHA[pdf] [demo] -
A Generic, Efficient, and Effortless Solver with Self-Supervised Learning for Breaking Text Captchas. IEEE S&P 2023.
Text CAPTCHA Solver[pdf]
-
PalmTree: Learning an Assembly Language Model for Instruction Embedding. ACM CCS 2021.
Pre-trained model to generate code embedding[pdf] [code] -
CALLEE: Recovering Call Graphs for Binaries with Transfer and Contrastive Learning. IEEE S&P 2023.
Recovering call graph from binaries using transfer and contrastive learning[pdf] [code] -
Examining Zero-Shot Vulnerability Repair with Large Language Models. IEEE S&P 2023.
Zero-shot vulnerability repair using large language models[pdf] -
Raconteur: A Knowledgeable, Insightful, and Portable LLM-Powered Shell Command Explainer. NDSS 2025.
LLM-powered malicious code analysis[pdf] [website]
- Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots. ACM CCS 2022.
Measuring chatbot toxic behavior[pdf]
-
Towards a General Video-based Keystroke Inference Attack. USENIX Security 2023.
Self-supervised learning to recover keyboard input[pdf] -
Deep perceptual hashing algorithms with hidden dual purpose: when client-side scanning does facial recognition. IEEE S&P 2023.
Manipulate a deep perceptual hashing algorithm to perform facial recognition of a specific person[pdf] [code]
-
Dos and Don'ts of Machine Learning in Computer Security. USENIX Security 2022.
Survey pitfalls in ML4Security[pdf] -
“Security is not my field, I’m a stats guy”: A Qualitative Root Cause Analysis of Barriers to Adversarial Machine Learning Defenses in Industry. USENIX Security 2023.
Survey AML Application in Industry[pdf] -
Everybody’s Got ML, Tell Me What Else You Have: Practitioners’ Perception of ML-Based Security Tools and Explanations. IEEE S&P 2023.
Explainable AI in practice[pdf]
- CERBERUS: Exploring Federated Prediction of Security Events. ACM CCS 2022.
Federated learning to predict security events[pdf]
- VulChecker: Graph-based Vulnerability Localization in Source Code. USENIX Security 2023.
Detecting Bugs using GCN[pdf] [code]
- On the Security Risks of AutoML. USENIX Security 2022.
Adversarial evasion. Model poisoning. Backdoor. Functionality stealing. Membership Inference[pdf]
-
DeepDyve: Dynamic Verification for Deep Neural Networks. ACM CCS 2020. [pdf]
-
NeuroPots: Realtime Proactive Defense against Bit-Flip Attacks in Neural Networks. USENIX Security 2023.
Honeypot to trap bit-flip attacks[pdf] -
Aegis: Mitigating Targeted Bit-flip Attacks against Deep Neural Networks. USENIX Security 2023.
Train multiple classifiers to defend against bit-flip attacks[pdf] [code]
- DeepAID: Interpreting and Improving Deep Learning-based Anomaly Detection in Security Applications. ACM CCS 2021.
Anomaly detection[pdf] [code]
- Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing. ACM CCS 2023.
Trend-based faithfulness testing[pdf] [code]
- FINER: Enhancing State-of-the-art Classifiers with Feature Attribution to Facilitate Security Analysis. ACM CCS 2023.
Ensemble explanation for different stakeholders[pdf] [code]
- Who Are You (I Really Wanna Know)? Detecting Audio DeepFakes Through Vocal Tract Reconstruction. USENIX Security 2022.
Deepfake detection using vocal tract reconstruction[pdf]
-
ImU: Physical Impersonating Attack for Face Recognition System with Natural Style Changes. IEEE S&P 2023.
StyleGAN to impersonate a person[pdf] [code] -
DepthFake: Spoofing 3D Face Authentication with a 2D Photo. IEEE S&P 2023.
Spoofing 3D face authentication with a 2D photo[pdf] [demo]
- Understanding the (In)Security of Cross-side Face Verification Systems in Mobile Apps: A System Perspective. IEEE S&P 2023.
Measurement study of the security risks of cross-side face verification systems.[pdf]
-
Deepfake Text Detection: Limitations and Opportunities. IEEE S&P 2023.
Detecting machine-generated text[pdf] [code] -
MGTBench: Benchmarking Machine-Generated Text Detection. CCS 2024.
Benchmarking machine generated text detection[pdf] [code] -
SafeGuider: Robust and Practical Content Safety Control for Text-to-Image Models. CCS 2025.
Content safety control for text-to-image models[pdf]
-
SoK: The Good, The Bad, and The Unbalanced: Measuring Structural Limitations of Deepfake Media Datasets. USENIX Security 2024.
Issues in deepfake media dataset[pdf] [website] -
SafeEar: Content Privacy-Preserving Audio Deepfake Detection. ACM CCS 2024.
Speech content privacy-preserving deepfake detection[pdf] [website] [code] [dataset] -
"Better Be Computer or I’m Dumb": A Large-Scale Evaluation of Humans as Audio Deepfake Detectors. ACM CCS 2024.
Humans as audio deepfake detectors[pdf]
-
Large Language Models for Code: Security Hardening and Adversarial Testing. ACM CCS 2023.
Prefix tuning for secure code generation[pdf] [code] -
DeGPT: Optimizing Decompiler Output with LLM. NDSS 2024.
LLM-enhanced reverse engineering[pdf] [code] -
Raconteur: A Knowledgeable, Insightful, and Portable LLM-Powered Shell Command Explainer. NDSS 2025.
LLM-powered malicious code analysis[pdf] [website] -
PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs). CCS 2024.
Black-box LLM secure code generation[pdf] [code] -
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs. USENIX Security 2025. [pdf]
-
Transferable Multimodal Attack on Vision-Language Pre-training Models. IEEE S&P 2024.
Transferable adversarial attack on VLM[pdf] -
SneakyPrompt: Jailbreaking Text-to-image Generative Models. IEEE S&P 2024.
Jailbreaking text-to-image generative model using reinforcement-learning adversarial NLP methods[pdf] [code] -
SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models. ACM CCS 2024.
defending against unsafe content generation in text-to-image models[pdf] [code] [model] -
SurrogatePrompt: Bypassing the Safety Filter of Text-to-Image Models via Substitution. ACM CCS 2024.
Bypassing the safety filter of T2I model[pdf] -
Moderator: Moderating Text-to-Image Diffusion Models through Fine-grained Context-based Policies. ACM CCS 2024.
Content moderating for T2I model[pdf] [code] -
Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities. USENIX Security 2025. [pdf]
-
Are CAPTCHAs Still Bot-hard? Generalized Visual CAPTCHA Solving with Agentic Vision Language Model. USENIX Security 2025. [pdf]
-
From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models. USENIX Security 2025. [pdf]
-
MASTERKEY: Automated Jailbreaking of Large Language Model Chatbots. NDSS 2024.
LLM jailbreaking[pdf] -
Mind the Inconspicuous: Revealing the Hidden Weakness in Aligned LLMs' Refusal Boundaries. USENIX Security 2025.
LLM jailbreaking[pdf] -
Refusal Is Not an Option: Unlearning Safety Alignment of Large Language Models. USENIX Security 2025.
LLM jailbreaking[pdf] [code] -
Activation Approximations Can Incur Safety Vulnerabilities in Aligned LLMs: Comprehensive Analysis and Defense. USENIX Security 2025.
LLM jailbreaking defense[pdf] -
Exposing the Guardrails: Reverse-Engineering and Jailbreaking Safety Filters in DALL·E Text-to-Image Pipelines. USENIX Security 2025.
LLM jailbreaking[pdf] -
TwinBreak: Jailbreaking LLM Security Alignments based on Twin Prompts. USENIX Security 2025.
LLM jailbreaking[pdf] -
Exploiting Task-Level Vulnerabilities: An Automatic Jailbreak Attack and Defense Benchmarking for LLMs. USENIX Security 2025.
LLM jailbreaking[pdf] -
PAPILLON: Efficient and Stealthy Fuzz Testing-Powered Jailbreaks for LLMs. USENIX Security 2025.
LLM jailbreaking[pdf] -
Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack. USENIX Security 2025.
LLM jailbreaking[pdf] -
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner. USENIX Security 2025.
LLM jailbreaking defense[pdf] -
JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation. USENIX Security 2025. [pdf]
- Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention. NDSS 2024.
Improving the robustness of LLM by dynamic attention[pdf]
-
DEMASQ: Unmasking the ChatGPT Wordsmith. NDSS 2024.
Generated text detection[pdf] -
Organic or Diffused: Can We Distinguish Human Art from AI-generated Images?. CCS 2024.
Distinguishing human art from AI-generated images[pdf] -
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing. CCS 2024.
LLM-generated content detection[pdf] -
GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors. USENIX Security 2025.
Evading LLM-generated content detection[pdf] [code] -
Data-Free Model-Related Attacks: Unleashing the Potential of Generative AI. USENIX Security 2025. [pdf] [code]
-
"I Cannot Write This Because It Violates Our Content Policy": Understanding Content Moderation Policies and User Experiences in Generative AI Products. USENIX Security 2025. [pdf]
-
Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data. USENIX Security 2025. [pdf]
-
LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors. NDSS 2024.
Task-agnostic backdoor detection[pdf] [code] -
EmbedX: Embedding-Based Cross-Trigger Backdoor Attack Against Large Language Models. USENIX Security 2025.
Backdoor attack in LLM[pdf]
-
When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs. USENIX Security 2025. [pdf]
-
Make Agent Defeat Agent: Automatic Detection of Taint-Style Vulnerabilities in LLM-based Agents. USENIX Security 2025. [pdf]
-
TracLLM: A Generic Framework for Attributing Long Context LLMs. USENIX Security 2025. [pdf] [code]
-
Unsafe LLM-Based Search: Quantitative Analysis and Mitigation of Safety Risks in AI Web Search. USENIX Security 2025. [pdf]
-
Cloak, Honey, Trap: Proactive Defenses Against LLM Agents. USENIX Security 2025. [pdf]
-
Big Help or Big Brother? Auditing Tracking, Profiling, and Personalization in Generative AI Assistants. USENIX Security 2025. [pdf]
-
AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents. ACM CCS 2025. [pdf]
-
StruQ: Defending Against Prompt Injection with Structured Queries. USENIX Security 2025.
Prompt Injection Defense[pdf] -
Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents. USENIX Security 2025. [pdf]
-
SecAlign: Defending Against Prompt Injection with Preference Optimization. ACM CCS 2025. [pdf]
- Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink. USENIX Security 2025. [pdf]
-
Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning. USENIX Security 2020.
Online Learning. Model updates[pdf] -
Extracting Training Data from Large Language Models. USENIX Security 2021.
Training data extraction attack on GPT-2[pdf] -
Analyzing Information Leakage of Updates to Natural Language Models. ACM CCS 2020.
Data leakage from model updates[pdf] -
TableGAN-MCA: Evaluating Membership Collisions of GAN-Synthesized Tabular Data Releasing. ACM CCS 2021.
Membership collision in GAN[pdf] -
DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation. ACM CCS 2021.
DP to train a privacy-preserving GAN[pdf] -
Property Inference Attacks Against GANs. NDSS 2022.
Property Inference Attacks Against GAN[pdf] [code] -
MIRROR: Model Inversion for Deep Learning Network with High Fidelity. NDSS 2022.
Model inversion attack using GAN[pdf] [code] -
Analyzing Leakage of Personally Identifiable Information in Language Models. IEEE S&P 2023.
Personally identifiable information leakage in language model[pdf] [code] -
Timing Channels in Adaptive Neural Networks. NDSS 2024.
Infer input of adaptive NN using timing information[pdf] [code] -
Crafter: Facial Feature Crafting against Inversion-based Identity Theft on Deep Models. NDSS 2024.
Protection against model inversion attacks[pdf] [code] -
Transpose Attack: Stealing Datasets with Bidirectional Training. NDSS 2024.
Stealing dataset in bidirectional models[pdf] [code] -
SafeEar: Content Privacy-Preserving Audio Deepfake Detection. ACM CCS 2024.
Speech content privacy-preserving deepfake detection[pdf] [website] [code] [dataset] -
Dye4AI: Assuring Data Boundary on Generative AI Services. ACM CCS 2024.
Dye testing system in LLM[pdf] -
Evaluations of Machine Learning Privacy Defenses are Misleading. ACM CCS 2024.
Evaluating ML privacy defenses[pdf] [code] -
Towards a Re-evaluation of Data Forging Attacks in Practice. USENIX Security 2025. [pdf]
-
SoK: Data Reconstruction Attacks Against Machine Learning Models: Definition, Metrics, and Benchmark. USENIX Security 2025. [pdf]
-
Anonymity Unveiled: A Practical Framework for Auditing Data Use in Deep Learning Models. ACM CCS 2025. [pdf]
-
Poisoning Attacks to Local Differential Privacy for Ranking Estimation. ACM CCS 2025. [pdf]
-
Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference. USENIX Security 2020.
White-box Setting[pdf] -
Systematic Evaluation of Privacy Risks of Machine Learning Models. USENIX Security 2020.
Metric-based membership inference attack method. Defines a privacy risk score (see the loss-threshold sketch below)[pdf] [code] -
Practical Blind Membership Inference Attack via Differential Comparisons. NDSS 2021.
Use non-member data to replace shadow model[pdf] [code] -
GAN-Leaks: A Taxonomy of Membership Inference Attacks against Generative Models. ACM CCS 2020.
Membership inference attack in Generative model. Member has small reconstruction error[pdf] -
Quantifying and Mitigating Privacy Risks of Contrastive Learning. ACM CCS 2021.
Membership inference attack. Property inference attack. Contrastive learning in classification task[pdf] [code] -
Membership Inference Attacks Against Recommender Systems. ACM CCS 2021.
Recommender System[pdf] [code] -
EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning. ACM CCS 2021.
Contrastive learning in pre-trained model. Data augmentation has higher similarity[pdf] [code] -
Auditing Membership Leakages of Multi-Exit Networks. ACM CCS 2022.
Membership inference attack in multi-exit networks[pdf] -
Membership Inference Attacks by Exploiting Loss Trajectory. ACM CCS 2022.
Membership inference attack, knowledge distillation[pdf] -
On the Privacy Risks of Cell-Based NAS Architectures. ACM CCS 2022.
Membership inference attack in NAS[pdf] -
Membership Inference Attacks and Defenses in Neural Network Pruning. USENIX Security 2022.
Membership inference attack in Neural Network Pruning[pdf] -
Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture. USENIX Security 2022.
Membership inference defense by ensemble[pdf] -
Enhanced Membership Inference Attacks against Machine Learning Models. USENIX Security 2022.
Membership inference attack with hypothesis testing[pdf] [code] -
Membership Inference Attacks and Generalization: A Causal Perspective. ACM CCS 2022.
Membership inference attack with causal reasoning[pdf] -
SLMIA-SR: Speaker-Level Membership Inference Attacks against Speaker Recognition Systems. NDSS 2024.
Membership inference attack in speaker recognition[pdf] [code] -
Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction. NDSS 2024.
The defense of membership inference attack[pdf] [code] -
Membership Inference Attacks Against Vision-Language Models. USENIX Security 2025.
Membership inference attack in vision language model[pdf] -
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models. USENIX Security 2025.
Membership inference attack in LLM[pdf] -
Enhanced Label-Only Membership Inference Attacks with Fewer Queries. USENIX Security 2025. [pdf]
-
SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks. USENIX Security 2025. [pdf]
-
Membership Inference Attacks as Privacy Tools: Reliability, Disparity and Ensemble. ACM CCS 2025. [pdf]
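The metric-based attacks above reduce membership inference to thresholding a per-sample statistic. A minimal loss-threshold sketch; in practice `tau` is calibrated on shadow models or held-out data, and stronger attacks (e.g., the hypothesis-testing variants above) calibrate per example:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def loss_threshold_mia(model, x, y, tau):
    # Metric-based membership inference (sketch): training members tend
    # to have lower loss, so predict "member" when the per-sample loss
    # falls below a calibrated threshold tau.
    losses = F.cross_entropy(model(x), y, reduction="none")
    return losses < tau          # True = predicted training-set member
```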
-
Label Inference Attacks Against Vertical Federated Learning. USENIX Security 2022.
Label Leakage. Federated Learning[pdf] [code] -
The Value of Collaboration in Convex Machine Learning with Differential Privacy. IEEE S&P 2020.
DP as Defense[pdf] -
Leakage of Dataset Properties in Multi-Party Machine Learning. USENIX Security 2021.
Dataset Properties Leakage[pdf] -
Unleashing the Tiger: Inference Attacks on Split Learning. ACM CCS 2021.
Split learning. Feature-space hijacking attack[pdf] [code] -
Local and Central Differential Privacy for Robustness and Privacy in Federated Learning. NDSS 2022.
DP in federated learning (see the DP-SGD sketch below)[pdf] -
Gradient Obfuscation Gives a False Sense of Security in Federated Learning. USENIX Security 2023.
Data Recovery in federated learning[pdf] -
PPA: Preference Profiling Attack Against Federated Learning. NDSS 2023.
Preference Leakage in federated learning[pdf] [code] -
On the (In)security of Peer-to-Peer Decentralized Machine Learning. IEEE S&P 2023.
Information leakage in peer-to-peer decentralized machine learning system[pdf] -
RoFL: Robustness of Secure Federated Learning. IEEE S&P 2023.
Robust federated learning framework using secure aggregation[pdf] [code] -
Scalable and Privacy-Preserving Federated Principal Component Analysis. IEEE S&P 2023.
Privacy-preserving federated PCA algorithm[pdf] -
Protecting Label Distribution in Cross-Silo Federated Learning. IEEE S&P 2024.
Privacy-preserving SGD to protect label distribution[pdf] -
LOKI: Large-scale Data Reconstruction Attack against Federated Learning through Model Manipulation. IEEE S&P 2024.
Dataset reconstruction attack in federated learning by sending customized convolutional kernels[pdf] -
Analyzing Inference Privacy Risks Through Gradients In Machine Learning. CCS 2024.
Information leakage through gradients[pdf] -
Boosting Gradient Leakage Attacks: Data Reconstruction in Realistic FL Settings. USENIX Security 2025. [pdf]
-
Refiner: Data Refining against Gradient Leakage Attacks in Federated Learning. USENIX Security 2025. [pdf]
-
Aion: Robust and Efficient Multi-Round Single-Mask Secure Aggregation Against Malicious Participants. USENIX Security 2025. [pdf]
-
SoK: On Gradient Leakage in Federated Learning. USENIX Security 2025. [pdf]
-
DP-BREM: Differentially-Private and Byzantine-Robust Federated Learning with Client Momentum. USENIX Security 2025. [pdf]
-
SLOTHE: Lazy Approximation of Non-Arithmetic Neural Network Functions over Encrypted Data. USENIX Security 2025. [pdf]
-
Sharpness-Aware Initialization: Improving Differentially Private Machine Learning from First Principles. USENIX Security 2025. [pdf]
-
Task-Oriented Training Data Privacy Protection for Cloud-based Model Training. USENIX Security 2025. [pdf]
-
From Risk to Resilience: Towards Assessing and Mitigating the Risk of Data Reconstruction Attacks in Federated Learning. USENIX Security 2025. [pdf]
-
SoK: Gradient Inversion Attacks in Federated Learning. USENIX Security 2025. [pdf]
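On the defense side, several entries above build on differentially private training. A minimal DP-SGD step sketch in the style of Abadi et al. (per-example clipping plus Gaussian noise); `per_sample_grads` is an assumed list holding one list of gradient tensors per example:

```python
import torch

def dp_sgd_step(params, per_sample_grads, clip=1.0, sigma=1.0, lr=0.1):
    # DP-SGD (sketch): clip each example's gradient to bound its
    # influence, sum, add Gaussian noise scaled to the clip norm, and
    # take an averaged step.
    clipped = []
    for grads in per_sample_grads:                 # grads: list of tensors
        norm = torch.cat([g.flatten() for g in grads]).norm()
        scale = min(1.0, clip / (norm.item() + 1e-12))
        clipped.append([g * scale for g in grads])
    for i, p in enumerate(params):
        total = sum(c[i] for c in clipped)
        total += sigma * clip * torch.randn_like(total)
        p.data -= lr * total / len(per_sample_grads)
```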
-
Privacy Risks of General-Purpose Language Models. IEEE S&P 2020.
Pretrained Language Model[pdf] -
Information Leakage in Embedding Models. ACM CCS 2020.
Exact Word Recovery. Attribute inference. Membership inference[pdf] -
Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs. ACM CCS 2021.
Infer privacy information in classification output[pdf] [code]
-
Stealing Links from Graph Neural Networks. USENIX Security 2021.
Infer graph links (see the link-stealing sketch below)[pdf] -
Inference Attacks Against Graph Neural Networks. USENIX Security 2022.
Property inference: number of nodes. Subgraph inference. Graph reconstruction[pdf] [code] -
LinkTeller: Recovering Private Edges from Graph Neural Networks via Influence Analysis. IEEE S&P 2022.
Use node connection influence to infer graph edges[pdf] -
Locally Private Graph Neural Networks. IEEE S&P 2022.
LDP as defense for node privacy[pdf] [code] -
Finding MNEMON: Reviving Memories of Node Embeddings. ACM CCS 2022.
Graph recovery attack through node embedding[pdf] -
Group Property Inference Attacks Against Graph Neural Networks. ACM CCS 2022.
Group Property inference attack on GNN[pdf] -
LPGNet: Link Private Graph Networks for Node Classification. ACM CCS 2022.
DP to build private GNN[pdf] -
GraphGuard: Detecting and Counteracting Training Data Misuse in Graph Neural Networks. NDSS 2024.
Mitigate data misuse issues in GNN[pdf] [code] -
GRID: Protecting Training Graph from Link Stealing Attacks on GNN Models. IEEE S&P 2025.
Link stealing defense[pdf] [code]
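The link-stealing attacks above exploit that a trained GNN tends to give connected nodes similar posteriors. A minimal scoring sketch, assuming `posteriors` is the node-by-class prediction matrix and `candidate_pairs` an (M, 2) tensor of node-index pairs; LDP defenses such as Locally Private GNNs above blunt exactly this signal:

```python
import torch

def link_stealing_scores(posteriors, candidate_pairs):
    # Link stealing (sketch): connected nodes tend to receive similar
    # GNN posteriors, so rank candidate edges by the cosine similarity
    # of the two endpoints' prediction vectors.
    u, v = candidate_pairs[:, 0], candidate_pairs[:, 1]
    return torch.cosine_similarity(posteriors[u], posteriors[v], dim=-1)
```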
-
Machine Unlearning. IEEE S&P 2020.
Shard and isolate the training dataset (SISA; see the sketch below)[pdf] [code] -
When Machine Unlearning Jeopardizes Privacy. ACM CCS 2021.
Membership inference attack in unlearning setting[pdf] [code] -
Graph Unlearning. ACM CCS 2022.
Graph Unlearning[pdf] [code] -
On the Necessity of Auditable Algorithmic Definitions for Machine Unlearning. ACM CCS 2022.
Auditable Unlearning[pdf] -
Machine Unlearning of Features and Labels. NDSS 2023.
Influence Function to achieve unlearning[pdf] [code] -
A Duty to Forget, a Right to be Assured? Exposing Vulnerabilities in Machine Unlearning Services. NDSS 2024.
The vulnerabilities in machine unlearning[pdf] [code] -
ERASER: Machine Unlearning in MLaaS via an Inference Serving-Aware Approach. CCS 2024.
Machine unlearning via an inference-serving-aware approach[pdf] -
Rectifying Privacy and Efficacy Measurements in Machine Unlearning: A New Inference Attack Perspective. USENIX Security 2025. [pdf]
-
Data Duplication: A Novel Multi-Purpose Attack Paradigm in Machine Unlearning. USENIX Security 2025. [pdf]
-
Towards Lifecycle Unlearning Commitment Management: Measuring Sample-level Unlearning Completeness. USENIX Security 2025. [pdf]
-
Split Unlearning. ACM CCS 2025. [pdf]
-
Rethinking Machine Unlearning in Image Generation Models. ACM CCS 2025. [pdf]
-
Prototype Surgery: Tailoring Neural Prototypes via Soft Labels for Efficient Machine Unlearning. ACM CCS 2025. [pdf]
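The sharding approach of the "Machine Unlearning" entry above (SISA) makes forgetting cheap by construction: one model per disjoint shard, aggregated by majority vote, so unlearning a sample retrains only its shard. A minimal bookkeeping sketch:

```python
import torch

def make_shards(num_samples, num_shards=5, seed=0):
    # SISA-style sharding (sketch): split the training set into disjoint
    # shards; each shard trains its own model.
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_samples, generator=g)
    return [perm[i::num_shards] for i in range(num_shards)]

def shard_to_retrain(shards, sample_idx):
    # Locate the single shard whose model must be retrained to unlearn
    # the given sample; all other shard models stay untouched.
    for s, idx in enumerate(shards):
        if (idx == sample_idx).any():
            return s
```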
-
Are Attribute Inference Attacks Just Imputation?. ACM CCS 2022.
Attribute inference attack via neurons identified with the data[pdf] [code] -
Feature Inference Attack on Shapley Values. ACM CCS 2022.
Attribute Inference Attack using shapley values[pdf] -
QuerySnout: Automating the Discovery of Attribute Inference Attacks against Query-Based Systems. ACM CCS 2022.
Attribute Inference detection[pdf] -
Disparate Privacy Vulnerability: Targeted Attribute Inference Attacks and Defenses. USENIX Security 2025. [pdf]
- SNAP: Efficient Extraction of Private Properties with Poisoning. IEEE S&P 2023.
Stronger Property Inference Attack by poisoning the data[pdf] [code]
- SoK: Privacy-Preserving Data Synthesis. IEEE S&P 2024.
Privacy-Preserving Data Synthesis[pdf] [website]
-
ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning. NDSS 2024.
Dataset auditing in deep reinforcement learning[pdf] [code] -
SoK: Dataset Copyright Auditing in Machine Learning Systems. IEEE S&P 2025.
Dataset copyright[pdf]
-
Exploring Connections Between Active Learning and Model Extraction. USENIX Security 2020.
Active Learning[pdf] -
High Accuracy and High Fidelity Extraction of Neural Networks. USENIX Security 2020.
Fidelity[pdf] -
DRMI: A Dataset Reduction Technology based on Mutual Information for Black-box Attacks. USENIX Security 2021.
Query data selection method to reduce the number of queries (see the extraction sketch below)[pdf] -
Entangled Watermarks as a Defense against Model Extraction. USENIX Security 2021.
Backdoor as watermark against model extraction[pdf] -
CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples. NDSS 2020.
Adversarial Example to strengthen model stealing[pdf] -
Teacher Model Fingerprinting Attacks Against Transfer Learning. USENIX Security 2022.
Teacher model fingerprinting[pdf] -
StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning. ACM CCS 2022.
Model Stealing attack in encoder[pdf] -
D-DAE: Defense-Penetrating Model Extraction Attacks. IEEE S&P 2023.
Meta-classifier to identify the deployed defense and a generative model to reduce the noise[pdf] -
SoK: Neural Network Extraction Through Physical Side Channels. USENIX Security 2024.
Physical Side Channel-based model extraction[pdf] -
SoK: All You Need to Know About On-Device ML Model Extraction - The Gap Between Research and Practice. USENIX Security 2024.
On-device model extraction[pdf]
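Most extraction attacks above share a distillation core: query the victim, then fit a local surrogate to its outputs. A minimal sketch with illustrative names; the listed papers add query selection (DRMI), adversarial queries (CloudLeak), or defense penetration (D-DAE) on top:

```python
import torch
import torch.nn.functional as F

def extract_model(victim, surrogate, queries, epochs=10, lr=1e-3):
    # Model extraction (sketch): collect the victim's soft labels once,
    # then train the surrogate to match them (knowledge distillation).
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    with torch.no_grad():
        soft_labels = victim(queries).softmax(dim=-1)
    for _ in range(epochs):
        opt.zero_grad()
        log_probs = surrogate(queries).log_softmax(dim=-1)
        loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
        loss.backward()
        opt.step()
    return surrogate
```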
-
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding. IEEE S&P 2021.
Encode secret message into LM[pdf] -
Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation. USENIX Security 2023.
Inject dummy neurons into the model to break the white-box model watermark[pdf] -
MEA-Defender: A Robust Watermark against Model Extraction Attack. IEEE S&P 2024.
Backdoor as watermark (see the verification sketch below)[pdf] [code] -
SSL-WM: A Black-Box Watermarking Approach for Encoders Pre-trained by Self-Supervised Learning. NDSS 2024.
Watermark on self-supervised learning[pdf] [code] -
Watermarking Language Models for Many Adaptive Users. IEEE S&P 2025.
Watermark on LLM[pdf] [code] -
SoK: Watermarking for AI-Generated Content. IEEE S&P 2025. [pdf]
-
Provably Robust Multi-bit Watermarking for AI-generated Text. USENIX Security 2025. [pdf] [code]
-
AUDIO WATERMARK: Dynamic and Harmless Watermark for Black-box Voice Dataset Copyright Protection. USENIX Security 2025. [pdf]
-
AudioMarkNet: Audio Watermarking for Deepfake Speech Detection. USENIX Security 2025. [pdf]
-
Towards Understanding and Enhancing Security of Proof-of-Training for DNN Model Ownership Verification. USENIX Security 2025. [pdf]
-
LightShed: Defeating Perturbation-based Image Copyright Protections. USENIX Security 2025. [pdf]
-
A Crack in the Bark: Leveraging Public Knowledge to Remove Tree-Ring Watermarks. USENIX Security 2025. [pdf]
-
Proof-of-Learning: Definitions and Practice. IEEE S&P 2021.
Prove the ownership of model parameters[pdf] -
SoK: How Robust is Image Classification Deep Neural Network Watermarking?. IEEE S&P 2022.
Survey of DNN watermarking[pdf] -
Copy, Right? A Testing Framework for Copyright Protection of Deep Learning Models. IEEE S&P 2022.
Calculate model similarity by generating test examples[pdf] [code] -
SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders. ACM CCS 2022.
Watermarking in encoder[pdf] -
RAI2: Responsible Identity Audit Governing the Artificial Intelligence. NDSS 2023.
Model and Data auditing in AI[pdf] [code] -
ActiveDaemon: Unconscious DNN Dormancy and Waking Up via User-specific Invisible Token. NDSS 2024.
Protecting DNN models by specific user tokens[pdf] [code] -
THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models. USENIX Security 2025. [pdf]
- PublicCheck: Public Integrity Verification for Services of Run-time Deep Models. IEEE S&P 2023.
Model verification via crafted query[pdf]
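Several ownership schemes above (backdoor-based watermarks, proof-of-learning, crafted-query verification) reduce at verification time to the same statistical test: does a suspect model answer a secret verification set the way only the protected model should? A minimal sketch of such a trigger-set check follows; the threshold is an illustrative assumption, not a value from any of these papers.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def verify_ownership(suspect: nn.Module, triggers: torch.Tensor,
                     trigger_labels: torch.Tensor,
                     threshold: float = 0.8) -> bool:
    """Flag the suspect model as derived from ours if it reproduces the
    secret trigger labels far above chance. The 0.8 threshold is an
    illustrative assumption; real schemes calibrate it statistically."""
    preds = suspect(triggers).argmax(dim=1)
    agreement = (preds == trigger_labels).float().mean().item()
    return agreement >= threshold
```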
-
Prompt Inversion Attack against Collaborative Inference of Large Language Models. IEEE S&P 2025.
Prompt inversion attack[pdf] -
On the Effectiveness of Prompt Stealing Attacks on In-the-Wild Prompts. IEEE S&P 2025.
Prompt stealing attack[pdf] -
PRSA: Prompt Stealing Attacks against Real-World Prompt Services. USENIX Security 2025.
Prompt stealing attack[pdf] -
Cross-Modal Prompt Inversion: Unifying Threats to Text and Image Generative AI Models. USENIX Security 2025.
Prompt inversion attack[pdf] -
Prompt Obfuscation for Large Language Models. USENIX Security 2025.
Prompt defense[pdf] -
Prompt Inference Attack on Distributed Large Language Model Inference Frameworks. ACM CCS 2025. [pdf]
-
Codebreaker: Dynamic Extraction Attacks on Code Language Models. IEEE S&P 2025.
Personal information extraction in code LLMs[pdf] -
LLMmap: Fingerprinting for Large Language Models. USENIX Security 2025.
LLM fingerprinting[pdf] -
Unlocking the Power of Differentially Private Zeroth-order Optimization for Fine-tuning LLMs. USENIX Security 2025.
Differentially private zeroth-order fine-tuning for LLMs[pdf] -
Depth Gives a False Sense of Privacy: LLM Internal States Inversion. USENIX Security 2025. [pdf]
-
Evaluating LLM-based Personal Information Extraction and Countermeasures. USENIX Security 2025. [pdf]
-
PrivacyXray: Detecting Privacy Breaches in LLMs through Semantic Consistency and Probability Certainty. USENIX Security 2025. [pdf]
-
Synthetic Artifact Auditing: Tracing LLM-Generated Synthetic Data Usage in Downstream Applications. USENIX Security 2025. [pdf]
-
Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems. USENIX Security 2025. [pdf]
-
Effective PII Extraction from LLMs through Augmented Few-Shot Learning. USENIX Security 2025. [pdf]
-
Private Investigator: Extracting Personally Identifiable Information from Large Language Models Using Optimized Prompts. USENIX Security 2025. [pdf]
-
Fawkes: Protecting Privacy against Unauthorized Deep Learning Models. USENIX Security 2020.
Protect Face Privacy[pdf] [code] -
Automatically Detecting Bystanders in Photos to Reduce Privacy Risks. IEEE S&P 2020.
Detecting bystanders[pdf] -
Characterizing and Detecting Non-Consensual Photo Sharing on Social Networks. IEEE S&P 2020.
Detecting non-consensual photo sharing[pdf] -
Fairness Properties of Face Recognition and Obfuscation Systems. USENIX Security 2023.
Fairness in face recognition and obfuscation systems[pdf] [code]
-
SWIFT: Super-fast and Robust Privacy-Preserving Machine Learning. USENIX Security 2021. [pdf]
-
BLAZE: Blazing Fast Privacy-Preserving Machine Learning. NDSS 2020. [pdf]
-
Bicoptor: Two-round Secure Three-party Non-linear Computation without Preprocessing for Privacy-preserving Machine Learning. IEEE S&P 2023. [pdf]
-
Ents: An Efficient Three-party Training Framework for Decision Trees by Communication Optimization. ACM CCS 2024. [pdf]
- Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning. NDSS 2020. [pdf]
-
Cerebro: A Platform for Multi-Party Cryptographic Collaborative Learning. USENIX Security 2021. [pdf] [code]
-
Private, Efficient, and Accurate: Protecting Models Trained by Multi-party Learning with Differential Privacy. IEEE S&P 2023. [pdf]
-
MPCDiff: Testing and Repairing MPC-Hardened Deep Learning Models. NDSS 2023. [pdf] [code]
-
Pencil: Private and Extensible Collaborative Learning without the Non-Colluding Assumption. NDSS 2024. [pdf] [code]
-
Securely Training Decision Trees Efficiently. ACM CCS 2024. [pdf]
-
CoGNN: Towards Secure and Efficient Collaborative Graph Learning. ACM CCS 2024. [pdf]
-
SoK: Cryptographic Neural-Network Computation. IEEE S&P 2023. [pdf]
-
From Individual Computation to Allied Optimization: Remodeling Privacy-Preserving Neural Inference with Function Input Tuning. IEEE S&P 2024. [pdf]
-
BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers. IEEE S&P 2024. [pdf] [code]
-
Flamingo: Multi-Round Single-Server Secure Aggregation with Applications to Private Federated Learning. IEEE S&P 2023. [pdf] [code]
-
ELSA: Secure Aggregation for Federated Learning with Malicious Actors. IEEE S&P 2023. [pdf] [code]
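The secure-aggregation entries above (Flamingo, ELSA) build on the classic mask-cancelling idea: each client pair shares a seed, one adds the derived mask and the other subtracts it, so the server's sum reveals only the aggregate. The sketch below is a semi-honest, no-dropout toy version under those assumptions; real protocols add key agreement, dropout recovery, and malicious security.

```python
import random

MOD = 2**32  # work in a finite group so masks wrap around

def masked_update(client_id, update, peers, shared_seeds):
    """Mask an integer-encoded update with pairwise PRG masks.
    `shared_seeds` maps frozenset({i, j}) -> seed agreed by that pair."""
    masked = [x % MOD for x in update]
    for peer in peers:
        if peer == client_id:
            continue
        rng = random.Random(shared_seeds[frozenset((client_id, peer))])
        sign = 1 if client_id < peer else -1  # one adds, the other subtracts
        masked = [(m + sign * rng.randrange(MOD)) % MOD for m in masked]
    return masked

def aggregate(masked_updates):
    # Pairwise masks cancel: the sum equals the sum of true updates mod MOD.
    dim = len(masked_updates[0])
    return [sum(u[i] for u in masked_updates) % MOD for i in range(dim)]
```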
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models. USENIX Security 2022.
Membership inference attack. Model inversion. Attribute inference. Model stealing[pdf]
- SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning. IEEE S&P 2023.
Systematizing privacy risks using a game framework[pdf]
- Federated Boosted Decision Trees with Differential Privacy. ACM CCS 2022.
Federated learning with tree models under DP[pdf]
-
Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering. IEEE S&P 2023.
Spectral DP[pdf] -
Bounded and Unbiased Composite Differential Privacy. IEEE S&P 2024.
Composite DP[pdf] [code] -
Cohere: Managing Differential Privacy in Large Scale Systems. IEEE S&P 2024.
Unified DP in large system[pdf] [code] -
You Can Use But Cannot Recognize: Preserving Visual Privacy in Deep Neural Networks. NDSS 2024.
DP in image recognition[pdf] [code]
-
Locally Differentially Private Frequency Estimation Based on Convolution Framework. IEEE S&P 2023. [pdf]
-
Data Poisoning Attacks to Locally Differentially Private Frequent Itemset Mining Protocols. ACM CCS 2024. [pdf]
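The LDP entries above build on local randomizers such as randomized response, where each user perturbs their own value before reporting and the aggregator corrects for the known noise. Below is a minimal sketch for a single binary attribute (basic randomized response only, not the convolution framework or the itemset protocols above).

```python
import math
import random

def randomize(bit: int, epsilon: float) -> int:
    """Report truthfully with probability e^eps / (e^eps + 1); this
    randomizer satisfies epsilon-LDP for a single binary attribute."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p_truth else 1 - bit

def estimate_frequency(reports: list, epsilon: float) -> float:
    """Unbiased server-side estimate of the true fraction of 1-bits."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = p*f + (1-p)*(1-f)  =>  f = (observed - (1-p)) / (2p - 1)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```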
- PLeak: Prompt Leaking Attacks against Large Language Model Applications. ACM CCS 2024.
Stealing system prompts[pdf] [code]
This list is mainly maintained by Ping He from NESA Lab.
Contributions to this repository are very welcome!
Markdown format:
**Paper Name**. Conference Year. `Keywords` [[pdf](pdf_link)] [[code](code_link)]
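For example (the links below are placeholders, not a real entry):
**Example Paper Title**. USENIX Security 2025. `Model Extraction` [[pdf](https://example.org/paper.pdf)] [[code](https://github.com/example/repo)]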
To the extent possible under law, gnipping holds all copyright and related or neighboring rights to this repository.
