NDSS
Adversarial Example Defenses
- Adversarial Robustness for Tabular Data through Cost and Utility Awareness
- BARS: Local Robustness Certification for Deep Learning based Traffic Analysis Systems (traffic analysis related)
Backdoor Attacks
S&P
AI Ethics
AI & Differential Privacy (see the sketch after this list)
- A Theory to Instruct Differentially-Private Learning via Clipping Bias Reduction
- Continual Observation under User-level Differential Privacy
- Locally Differentially Private Frequency Estimation Based on Convolution Framework
- Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering
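
To situate the differentially private learning papers above (clipping bias reduction, spectral perturbation, etc.), below is a minimal, generic sketch of the common mechanism they build on: per-example gradient clipping followed by calibrated Gaussian noise, in the style of DP-SGD. All names, shapes, and constants are illustrative assumptions, not taken from any listed paper.

```python
# Generic DP-SGD-style noisy gradient aggregation (illustrative sketch only).
import numpy as np

def noisy_mean_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                        rng=np.random.default_rng(0)):
    """per_example_grads: array of shape (batch, dim)."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Clip each example's gradient to L2 norm <= clip_norm.
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    summed = clipped.sum(axis=0)
    # Gaussian noise calibrated to the clipping bound (sensitivity = clip_norm).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)

# Example: 8 per-example gradients of dimension 4.
grads = np.random.default_rng(1).normal(size=(8, 4))
print(noisy_mean_gradient(grads))
```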
Adversarial Example Attacks
- AI-Guardian: Defeating Adversarial Attacks using Backdoors
- SoK: Certified Robustness for Deep Neural Networks
Backdoor Attacks
- Redeem Myself: Purifying Backdoors in Deep Learning Models using Self Attention Distillation
- Disguising Attacks with Explanation-Aware Backdoors
Inference Attacks (see the sketch after this list)
- SNAP: Efficient Extraction of Private Properties with Poisoning
- Accuracy-Privacy Trade-off in Deep Ensemble: A Membership Inference Perspective
- SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning
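
As a reference point for the inference-attack papers above, here is a textbook loss-threshold membership inference baseline: samples on which the model has unusually low loss are guessed to be training members. The losses and threshold are purely illustrative, not taken from any listed paper.

```python
# Generic loss-threshold membership inference baseline (illustrative sketch only).
import numpy as np

def membership_guess(losses, threshold):
    """Return 1 (guessed training member) if the model's loss is below threshold."""
    return (np.asarray(losses) < threshold).astype(int)

# Hypothetical losses: members tend to have lower loss than non-members.
member_losses     = np.array([0.05, 0.10, 0.02, 0.20])
non_member_losses = np.array([0.90, 0.45, 1.30, 0.60])
preds = membership_guess(np.concatenate([member_losses, non_member_losses]), threshold=0.3)
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
print("attack accuracy:", (preds == labels).mean())
```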
Model Extraction Attacks
Machine Learning Interpretability
USENIX
Adversarial Example Attacks (see the sketch after this list)
- KENKU: Towards Efficient and Stealthy Black-box Adversarial Attacks against ASR Systems
- Towards Targeted Obfuscation of Adversarial Unsafe Images using Reconstruction and Counterfactual Super Region Attribution Explainability
- TPatch: A Triggered Physical Adversarial Patch
- CAPatch: Physical Adversarial Patch against Image Captioning Systems
- Hard-label Black-box Universal Adversarial Patch Attack
- The Space of Adversarial Strategies
- X-Adv: Physical Adversarial Object Attacks against X-ray Prohibited Item Detection
- SMACK: Semantically Meaningful Adversarial Audio Attack
- URET: Universal Robustness Evaluation Toolkit (for Evasion)
- Precise and Generalized Robustness Certification for Neural Networks
- DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing
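
For orientation, the attacks above (audio, physical patches, black-box, plus the accompanying certification work) extend the classic gradient-based evasion idea. Below is a minimal FGSM-style sketch against a toy logistic-regression model; it is a textbook illustration with made-up weights, not the method of any listed paper.

```python
# Generic FGSM-style evasion step on binary logistic regression (illustrative sketch only).
import numpy as np

def fgsm(x, y, w, b, eps=0.1):
    """One FGSM step maximizing cross-entropy loss of p = sigmoid(w.x + b)."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad_x = (p - y) * w          # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)

w = np.array([1.5, -2.0, 0.5]); b = 0.1   # toy model parameters
x = np.array([0.2, 0.4, -0.3]); y = 1.0   # input and its true label
x_adv = fgsm(x, y, w, b, eps=0.2)
print("clean score:", w @ x + b, "adversarial score:", w @ x_adv + b)
```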
Membership Inference Attacks
Backdoor Attacks
- Towards A Proactive ML Approach for Detecting Backdoor Poison Samples
- PELICAN: Exploiting Backdoors of Naturally Trained Deep Learning Models In Binary Code Analysis (automatic backdoor discovery in binary code analysis)
- A Data-free Backdoor Injection Approach in Neural Networks
- Sparsity Brings Vulnerabilities: Exploring New Metrics in Backdoor Attacks
- Aliasing Backdoor Attacks on Pre-trained Models
- ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms
- VILLAIN: Backdoor Attacks Against Vertical Split Learning
- FreeEagle: Detecting Complex Neural Trojans in Data-Free Cases
Poisoning Attacks
- Meta-Sift: How to Sift Out a Clean Subset in the Presence of Data Poisoning?
- Fine-grained Poisoning Attack to Local Differential Privacy Protocols for Mean and Variance Estimation
Bit-flip Attacks
- Aegis: Mitigating Targeted Bit-flip Attacks against Deep Neural Networks
- NeuroPots: Realtime Proactive Defense against Bit-Flip Attacks in Neural Networks
Differential Privacy (see the sketch after this list)
- What Are the Chances? Explaining the Epsilon Parameter in Differential Privacy
- Tight Auditing of Differentially Private Machine Learning
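
Since one paper above focuses on explaining the epsilon parameter, here is a minimal, generic illustration of how epsilon trades noise for privacy via the Laplace mechanism on a counting query (sensitivity 1). The query and values are hypothetical and not drawn from the listed papers.

```python
# Generic Laplace mechanism for a counting query (illustrative sketch only).
import numpy as np

def laplace_count(true_count, epsilon, rng=np.random.default_rng(0)):
    # A counting query changes by at most 1 when one record is added/removed,
    # so the noise scale is sensitivity / epsilon = 1 / epsilon.
    return true_count + rng.laplace(scale=1.0 / epsilon)

for eps in (0.1, 1.0, 10.0):   # smaller epsilon -> more noise, stronger privacy
    print(f"epsilon={eps}: noisy count = {laplace_count(1000, eps):.1f}")
```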
Model Watermarking
Others
- CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot
- IvySyn: Automated Vulnerability Discovery in Deep Learning Frameworks
- “Security is not my field, I’m a stats guy”: A Qualitative Root Cause Analysis of Barriers to Adversarial Machine Learning Defenses in Industry (adversarial training related)
CCS
Backdoor Attacks
Model Stealing
- Stealing the Decoding Algorithms of Language Models (stealing the decoding algorithms and hyperparameters of language models)
- Stolen Risks of Models with Security Properties (privacy risk verification for reinforcement learning models)
Differential Privacy & Machine Learning
- DPMLBench: Holistic Evaluation of Differentially Private Machine Learning
- Geometry of Sensitivity: Twice Sampling and Hybrid Clipping in Differential Privacy with Optimal Gaussian Noise and Application to Deep Learning
- Blink: Link Local Differential Privacy in Graph Neural Networks via Bayesian Estimation (graph neural networks)
- DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass (language models)
Others
- Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks (black-box attack enhancement strategies)
- Prediction Privacy in Distributed Multi-Exit Neural Networks: Vulnerabilities and Solutions
- Devil in Disguise: Breaching Graph Neural Networks Privacy through Infiltration (attacks on graph neural networks)
- Evading Watermark based Detection of AI-Generated Content (evading watermark-based detection of generative-model content)
- Interactive Proofs For Differentially Private Counting (interactive proofs for differential privacy; possibly no AI-related content)
- SalsaPicante: A Machine Learning Attack on LWE with Binary Secrets (using machine learning to attack LWE, a lattice problem underlying post-quantum cryptography)
- Efficient Query-Based Attack against ML-Based Android Malware Detection under Zero Knowledge Setting (attacking malware detection models)
- “Get in Researchers; We’re Measuring Reproducibility”: A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences (paper reproducibility check)
- DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models (detecting generated images)
- Attack Some while Protecting Others: Selective Attack Strategies for Attacking and Protecting Multiple Concepts
- Unforgeability in Stochastic Gradient Descent (forgeability of SGD executions)