SOTAVerified

Model extraction

Model extraction attacks, also known as model stealing attacks, aim to extract the parameters of a target model, or to replicate its functionality, typically by querying it through a prediction API. Ideally, the adversary obtains a stolen copy whose performance closely matches that of the target model.
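The query-then-replicate loop behind these attacks can be illustrated with a toy sketch. Everything below is hypothetical (a noiseless linear "victim" behind a `query_target` API, fit by ordinary least squares); real attacks target neural networks and need far more sophisticated query selection and surrogate training.

```python
import random

random.seed(0)  # reproducible attacker queries

# Hypothetical black-box target: the attacker can call query_target()
# but cannot see the weights. _SECRET_W/_SECRET_B stand in for the
# victim model's hidden parameters.
_SECRET_W, _SECRET_B = 2.5, -1.0

def query_target(x):
    """Simulates the victim's prediction API (output-only access)."""
    return _SECRET_W * x + _SECRET_B

# Step 1: query the API on attacker-chosen inputs.
xs = [random.uniform(-5.0, 5.0) for _ in range(100)]
ys = [query_target(x) for x in xs]

# Step 2: fit a surrogate on the query transcript (ordinary least squares).
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
w_hat = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
    (x - mx) ** 2 for x in xs
)
b_hat = my - w_hat * mx

def surrogate(x):
    """The stolen model: mimics query_target without seeing its internals."""
    return w_hat * x + b_hat
```

Because the toy target is noiseless and linear, the surrogate recovers the hidden parameters almost exactly; against real models, extraction fidelity instead degrades with query budget and model complexity, which is what most of the papers below study.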

Papers

Showing 1–50 of 176 papers

| Title | Status | Hype |
| --- | --- | --- |
| Safety at Scale: A Comprehensive Survey of Large Model Safety | Code | 3 |
| Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark | Code | 1 |
| ATOM: A Framework of Detecting Query-Based Model Extraction Attacks for Graph Neural Networks | Code | 1 |
| MEA-Defender: A Robust Watermark against Model Extraction Attack | Code | 1 |
| Protecting Language Generation Models via Invisible Watermarking | Code | 1 |
| Neural Honeytrace: A Robust Plug-and-Play Watermarking Framework against Model Extraction Attacks | Code | 1 |
| MEME: Generating RNN Model Explanations via Model Extraction | Code | 1 |
| Cryptanalytic Extraction of Neural Network Models | Code | 1 |
| Model Extraction and Adversarial Transferability, Your BERT is Vulnerable! | Code | 1 |
| Entangled Watermarks as a Defense against Model Extraction | Code | 1 |
| "Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation | Code | 1 |
| Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1 |
| FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction | Code | 1 |
| MARLeME: A Multi-Agent Reinforcement Learning Model Extraction Library | Code | 1 |
| Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service | Code | 1 |
| Now You See Me (CME): Concept-based Model Extraction | Code | 1 |
| Data-Free Model Extraction | Code | 1 |
| Stateful Detection of Model Extraction Attacks | Code | 0 |
| Safe and Robust Watermark Injection with a Single OoD Image | Code | 0 |
| Stealing Machine Learning Models via Prediction APIs | Code | 0 |
| Protecting Intellectual Property of Language Generation APIs with Lexical Watermark | Code | 0 |
| On the Difficulty of Defending Self-Supervised Learning against Model Extraction | Code | 0 |
| Defense Against Model Extraction Attacks on Recommender Systems | Code | 0 |
| VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces | Code | 0 |
| Process Extraction from Text: Benchmarking the State of the Art and Paving the Way for Future Challenges | Code | 0 |
| On the Effectiveness of Dataset Watermarking in Adversarial Settings | Code | 0 |
| Stealing and Evading Malware Classifiers and Antivirus at Low False Positive Conditions | Code | 0 |
| SAME: Sample Reconstruction against Model Extraction Attacks | Code | 0 |
| Beyond Slow Signs in High-fidelity Model Extraction | Code | 0 |
| Model extraction from counterfactual explanations | Code | 0 |
| MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models | Code | 0 |
| Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory | Code | 0 |
| ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data | Code | 0 |
| Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection | Code | 0 |
| MeaeQ: Mount Model Extraction Attacks with Efficient Queries | Code | 0 |
| Knowledge Distillation-Based Model Extraction Attack using GAN-based Private Counterfactual Explanations | Code | 0 |
| CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition | Code | 0 |
| Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data | Code | 0 |
| From Counterfactuals to Trees: Competitive Analysis of Model Extraction Attacks | Code | 0 |
| DAWN: Dynamic Adversarial Watermarking of Neural Networks | Code | 0 |
| GUIDO: A Hybrid Approach to Guideline Discovery & Ordering from Natural Language Texts | Code | 0 |
| Robust and Minimally Invasive Watermarking for EaaS | Code | 0 |
| Deep Neural Network Fingerprinting by Conferrable Adversarial Examples | Code | 0 |
| An Approach for Process Model Extraction By Multi-Grained Text Classification | Code | 0 |
| Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization | Code | 0 |
| Efficient and Effective Model Extraction | Code | 0 |
| FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout | Code | 0 |
| A Hard-Label Cryptanalytic Extraction of Non-Fully Connected Deep Neural Networks using Side-Channel Attacks | Code | 0 |
| Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data | Code | 0 |
Page 1 of 4

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | three-step-original | Exact Match | 0.17 | | Unverified |