SOTAVerified

Model extraction

Model extraction attacks, also known as model stealing attacks, attempt to recover the parameters or functionality of a target model, typically by repeatedly querying it. Ideally, the adversary ends up with a replica whose performance closely matches that of the target model.
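The attack loop described above (query the target, record its responses, fit a replica on the query/response pairs) can be sketched in a few lines. This is a minimal illustration under assumed names: `query_target` stands in for a black-box prediction API, and the scikit-learn models and synthetic data are placeholders, not the setup of any paper listed on this page.

```python
# Minimal sketch of a query-based model extraction attack (illustrative;
# not the method of any specific paper listed below).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the victim model; the attacker never sees its internals.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_attack, y_train, _ = train_test_split(X, y, test_size=0.5, random_state=0)
target = RandomForestClassifier(random_state=0).fit(X_train, y_train)

def query_target(x):
    """Black-box oracle: returns labels only, as a prediction API would."""
    return target.predict(x)

# Extraction step: label attacker-chosen queries with the oracle, then
# fit a surrogate ("stolen") model on the query/response pairs.
surrogate = DecisionTreeClassifier(random_state=0)
surrogate.fit(X_attack, query_target(X_attack))

# Fidelity: how often the stolen model agrees with the target on fresh inputs.
X_fresh = np.random.default_rng(1).normal(size=(500, 20))
fidelity = (surrogate.predict(X_fresh) == target.predict(X_fresh)).mean()
print(f"surrogate/target agreement: {fidelity:.1%}")
```

Fidelity, the surrogate's agreement rate with the target, is the usual success measure for such attacks, alongside the surrogate's raw task accuracy.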

Papers

Showing 1–25 of 176 papers

Title | Status | Hype
Safety at Scale: A Comprehensive Survey of Large Model Safety | Code | 3
Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark | Code | 1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1
Protecting Language Generation Models via Invisible Watermarking | Code | 1
Neural Honeytrace: A Robust Plug-and-Play Watermarking Framework against Model Extraction Attacks | Code | 1
MEME: Generating RNN Model Explanations via Model Extraction | Code | 1
Data-Free Model Extraction | Code | 1
MARLeME: A Multi-Agent Reinforcement Learning Model Extraction Library | Code | 1
Now You See Me (CME): Concept-based Model Extraction | Code | 1
Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service | Code | 1
ATOM: A Framework of Detecting Query-Based Model Extraction Attacks for Graph Neural Networks | Code | 1
"Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation | Code | 1
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable! | Code | 1
MEA-Defender: A Robust Watermark against Model Extraction Attack | Code | 1
Cryptanalytic Extraction of Neural Network Models | Code | 1
Entangled Watermarks as a Defense against Model Extraction | Code | 1
FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction | Code | 1
An anatomy-based V1 model: Extraction of Low-level Features, Reduction of distortion and a V1-inspired SOM | – | 0
Adversarial Exploitation of Policy Imitation | – | 0
A Knowledge Representation Approach to Automated Mathematical Modelling | – | 0
AUTOLYCUS: Exploiting Explainable AI (XAI) for Model Extraction Attacks against Interpretable Models | – | 0
A Desynchronization-Based Countermeasure Against Side-Channel Analysis of Neural Networks | – | 0
Beyond Labeling Oracles: What does it mean to steal ML models? | – | 0
Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs | – | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | three-step-original | Exact Match | 0.17 | – | Unverified