
Model extraction

Model extraction attacks, also known as model stealing attacks, aim to extract the parameters of a target model. Ideally, the adversary can steal and replicate a model whose performance closely matches that of the target.
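
As a toy illustration of the definition above, the sketch below assumes the simplest possible setting: a linear target that answers arbitrary queries with raw scores. The names `make_target` and `extract_linear` are hypothetical and exist only for this example. In this idealized case the parameters can be read off from a handful of queries; attacks on larger models typically have to settle for an approximate replica.

```python
from typing import Callable, List, Tuple


def make_target(weights: List[float], bias: float) -> Callable[[List[float]], float]:
    """Hypothetical black-box target: a linear model with hidden parameters."""
    def predict(x: List[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return predict


def extract_linear(query: Callable[[List[float]], float], dim: int) -> Tuple[List[float], float]:
    """Recover (weights, bias) of a linear model using dim + 1 queries.

    Assumes the target is exactly linear and returns unrounded scores.
    """
    # Querying the zero vector returns the bias: f(0) = b.
    bias = query([0.0] * dim)
    weights = []
    for i in range(dim):
        # Querying the i-th unit vector returns w_i + b.
        unit = [1.0 if j == i else 0.0 for j in range(dim)]
        weights.append(query(unit) - bias)
    return weights, bias


# The adversary only ever calls target(...); it never sees the parameters.
target = make_target([0.5, -2.0, 3.0], 1.0)
stolen_w, stolen_b = extract_linear(target, dim=3)

# The replica is rebuilt purely from the query responses.
replica = make_target(stolen_w, stolen_b)
```

Here `replica` and `target` agree on every input because the assumed target is exactly linear; rounding of outputs or returning labels only would break this exact recovery.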

Papers


Showing 11–20 of 176 papers

Title	Date	Tasks	Status	Hype	Score
Neural Honeytrace: A Robust Plug-and-Play Watermarking Framework against Model Extraction Attacks	Jan 16, 2025	Model extraction	Code Available	1	5
Protecting Language Generation Models via Invisible Watermarking	Feb 6, 2023	Model extraction, Text Generation	Code Available	1	5
"Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation	Sep 4, 2024	Language Modeling, Language Modelling	Code Available	1	5
Entangled Watermarks as a Defense against Model Extraction	Feb 27, 2020	model	Code Available	1	5
ATOM: A Framework of Detecting Query-Based Model Extraction Attacks for Graph Neural Networks	Mar 20, 2025	Model extraction	Code Available	1	5
FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction	Dec 3, 2022	Federated Learning, model	Code Available	1	5
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction	Sep 1, 2021	Data Poisoning, Knowledge Distillation	Code Available	1	5
MEA-Defender: A Robust Watermark against Model Extraction Attack	Jan 26, 2024	Model extraction, Self-Supervised Learning	Code Available	1	5
Efficient and Effective Model Extraction	Sep 21, 2024	Benchmarking, model	Code Available	0	5
VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces	Aug 4, 2024	image-classification, Image Classification	Code Available	0	5



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	three-step-original	Exact Match	0.17	—	Unverified