
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized; distillation trains a compact "student" model to reproduce the behavior of a large "teacher", frequently preserving much of the teacher's accuracy at far lower inference cost.
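In its most common form, distillation optimizes a weighted sum of a hard-label loss and a soft-target loss on temperature-softened logits. Below is a minimal PyTorch sketch of that classic objective (after Hinton et al., 2015); the function name `distillation_loss` and the defaults `T=4.0` and `alpha=0.5` are illustrative assumptions, not taken from any paper listed below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,        # temperature; illustrative default
                      alpha: float = 0.5):   # soft/hard mixing weight; illustrative
    """Classic logit distillation: KL(teacher || student) on temperature-softened
    distributions, plus ordinary cross-entropy on the ground-truth labels."""
    # Soft-target term. F.kl_div expects log-probabilities as input and
    # probabilities as target; the T*T factor rescales gradients so the soft
    # term stays comparable to the hard term as T grows.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: standard cross-entropy against the labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In a typical training loop, the teacher runs in eval mode under `torch.no_grad()` and only the student's parameters receive gradients.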

Papers

Showing papers 401–450 of 4,240

| Title | Status | Hype |
|---|---|---|
| Backdoor Attacks on Self-Supervised Learning | Code | 1 |
| Backdoor Cleansing with Unlabeled Data | Code | 1 |
| A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance | Code | 1 |
| End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation | Code | 1 |
| Enhancing and Adapting in the Clinic: Source-free Unsupervised Domain Adaptation for Medical Image Enhancement | Code | 1 |
| Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Code | 1 |
| A semi-supervised Teacher-Student framework for surgical tool detection and localization | Code | 1 |
| Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification | Code | 1 |
| CrossKD: Cross-Head Knowledge Distillation for Object Detection | Code | 1 |
| Action knowledge for video captioning with graph neural networks | Code | 1 |
| C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval | Code | 1 |
| Cross-Layer Distillation with Semantic Calibration | Code | 1 |
| Aligned Structured Sparsity Learning for Efficient Image Super-Resolution | Code | 1 |
| Evolving Search Space for Neural Architecture Search | Code | 1 |
| Bayes Conditional Distribution Estimation for Knowledge Distillation Based on Conditional Mutual Information | Code | 1 |
| CaMEL: Mean Teacher Learning for Image Captioning | Code | 1 |
| Exploring Navigation Maps for Learning-Based Motion Prediction | Code | 1 |
| BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration | Code | 1 |
| CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection | Code | 1 |
| The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image | Code | 1 |
| FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction | Code | 1 |
| CrossMatch: Enhance Semi-Supervised Medical Image Segmentation with Perturbation Strategies and Knowledge Distillation | Code | 1 |
| Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model | Code | 1 |
| FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Code | 1 |
| BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | Code | 1 |
| Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement | Code | 1 |
| Better Estimation of the KL Divergence Between Language Models | Code | 1 |
| Camera clustering for scalable stream-based active distillation | Code | 1 |
| Feature Structure Distillation with Centered Kernel Alignment in BERT Transferring | Code | 1 |
| FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection | Code | 1 |
| Curriculum Temperature for Knowledge Distillation | Code | 1 |
| BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection | Code | 1 |
| BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for BEV 3D Object Detection | Code | 1 |
| SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection | Code | 1 |
| Federated Learning with Extremely Noisy Clients via Negative Distillation | Code | 1 |
| FedMD: Heterogenous Federated Learning via Model Distillation | Code | 1 |
| Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Code | 1 |
| Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression | Code | 1 |
| Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones | Code | 1 |
| AlphaFold Distillation for Protein Design | Code | 1 |
| Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation | Code | 1 |
| FerKD: Surgical Label Adaptation for Efficient Distillation | Code | 1 |
| Contrastive Model Inversion for Data-Free Knowledge Distillation | Code | 1 |
| FitNets: Hints for Thin Deep Nets | Code | 1 |
| BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation | Code | 1 |
| AltDiffusion: A Multilingual Text-to-Image Diffusion Model | Code | 1 |
| Bidirectional Distillation for Top-K Recommender System | Code | 1 |
| Bi-directional Weakly Supervised Knowledge Distillation for Whole Slide Image Classification | Code | 1 |
| Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1 |
| Contrastive Representation Distillation | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified |