Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, that capacity is often not fully utilized, which means a compact student model can frequently be trained to approach the teacher's performance at a fraction of the inference cost.
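
In the classic response-based formulation (Hinton et al., 2015), the student is trained to match the teacher's temperature-softened output distribution in addition to the ground-truth labels. Below is a minimal sketch of that loss, assuming PyTorch; the temperature and weighting values are illustrative defaults, not settings taken from any paper listed on this page.

```python
# Minimal sketch of response-based knowledge distillation (Hinton et al., 2015).
# T (temperature) and alpha (loss weight) are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend a soft-target KL term with the standard cross-entropy term."""
    # Soften both distributions with temperature T. The T**2 factor keeps
    # gradient magnitudes comparable across temperature choices.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Usage: the teacher is frozen; only the student receives gradients.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
```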

Papers

Showing 876–900 of 4240 papers (page 36 of 170)

Title | Status | Hype
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1
Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression | Code | 1
I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval | Code | 1
Knowledge Distillation for Multi-task Learning | Code | 1
Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones | Code | 1
AlphaFold Distillation for Protein Design | Code | 1
Improved Feature Distillation via Projector Ensemble | Code | 1
Decoupled Kullback-Leibler Divergence Loss | Code | 1
DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection | Code | 1
Sequence-Level Knowledge Distillation | Code | 1
Serial Contrastive Knowledge Distillation for Continual Few-shot Relation Extraction | Code | 1
Decoupled Multimodal Distilling for Emotion Recognition | Code | 1
Improving Continual Relation Extraction by Distinguishing Analogous Semantics | Code | 1
Improving Knowledge Distillation via Category Structure | Code | 1
Simplified TinyBERT: Knowledge Distillation for Document Retrieval | Code | 1
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation | Code | 1
Improving Event Detection via Open-domain Trigger Knowledge | Code | 1
AltDiffusion: A Multilingual Text-to-Image Diffusion Model | Code | 1
Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition | Code | 1
Bidirectional Distillation for Top-K Recommender System | Code | 1
Always Clear Depth: Robust Monocular Depth Estimation under Adverse Weather | Code | 1
Bi-directional Weakly Supervised Knowledge Distillation for Whole Slide Image Classification | Code | 1
Deep Structured Instance Graph for Distilling Object Detectors | Code | 1
Reducing the Teacher-Student Gap via Spherical Knowledge Distillation | Code | 1
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified