Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). Large models, such as very deep neural networks or ensembles of many models, have a higher knowledge capacity than small models, but this capacity may not be fully utilized, which is why a smaller model can often be trained to reproduce much of the larger model's behavior.
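As a rough illustration of how such a transfer can be set up, the sketch below implements one common formulation: the student is trained on a mix of the teacher's temperature-softened output distribution and the ground-truth labels. The page itself does not prescribe a particular loss, so the function name `distillation_loss`, the parameters `temperature` and `alpha`, and the toy logits are all assumptions made for this example.

```python
import numpy as np


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis, with temperature scaling."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Hypothetical helper: weighted sum of a soft-target and a hard-label term.

    alpha weights the soft-target term; (1 - alpha) weights the hard-label term.
    Both hyperparameter defaults are illustrative, not taken from the source.
    """
    # Soft-target term: KL(teacher || student) on temperature-softened outputs.
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    p_teacher = np.exp(log_p_teacher)
    kl = np.sum(p_teacher * (log_p_teacher - log_p_student), axis=-1).mean()

    # Hard-label term: ordinary cross-entropy of the student at temperature 1.
    log_p = log_softmax(student_logits)
    labels = np.asarray(labels)
    ce = -log_p[np.arange(len(labels)), labels].mean()

    # The T^2 factor keeps the soft-term gradients on a comparable scale
    # when the temperature changes.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    # Toy batch: 2 examples, 3 classes (values are made up for illustration).
    teacher = np.array([[5.0, 1.0, 0.0], [0.0, 4.0, 1.0]])
    student = np.array([[2.0, 1.0, 0.5], [0.5, 1.5, 1.0]])
    labels = np.array([0, 1])
    print(distillation_loss(student, teacher, labels))
```

The soft-target term is zero when the student's logits match the teacher's, so with `alpha=1.0` the loss measures only how far the student's softened predictions are from the teacher's. Many of the methods listed below replace or extend this output-level term, for example with feature-level or relation-level matching.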

Papers

Showing 3826–3850 of 4240 papers

Title	Date	Tasks	Status
Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems	May 1, 2018	Knowledge Distillation, Retrieval	Code Available
Distilling Global and Local Logits With Densely Connected Relations	Jan 1, 2021	image-classification, Image Classification	Code Available
UPFL: Unsupervised Personalized Federated Learning towards New Clients	Jul 29, 2023	Federated Learning, Knowledge Distillation	Code Available
GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning	Oct 20, 2024	Image Retrieval, Image-text Retrieval	Code Available
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding	Oct 25, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection	Jan 1, 2023	3D geometry, 3D Object Detection	Code Available
Distilling and Transferring Knowledge via cGAN-generated Samples for Image Classification and Regression	Apr 7, 2021	General Classification, image-classification	Code Available
MoMA: Momentum Contrastive Learning with Multi-head Attention-based Knowledge Distillation for Histopathology Image Analysis	Aug 31, 2023	Contrastive Learning, Knowledge Distillation	Code Available
Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation	Sep 20, 2022	Data Augmentation, Knowledge Distillation	Code Available
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples	May 13, 2023	Binarization, Knowledge Distillation	Code Available
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices	Jul 12, 2022	Emotion Recognition, Keyword Spotting	Code Available
Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding	Dec 27, 2023	3D Classification, 3D Shape Recognition	Code Available
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing	May 31, 2021	Knowledge Distillation, Unsupervised Pre-training	Code Available
Automatic Assignment of Radiology Examination Protocols Using Pre-trained Language Models with Knowledge Distillation	Sep 1, 2020	Data Augmentation, Knowledge Distillation	Code Available
Graph Knowledge Distillation to Mixture of Experts	Jun 17, 2024	Knowledge Distillation, Mixture-of-Experts	Code Available
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments	May 26, 2025	Data-free Knowledge Distillation, Federated Learning	Code Available
Graph Entropy Minimization for Semi-supervised Node Classification	May 31, 2023	Classification, Knowledge Distillation	Code Available
Rethinking Intermediate Layers design in Knowledge Distillation for Kidney and Liver Tumor Segmentation	Nov 28, 2023	Diagnostic, Knowledge Distillation	Code Available
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search	Jan 13, 2020	Knowledge Distillation, Neural Architecture Search	Code Available
Graph-based Knowledge Distillation by Multi-head Attention Network	Jul 4, 2019	Inductive Bias, Knowledge Distillation	Code Available
Gradient Knowledge Distillation for Pre-trained Language Models	Nov 2, 2022	Knowledge Distillation	Code Available
MSE-Optimal Neural Network Initialization via Layer Fusion	Jan 28, 2020	General Classification, Knowledge Distillation	Code Available
Automatic adaptation of object detectors to new domains using self-training	Apr 15, 2019	Domain Adaptation, Knowledge Distillation	Code Available
MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition	Aug 29, 2024	Face Recognition, Knowledge Distillation	Code Available
STKDRec: Spatial-Temporal Knowledge Distillation for Takeaway Recommendation	Dec 21, 2024	Knowledge Distillation, Knowledge Graphs	Code Available

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI (the six result tables under Benchmark Results appear to follow this order)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified