Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity is often not fully utilized, so a compact student model can frequently recover much of the teacher's accuracy at a fraction of the inference cost. In the classic formulation, the student is trained to match the teacher's softened output distribution in addition to the ground-truth labels, as sketched below.
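
As a concrete illustration, here is a minimal sketch of the canonical soft-target loss from Hinton et al. (2015), assuming PyTorch; the function name and hyperparameter defaults are illustrative, not taken from any paper listed below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-target knowledge distillation loss (Hinton et al., 2015).

    Blends cross-entropy on the hard labels with a KL-divergence term
    that pulls the student's softened class distribution toward the
    teacher's. `temperature` and `alpha` are task-dependent knobs.
    """
    # Hard-label term: ordinary supervised cross-entropy.
    hard_loss = F.cross_entropy(student_logits, labels)

    # Soft-label term: KL divergence between the temperature-softened
    # teacher and student distributions.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    soft_loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    # The T^2 factor keeps the soft-target gradients on the same scale
    # as the hard-label term when the temperature changes.
    return alpha * hard_loss + (1.0 - alpha) * temperature**2 * soft_loss
```

A higher temperature flattens the teacher's distribution and exposes more of its inter-class similarity structure (the "dark knowledge" that one-hot labels discard), which is what gives the student access to information beyond the ground truth.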

Papers

Showing 901–950 of 4240 papers

Title | Status | Hype
BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation | Code | 1
Inter-Region Affinity Distillation for Road Marking Segmentation | Code | 1
Dice Semimetric Losses: Optimizing the Dice Score with Soft Labels | Code | 1
Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking | Code | 1
DGEKT: A Dual Graph Ensemble Learning Method for Knowledge Tracing | Code | 1
Decoupled Multimodal Distilling for Emotion Recognition | Code | 1
Discriminative and Consistent Representation Distillation | Code | 1
Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation | Code | 1
DFIL: Deepfake Incremental Learning by Exploiting Domain-invariant Forgery Clues | Code | 1
Directed Acyclic Transformer for Non-Autoregressive Machine Translation | Code | 1
Jaccard Metric Losses: Optimizing the Jaccard Index with Soft Labels | Code | 1
DeepAqua: Self-Supervised Semantic Segmentation of Wetland Surface Water Extent with SAR Images using Knowledge Distillation | Code | 1
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval | Code | 1
TinyGAN: Distilling BigGAN for Conditional Image Generation | Code | 1
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
DE-RRD: A Knowledge Distillation Framework for Recommender System | Code | 1
Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models | Code | 1
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation | Code | 1
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection | Code | 1
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket | Code | 1
Deep Graph-level Anomaly Detection by Glocal Knowledge Distillation | Code | 1
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer | Code | 1
Learning Cross-Lingual IR from an English Retriever | Code | 1
KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation | Code | 1
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning | Code | 1
KDAS: Knowledge Distillation via Attention Supervision Framework for Polyp Segmentation | Code | 1
Black-box Few-shot Knowledge Distillation | Code | 1
KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow | Code | 1
Adaptive Multi-Teacher Multi-level Knowledge Distillation | Code | 1
KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo | Code | 1
Knapsack Pruning with Inner Distillation | Code | 1
Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation | Code | 1
Towards Efficient 3D Object Detection with Knowledge Distillation | Code | 1
Deep Structured Instance Graph for Distilling Object Detectors | Code | 1
Understanding the Role of the Projector in Knowledge Distillation | Code | 1
Knowledge Condensation Distillation | Code | 1
Learning to Retrieve In-Context Examples for Large Language Models | Code | 1
LTE4G: Long-Tail Experts for Graph Neural Networks | Code | 1
Defocus Blur Detection via Depth Distillation | Code | 1
Deformation Flow Based Two-Stream Network for Lip Reading | Code | 1
Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability | Code | 1
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation | Code | 1
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs | Code | 1
Knowledge Distillation for BERT Unsupervised Domain Adaptation | Code | 1
Knowledge Distillation Based on Transformed Teacher Matching | Code | 1
Traffic Signal Control Using Lightweight Transformers: An Offline-to-Online RL Approach | Code | 1
Knowledge Distillation for Brain Tumor Segmentation | Code | 1
Knowledge Distillation for Feature Extraction in Underwater VSLAM | Code | 1
Dense Interspecies Face Embedding | Code | 1
Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation | Code | 1
Page 19 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | n/a | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | n/a | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | n/a | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | n/a | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | n/a | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | n/a | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | n/a | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | n/a | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | n/a | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | n/a | Unverified
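
In the Model column, T and S denote the teacher and student networks; rows marked Unverified carry only the numbers claimed in the corresponding papers. For reference, the following is a minimal sketch of the Top-1 accuracy computation these classification entries report, assuming a PyTorch model and evaluation loader; all names here are illustrative.

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cuda"):
    """Percentage of samples whose highest-scoring class matches the label."""
    model = model.to(device).eval()
    correct, total = 0, 0
    for images, labels in loader:
        # Predicted class = argmax over the classifier's output logits.
        preds = model(images.to(device)).argmax(dim=-1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total
```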

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | n/a | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | n/a | Unverified
3 | MV-MR (T: CLIP ViT-B/16, S: resnet50) | Top-1 accuracy (%) | 78.6 | n/a | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | n/a | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | n/a | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | n/a | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | n/a | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | n/a | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | n/a | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | n/a | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | n/a | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | n/a | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | n/a | Unverified