Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3651–3700 of 4240 papers

Title	Date	Tasks	Status
Joint Pre-training and Local Re-training: Transferable Representation Learning on Multi-source Knowledge Graphs	Jun 5, 2023	Entity AlignmentKnowledge Distillation	CodeCode Available
Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization	Aug 6, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available
DMSSN: Distilled Mixed Spectral-Spatial Network for Hyperspectral Salient Object Detection	Mar 31, 2024	Dimensionality ReductionKnowledge Distillation	CodeCode Available
TIE-KD: Teacher-Independent and Explainable Knowledge Distillation for Monocular Depth Estimation	Feb 22, 2024	Depth EstimationKnowledge Distillation	CodeCode Available
A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data	Nov 18, 2020	Decision MakingICU Admission	CodeCode Available
SMOTExT: SMOTE meets Large Language Models	May 19, 2025	Cross-Modal RetrievalData Augmentation	CodeCode Available
Random Path Selection for Continual Learning	Dec 1, 2019	Continual LearningIncremental Learning	CodeCode Available
An Adaptive Random Path Selection Approach for Incremental Learning	Jun 3, 2019	Incremental LearningKnowledge Distillation	CodeCode Available
A Knowledge Distillation-Based Approach to Enhance Transparency of Classifier Models	Feb 21, 2025	Decision MakingKnowledge Distillation	CodeCode Available
Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model	Sep 17, 2024	Knowledge DistillationOperator learning	CodeCode Available
Joint Answering and Explanation for Visual Commonsense Reasoning	Feb 25, 2022	Knowledge DistillationQuestion Answering	CodeCode Available
Being Strong Progressively! Enhancing Knowledge Distillation of Large Language Models through a Curriculum Learning Framework	Jun 6, 2025	Instruction FollowingKnowledge Distillation	CodeCode Available
Comparative Knowledge Distillation	Nov 3, 2023	Data AugmentationKnowledge Distillation	CodeCode Available
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation	Mar 27, 2024	Domain AdaptationKnowledge Distillation	CodeCode Available
Distribution Aligned Semantics Adaption for Lifelong Person Re-Identification	May 30, 2024	Knowledge DistillationPerson Re-Identification	CodeCode Available
Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network	Jun 12, 2024	Acoustic Scene ClassificationData Augmentation	CodeCode Available
BEBERT: Efficient and Robust Binary Ensemble BERT	Oct 28, 2022	BinarizationComputational Efficiency	CodeCode Available
Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System	Sep 19, 2018	Knowledge DistillationLearning-To-Rank	CodeCode Available
Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation	Feb 7, 2024	DiversityKnowledge Distillation	CodeCode Available
Invariant debiasing learning for recommendation via biased imputation	Dec 28, 2024	ImputationKnowledge Distillation	CodeCode Available
Adaptive Teaching with Shared Classifier for Knowledge Distillation	Jun 12, 2024	Knowledge Distillation	CodeCode Available
Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation	Nov 29, 2019	Knowledge Distillationreinforcement-learning	CodeCode Available
Compact Trilinear Interaction for Visual Question Answering	Sep 26, 2019	BenchmarkingKnowledge Distillation	CodeCode Available
CoMoTo: Unpaired Cross-Modal Lesion Distillation Improves Breast Lesion Detection in Tomosynthesis	Jul 24, 2024	Knowledge DistillationLesion Detection	CodeCode Available
Adaptive Search-and-Training for Robust and Efficient Network Pruning	Feb 1, 2023	Knowledge DistillationNetwork Pruning	CodeCode Available
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation	Jun 5, 2024	Few-Shot Image Classificationimage-classification	CodeCode Available
Intra-class Patch Swap for Self-Distillation	May 20, 2025	image-classificationImage Classification	CodeCode Available
Distill n' Explain: explaining graph neural networks using simple surrogates	Mar 17, 2023	Knowledge Distillation	CodeCode Available
Interpreting Microbiome Relative Abundance Data Using Symbolic Regression	Oct 18, 2024	DiagnosticKnowledge Distillation	CodeCode Available
Combining inherent knowledge of vision-language models with unsupervised domain adaptation through strong-weak guidance	Dec 7, 2023	Domain AdaptationKnowledge Distillation	CodeCode Available
RDPD: Rich Data Helps Poor Data via Imitation	Sep 6, 2018	Knowledge Distillation	CodeCode Available
Interpreting and Disentangling Feature Components of Various Complexity from DNNs	Jun 29, 2020	Knowledge Distillation	CodeCode Available
Interpretable Embedding Procedure Knowledge Transfer via Stacked Principal Component Analysis and Graph Neural Network	Apr 28, 2021	Graph Neural NetworkKnowledge Distillation	CodeCode Available
BAM! Born-Again Multi-Task Networks for Natural Language Understanding	Jul 10, 2019	Knowledge DistillationNatural Language Understanding	CodeCode Available
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers	Feb 26, 2024	Knowledge DistillationMixture-of-Experts	CodeCode Available
3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding	Feb 28, 2024	document understandingForm	CodeCode Available
Distilling Virtual Examples for Long-tailed Recognition	Mar 28, 2021	Knowledge DistillationLong-tail Learning	CodeCode Available
Inter-Domain Alignment for Predicting High-Resolution Brain Networks Using Teacher-Student Learning	Oct 6, 2021	DecoderDomain Adaptation	CodeCode Available
Instance Temperature Knowledge Distillation	Jun 27, 2024	Decision MakingEfficient Exploration	CodeCode Available
Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation	Jul 18, 2024	Knowledge DistillationSemantic Segmentation	CodeCode Available
Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism	Oct 19, 2020	DecoderKnowledge Distillation	CodeCode Available
Real-Time Cell Sorting with Scalable In Situ FPGA-Accelerated Deep Learning	Mar 16, 2025	Cell DetectionClassification	CodeCode Available
Real-Time Correlation Tracking via Joint Model Compression and Transfer	Jul 23, 2019	Computational EfficiencyCPU	CodeCode Available
Real-Time Decentralized knowledge Transfer at the Edge	Nov 11, 2020	Knowledge DistillationTransfer Learning	CodeCode Available
Distilling Universal and Joint Knowledge for Cross-Domain Model Compression on Time Series Data	Jul 7, 2023	Knowledge DistillationModel Compression	CodeCode Available
Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations	Sep 13, 2018	Depth EstimationGPU	CodeCode Available
Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias	Mar 1, 2023	Backdoor AttackKnowledge Distillation	CodeCode Available
AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning	Jan 1, 2025	Audio-visual Question AnsweringContinual Learning	CodeCode Available
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation	May 21, 2022	Federated LearningKnowledge Distillation	CodeCode Available
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation	Jun 10, 2021	Knowledge Distillation	CodeCode Available

Show:10 25 50

← PrevPage 74 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified