
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. Large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, but that capacity may not be fully utilized; a smaller model trained to mimic the large one can therefore often reach comparable accuracy at a much lower inference cost.
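
As a concrete illustration of the technique these papers build on, the sketch below shows the classic soft-target (logit) distillation loss: a temperature-softened KL divergence between teacher and student predictions combined with the usual cross-entropy on ground-truth labels. It is a minimal sketch assuming PyTorch; the names teacher, student, T, and alpha are illustrative placeholders, not taken from any specific paper listed here.

```python
# Minimal knowledge-distillation sketch (assumes PyTorch).
# Names (teacher, student, T, alpha) are illustrative, not from a specific paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target KD loss: softened KL divergence plus hard-label cross-entropy."""
    # Soft targets: soften both distributions with temperature T and match them.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps soft-target gradients on the same scale as the hard loss
    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage sketch: a frozen teacher supplies soft targets for each batch.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
# loss.backward()
```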

Papers

Showing 951–1000 of 4240 papers

Title | Status | Hype
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
Noisy Self-Knowledge Distillation for Text Summarization | Code | 1
Transferring Knowledge Distillation for Multilingual Social Event Detection | Code | 1
DE-RRD: A Knowledge Distillation Framework for Recommender System | Code | 1
Boosting Light-Weight Depth Estimation Via Knowledge Distillation | Code | 1
Transport-Hub-Aware Spatial-Temporal Adaptive Graph Transformer for Traffic Flow Prediction | Code | 1
TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation | Code | 1
Brittle Features May Help Anomaly Detection | | 0
Bring the Power of Diffusion Model to Defect Detection | | 0
An Enhanced Low-Resolution Image Recognition Method for Traffic Environments | | 0
Bridging the Modality Gap: Enhancing Channel Prediction with Semantically Aligned LLMs and Knowledge Distillation | | 0
Bridging the Gap: Unpacking the Hidden Challenges in Knowledge Distillation for Online Ranking Systems | | 0
An Empirical Study of Uniform-Architecture Knowledge Distillation in Document Ranking | | 0
Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation | | 0
Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation | | 0
A Deep Hierarchical Feature Sparse Framework for Occluded Person Re-Identification | | 0
Bridging the gap between Human Action Recognition and Online Action Detection | | 0
An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models | | 0
Supervised domain adaptation for building extraction from off-nadir aerial images | | 0
Dual Discriminator Adversarial Distillation for Data-free Model Compression | | 0
Dual Embodied-Symbolic Concept Representations for Deep Learning | | 0
An Empirical Study of Efficient ASR Rescoring with Transformers | | 0
Ground Reaction Force Estimation via Time-aware Knowledge Distillation | | 0
Bridging Fairness and Environmental Sustainability in Natural Language Processing | | 0
An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation | | 0
Addressing Bias Through Ensemble Learning and Regularized Fine-Tuning | | 0
DS-ViT: Dual-Stream Vision Transformer for Cross-Task Distillation in Alzheimer's Early Diagnosis | | 0
Direct Preference Knowledge Distillation for Large Language Models | | 0
Bridging Classical and Quantum Machine Learning: Knowledge Transfer From Classical to Quantum Neural Networks Using Knowledge Distillation | | 0
An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation | | 0
Bridge the Gap between Past and Future: Siamese Model Optimization for Context-Aware Document Ranking | | 0
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling | | 0
An Efficient Private GPT Never Autoregressively Decodes | | 0
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models | | 0
DTCM: Deep Transformer Capsule Mutual Distillation for Multivariate Time Series Classification | | 0
DILEMMA: Joint LLM Quantization and Distributed LLM Inference Over Edge Computing Systems | | 0
Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation | | 0
DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation | | 0
Digital Twin-Assisted Knowledge Distillation Framework for Heterogeneous Federated Learning | | 0
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs | | 0
An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation | | 0
Add a SideNet to your MainNet | | 0
Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs | | 0
Direct Distillation between Different Domains | | 0
Digging Deeper into CRNN Model in Chinese Text Images Recognition | | 0
DS3-Net: Difficulty-perceived Common-to-T1ce Semi-Supervised Multimodal MRI Synthesis Network | | 0
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization | | 0
DiReDi: Distillation and Reverse Distillation for AIoT Applications | | 0
DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser | | 0
Towards Complementary Knowledge Distillation for Efficient Dense Image Prediction | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified