Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
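
As a rough illustration of how that transfer is often set up, the sketch below computes a soft-target distillation loss: the student is penalized for diverging from the teacher's temperature-softened output distribution. The text above does not specify a loss, so this is one common formulation (KL divergence on softened logits, scaled by the squared temperature) chosen as an assumption; the function names, the temperature value, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumed formulation for illustration only; real training setups
    usually combine this term with a standard loss on the true labels.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.2]
student_far = [0.5, 2.0, 1.0]    # disagrees with the teacher
student_near = [3.5, 1.2, 0.3]   # close to the teacher

print(distillation_loss(student_far, teacher))
print(distillation_loss(student_near, teacher))
```

The loss is zero when student and teacher logits match and grows as the student's softened distribution drifts away from the teacher's, which is what lets a small model learn from the large model's full output distribution rather than from hard labels alone.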

Papers

Showing 4001–4050 of 4240 papers

Title	Date	Tasks	Status
Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning	Jan 15, 2020	3D Human Pose Estimation, 3D Pose Estimation	Unverified
Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval	Feb 27, 2025	Cross-Modal Retrieval, Knowledge Distillation	Unverified
Lightweight Modality Adaptation to Sequential Recommendation via Correlation Supervision	Jan 14, 2024	Knowledge Distillation, Representation Learning	Unverified
Lightweight Neural Network with Knowledge Distillation for CSI Feedback	Oct 31, 2022	Knowledge Distillation	Unverified
Lightweight Sound Event Detection Model with RepVGG Architecture	Nov 1, 2022	Event Detection, Knowledge Distillation	Unverified
Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models	Jun 16, 2025	Knowledge Distillation, Semantic Communication	Unverified
Limitations of Knowledge Distillation for Zero-shot Transfer Learning	Nov 1, 2021	CPU, Cross-Lingual Transfer	Unverified
Linear Projections of Teacher Embeddings for Few-Class Distillation	Sep 30, 2024	Binary Classification, Knowledge Distillation	Unverified
Linkless Link Prediction via Relational Distillation	Oct 11, 2022	Knowledge Distillation, Link Prediction	Unverified
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models	Jun 5, 2022	Knowledge Distillation, Lipreading	Unverified
Lipschitz Continuity Guided Knowledge Distillation	Aug 29, 2021	Knowledge Distillation, Model Compression	Unverified
ListBERT: Learning to Rank E-commerce products with Listwise BERT	Jun 30, 2022	Knowledge Distillation, Learning-To-Rank	Unverified
LIT: Block-wise Intermediate Representation Training for Model Compression	Oct 2, 2018	Knowledge Distillation, Model Compression	Unverified
LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation	Jan 22, 2025	Image Generation, Knowledge Distillation	Unverified
LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving	Mar 13, 2024	Autonomous Driving, Knowledge Distillation	Unverified
Llama-Nemotron: Efficient Reasoning Models	May 2, 2025	Knowledge Distillation, Neural Architecture Search	Unverified
LLAVADI: What Matters For Multimodal Large Language Models Distillation	Jul 28, 2024	Knowledge Distillation	Unverified
LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound	Oct 19, 2024	Instruction Following, Knowledge Distillation	Unverified
LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification	Feb 26, 2024	Data Augmentation, Knowledge Distillation	Unverified
LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering	Dec 13, 2024	Few-Shot Learning, Knowledge Distillation	Unverified
LLM-driven Knowledge Distillation for Dynamic Text-Attributed Graphs	Feb 15, 2025	Edge Classification, Knowledge Distillation	Unverified
LLM Pretraining with Continuous Concepts	Feb 12, 2025	Knowledge Distillation, Language Modeling	Unverified
LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation	Apr 1, 2024	Knowledge Distillation	Unverified
LLMR: Knowledge Distillation with a Large Language Model-Induced Reward	Sep 19, 2024	Dialogue Generation, Knowledge Distillation	Unverified
Local Correlation Consistency for Knowledge Distillation	Aug 1, 2020	Knowledge Distillation	Unverified
LoCa: Logit Calibration for Knowledge Distillation	Sep 7, 2024	image-classification, Image Classification	Unverified
Locally Linear Region Knowledge Distillation	Oct 9, 2020	Knowledge Distillation	Unverified
Local-Selective Feature Distillation for Single Image Super-Resolution	Nov 22, 2021	Image Super-Resolution, Knowledge Distillation	Unverified
Local-to-Global Self-Supervised Representation Learning for Diabetic Retinopathy Grading	Oct 1, 2024	Diabetic Retinopathy Grading, image-classification	Unverified
Local vs. Global: Local Land-Use and Land-Cover Models Deliver Higher Quality Maps	Dec 1, 2024	Earth Observation, Knowledge Distillation	Unverified
Logic Distillation: Learning from Code Function by Function for Planning and Decision-making	Jul 28, 2024	Decision Making, Knowledge Distillation	Unverified
Logits Poisoning Attack in Federated Distillation	Jan 8, 2024	Federated Learning, Knowledge Distillation	Unverified
LokiLM: Technical Report	Jul 10, 2024	Knowledge Distillation, Language Modeling	Unverified
Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning	Jan 1, 2021	class-incremental learning, Class Incremental Learning	Unverified
Long-Range Zero-Shot Generative Deep Network Quantization	Nov 13, 2022	Knowledge Distillation, Quantization	Unverified
Long-Tailed Continual Learning For Visual Food Recognition	Jul 1, 2023	Continual Learning, Data Augmentation	Unverified
Long-tailed Food Classification	Oct 26, 2022	Classification, Data Augmentation	Unverified
Hierarchical Knowledge Guided Learning for Real-world Retinal Diseases Recognition	Nov 17, 2021	Knowledge Distillation	Unverified
Long-Tailed Question Answering in an Open World	May 11, 2023	Knowledge Distillation, Language Modelling	Unverified
Long-Term Vehicle Localization by Recursive Knowledge Distillation	Apr 7, 2019	Domain Adaptation, Ensemble Learning	Unverified
LookALike: Human Mimicry based collaborative decision making	Mar 16, 2024	Decision Making, Knowledge Distillation	Unverified
Look Backward and Forward: Self-Knowledge Distillation with Bidirectional Decoder for Neural Machine Translation	Mar 10, 2022	Decoder, Knowledge Distillation	Unverified
Look One and More: Distilling Hybrid Order Relational Knowledge for Cross-Resolution Image Recognition	Sep 9, 2024	Face Recognition, image-classification	Unverified
Lost in Distillation: A Case Study in Toxicity Modeling	Jul 1, 2022	Knowledge Distillation	Unverified
Low-Complexity Inference in Continual Learning via Compressed Knowledge Transfer	May 13, 2025	class-incremental learning, Class Incremental Learning	Unverified
Low-Dimensional Federated Knowledge Graph Embedding via Knowledge Distillation	Aug 11, 2024	Graph Embedding, Knowledge Distillation	Unverified
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network	Sep 22, 2021	Knowledge Distillation, Language Modeling	Unverified
Low-Resolution Chest X-ray Classification via Knowledge Distillation and Multi-task Learning	May 22, 2024	Diagnostic, Knowledge Distillation	Unverified
Low-resolution Face Recognition in the Wild via Selective Knowledge Distillation	Nov 25, 2018	CPU, Face Model	Unverified
Low-Resolution Face Recognition via Adaptable Instance-Relation Distillation	Sep 3, 2024	Face Recognition, Knowledge Distillation	Unverified

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified