Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3501–3525 of 4240 papers

Title	Date	Tasks	Status
Follow Your Path: a Progressive Method for Knowledge Distillation	Jul 20, 2021	Knowledge Distillation	—Unverified
For the Misgendered Chinese in Gender Bias Research: Multi-Task Learning with Knowledge Distillation for Pinyin Name-Gender Prediction	May 10, 2024	Gender PredictionKnowledge Distillation	—Unverified
Forward-Backward Knowledge Distillation for Continual Clustering	May 29, 2024	ClusteringContinual Learning	—Unverified
Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption	Aug 23, 2024	Instruction FollowingKnowledge Distillation	—Unverified
FPGA Resource-aware Structured Pruning for Real-Time Neural Networks	Aug 9, 2023	Classificationimage-classification	—Unverified
FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks	Jun 14, 2022	Knowledge Distillationreinforcement-learning	—Unverified
FreeTransfer-X: Safe and Label-Free Cross-Lingual Transfer from Off-the-Shelf Models	Jun 14, 2022	Cross-Lingual TransferDiagnostic	—Unverified
FReTAL: Generalizing Deepfake Detection using Knowledge Distillation and Representation Learning	May 28, 2021	DeepFake DetectionDomain Adaptation	—Unverified
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks	May 9, 2024	Knowledge DistillationModel Compression	—Unverified
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation	May 26, 2025	Graph GenerationKnowledge Distillation	—Unverified
From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation	Jul 12, 2024	Graph GenerationKnowledge Distillation	—Unverified
From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels	Mar 23, 2023	Knowledge DistillationSelf-Knowledge Distillation	—Unverified
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs	Apr 18, 2025	Knowledge DistillationModel Compression	—Unverified
From LLM to NMT: Advancing Low-Resource Machine Translation with Claude	Apr 22, 2024	Knowledge DistillationLanguage Modeling	—Unverified
From Multimodal to Unimodal Attention in Transformers using Knowledge Distillation	Oct 15, 2021	Knowledge DistillationMultimodal Deep Learning	—Unverified
From Two-Stream to One-Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation	Mar 25, 2024	Knowledge DistillationObject Tracking	—Unverified
From Wide to Deep: Dimension Lifting Network for Parameter-efficient Knowledge Graph Embedding	Mar 22, 2023	Graph EmbeddingKnowledge Distillation	—Unverified
FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation	Jun 19, 2023	Action RecognitionFederated Learning	—Unverified
Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners	Jul 4, 2024	Domain GeneralizationFew-Shot Learning	—Unverified
Fusing Bidirectional Chains of Thought and Reward Mechanisms A Method for Enhancing Question-Answering Capabilities of Large Language Models for Chinese Intangible Cultural Heritage	May 13, 2025	Knowledge DistillationLarge Language Model	—Unverified
Future-Guided Incremental Transformer for Simultaneous Translation	Dec 23, 2020	Knowledge DistillationTranslation	—Unverified
Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK	Feb 16, 2023	BenchmarkingKnowledge Distillation	—Unverified
GAI-Enabled Explainable Personalized Federated Semi-Supervised Learning	Oct 11, 2024	Federated LearningKnowledge Distillation	—Unverified
Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models	Oct 7, 2020	AllKnowledge Distillation	—Unverified
GAML-BERT: Improving BERT Early Exiting by Gradient Aligned Mutual Learning	Nov 1, 2021	Knowledge Distillation	—Unverified

Show:10 25 50

← PrevPage 141 of 170Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified