Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2201–2250 of 4240 papers

Title	Date	Tasks	Status
Students taught by multimodal teachers are superior action recognizers	Oct 9, 2022	Action RecognitionKnowledge Distillation	—Unverified
Students Who Study Together Learn Better: On the Importance of Collective Knowledge Distillation for Domain Transfer in Fact Verification	Nov 1, 2021	Fact VerificationKnowledge Distillation	—Unverified
Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation	Aug 7, 2022	Data AugmentationDecoder	—Unverified
Style over Substance: Distilled Language Models Reason Via Stylistic Replication	Apr 2, 2025	Knowledge Distillation	—Unverified
Sub-Band Knowledge Distillation Framework for Speech Enhancement	May 29, 2020	Knowledge DistillationSpeech Enhancement	—Unverified
Subclass Knowledge Distillation with Known Subclass Labels	Jul 17, 2022	Binary ClassificationKnowledge Distillation	—Unverified
Sub-Graph Learning for Spatiotemporal Forecasting via Knowledge Distillation	Nov 17, 2022	DiversityGraph Learning	—Unverified
SUGAR: Pre-training 3D Visual Representations for Robotics	Apr 1, 2024	3D Instance Segmentation3D Object Recognition	—Unverified
Supervised Graph Contrastive Pretraining for Text Classification	Dec 21, 2021	ClassificationContrastive Learning	—Unverified
Supervision Complexity and its Role in Knowledge Distillation	Jan 28, 2023	image-classificationImage Classification	—Unverified
Supporting Cross-language Cross-project Bug Localization Using Pre-trained Language Models	Jul 3, 2024	Contrastive LearningCPU	—Unverified
Knowledge Distillation in Federated Edge Learning: A Survey	Jan 14, 2023	Knowledge DistillationSurvey	—Unverified
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application	Jul 2, 2024	Knowledge DistillationSurvey	—Unverified
Swing Distillation: A Privacy-Preserving Knowledge Distillation Framework	Dec 16, 2022	Knowledge DistillationModel Compression	—Unverified
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models	Oct 25, 2024	Instruction FollowingKnowledge Distillation	—Unverified
Synergic Adversarial Label Learning for Grading Retinal Diseases via Knowledge Distillation and Multi-task Learning	Mar 24, 2020	ClassificationGeneral Classification	—Unverified
Synergistic Effects of Knowledge Distillation and Structured Pruning for Self-Supervised Speech Models	Feb 9, 2025	Knowledge DistillationModel Compression	—Unverified
Syntactic Structure Distillation Pretraining For Bidirectional Encoders	May 27, 2020	Knowledge DistillationLanguage Modeling	—Unverified
Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks	Jul 22, 2024	Knowledge Distillation	—Unverified
Synthetic Unknown Class Learning for Learning Unknowns	Nov 15, 2021	DiversityKnowledge Distillation	—Unverified
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models	Jan 28, 2025	Knowledge DistillationModel Compression	—Unverified
Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation	Sep 29, 2024	Federated LearningKnowledge Distillation	—Unverified
Take a Prior from Other Tasks for Severe Blur Removal	Feb 14, 2023	DeblurringImage Deblurring	—Unverified
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models	Jun 3, 2025	DecoderKnowledge Distillation	—Unverified
Talking Models: Distill Pre-trained Knowledge to Downstream Models via Interactive Communication	Oct 4, 2023	DecoderKnowledge Distillation	—Unverified
Target-driven Self-Distillation for Partial Observed Trajectories Forecasting	Jan 28, 2025	Autonomous DrivingKnowledge Distillation	—Unverified
Targeted Forgetting of Image Subgroups in CLIP Models	Jan 1, 2025	Knowledge DistillationUnsupervised Pre-training	—Unverified
TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant	Oct 16, 2024	Knowledge DistillationTransfer Learning	—Unverified
Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation	Mar 25, 2023	Continual LearningKnowledge Distillation	—Unverified
Task-Balanced Distillation for Object Detection	Aug 5, 2022	ClassificationKnowledge Distillation	—Unverified
TASKED: Transformer-based Adversarial learning for human activity recognition using wearable sensors via Self-KnowledgE Distillation	Sep 14, 2022	Activity RecognitionHuman Activity Recognition	—Unverified
Task Integration Distillation for Object Detectors	Apr 2, 2024	Knowledge DistillationObject	—Unverified
Task-Specific Knowledge Distillation from the Vision Foundation Model for Enhanced Medical Image Segmentation	Mar 10, 2025	Image SegmentationKnowledge Distillation	—Unverified
Teacher's pet: understanding and mitigating biases in distillation	Jun 19, 2021	image-classificationImage Classification	—Unverified
Teacher-Student Architecture for Knowledge Learning: A Survey	Oct 28, 2022	Knowledge DistillationMulti-Task Learning	—Unverified
Teacher-Student Architecture for Knowledge Distillation: A Survey	Aug 8, 2023	Knowledge Distillationregression	—Unverified
Teacher-Student chain for efficient semi-supervised histology image classification	Mar 17, 2020	ClassificationGeneral Classification	—Unverified
Teacher-Student Knowledge Distillation for Radar Perception on Embedded Accelerators	Mar 14, 2023	Knowledge Distillationobject-detection	—Unverified
Distilled Siamese Networks for Visual Tracking	Jul 24, 2019	Knowledge DistillationObject Tracking	—Unverified
Teacher-Student Training and Triplet Loss for Facial Expression Recognition under Occlusion	Aug 3, 2020	Facial Expression RecognitionFacial Expression Recognition (FER)	—Unverified
Teacher-Student Training and Triplet Loss to Reduce the Effect of Drastic Face Occlusion	Nov 20, 2021	Age EstimationFacial Expression Recognition	—Unverified
Teacher-Student Training for Robust Tacotron-based TTS	Nov 7, 2019	DecoderKnowledge Distillation	—Unverified
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios	Jun 8, 2024	Knowledge Distillation	—Unverified
"Teaching Independent Parts Separately" (TIPSy-GAN) : Improving Accuracy and Stability in Unsupervised Adversarial 2D to 3D Pose Estimation	May 12, 2022	3D Human Pose Estimation3D Pose Estimation	—Unverified
Teaching MLP More Graph Information: A Three-stage Multitask Knowledge Distillation Framework	Mar 2, 2024	Knowledge Distillation	—Unverified
Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer	Apr 9, 2025	Knowledge Distillationparameter-efficient fine-tuning	—Unverified
Teaching Small Language Models to Reason	Dec 16, 2022	GSM8KKnowledge Distillation	—Unverified
Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection	Jun 11, 2024	Knowledge Distillationobject-detection	—Unverified
Teach me with a Whisper: Enhancing Large Language Models for Analyzing Spoken Transcripts using Speech Embeddings	Nov 13, 2023	Knowledge DistillationLanguage Modeling	—Unverified
Teach model to answer questions after comprehending the document	Jul 18, 2023	Knowledge DistillationMachine Reading Comprehension	—Unverified

Show:10 25 50

← PrevPage 45 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified