Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
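As a rough illustration of what "transferring knowledge" can mean in practice, the sketch below uses one common formulation that this page does not itself specify: the student is trained to match the teacher's temperature-softened output distribution by minimizing a KL divergence. The function names, the temperature value, and the example logits are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.

    Assumption: this is the classic soft-target loss, used here only as an
    example of a distillation objective; many other objectives exist.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the large model
    q = softmax(student_logits, temperature)  # predictions of the small model
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a 3-class problem.
teacher_logits = [4.0, 1.0, 0.5]
good_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
poor_student = [0.5, 1.0, 4.0]   # disagrees with the teacher

print(distillation_loss(good_student, teacher_logits))
print(distillation_loss(poor_student, teacher_logits))
```

A student whose outputs track the teacher's gets a loss near zero, while one that disagrees is penalized more heavily. In practice this term is usually combined with the ordinary supervised loss on ground-truth labels.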

Papers


Showing 4151–4200 of 4240 papers

Title	Date	Tasks	Status
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events	Sep 25, 2024	Audio Tagging, Automatic Speech Recognition	Unverified
MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution	Apr 15, 2024	Image Super-Resolution, Knowledge Distillation	Unverified
MulDE: Multi-teacher Knowledge Distillation for Low-dimensional Knowledge Graph Embeddings	Oct 14, 2020	Graph Embedding, Knowledge Distillation	Unverified
Multi-adversarial Faster-RCNN with Paradigm Teacher for Unrestricted Object Detection	Dec 11, 2022	Domain Adaptation, Knowledge Distillation	Unverified
Multi-Branch Mutual-Distillation Transformer for EEG-Based Seizure Subtype Classification	Dec 4, 2024	EEG, Electroencephalogram (EEG)	Unverified
Multi-Channel Multi-Domain based Knowledge Distillation Algorithm for Sleep Staging with Single-Channel EEG	Jan 7, 2024	EEG, Knowledge Distillation	Unverified
Cultural Commonsense Knowledge for Intercultural Dialogues	Feb 16, 2024	Knowledge Distillation, Specificity	Unverified
Multi-Document Financial Question Answering using LLMs	Nov 8, 2024	Knowledge Distillation, Knowledge Graphs	Unverified
Multi-Frame Self-Supervised Depth Estimation with Multi-Scale Feature Fusion in Dynamic Scenes	Mar 26, 2023	Depth Estimation, Knowledge Distillation	Unverified
Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object Detection	Sep 24, 2020	3D Object Detection, Autonomous Driving	Unverified
Multi-Grained Knowledge Distillation for Named Entity Recognition	Jun 1, 2021	Knowledge Distillation, named-entity-recognition	Unverified
Multi-Granularity Contrastive Knowledge Distillation for Multimodal Named Entity Recognition	Nov 16, 2021	Knowledge Distillation, Multi-modal Named Entity Recognition	Unverified
Multi-Granularity Semantic Revision for Large Language Model Distillation	Jul 14, 2024	Knowledge Distillation, Language Modeling	Unverified
Multi-head Knowledge Distillation for Model Compression	Dec 5, 2020	image-classification, Image Classification	Unverified
Multi-label Class Incremental Emotion Decoding with Augmented Emotional Semantics Learning	May 31, 2024	class-incremental learning, Class Incremental Learning	Unverified
Multi-label Contrastive Predictive Coding	Jul 20, 2020	Knowledge Distillation, Multi-class Classification	Unverified
Multi-label Emotion Analysis in Conversation via Multimodal Knowledge Distillation	Oct 27, 2023	Emotion Recognition, Knowledge Distillation	Unverified
Multi-level Distillation of Semantic Knowledge for Pre-training Multilingual Language Model	Nov 2, 2022	Knowledge Distillation, Language Modeling	Unverified
Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?	Oct 15, 2021	Knowledge Distillation, Machine Translation	Unverified
Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?	Nov 1, 2021	Knowledge Distillation, Machine Translation	Unverified
Multi-MLLM Knowledge Distillation for Out-of-Context News Detection	May 28, 2025	Knowledge Distillation, Misinformation	Unverified
Multimodal Commonsense Knowledge Distillation for Visual Question Answering	Nov 5, 2024	Knowledge Distillation, Question Answering	Unverified
Multi-modal Cross-domain Self-supervised Pre-training for fMRI and EEG Fusion	Sep 27, 2024	Data Augmentation, EEG	Unverified
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting	Apr 16, 2022	Few-Shot Learning, Few-Shot Object Detection	Unverified
Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix	Dec 21, 2021	Knowledge Distillation	Unverified
Multimodal Locally Enhanced Transformer for Continuous Sign Language Recognition	Aug 22, 2023	Knowledge Distillation, Position	Unverified
Multimodal Prescriptive Deep Learning	Jan 24, 2025	Deep Learning, Knowledge Distillation	Unverified
Multi-Objective Diverse Human Motion Prediction With Knowledge Distillation	Jan 1, 2022	Autonomous Driving, Diversity	Unverified
Multi-Person Full Body Pose Estimation	Aug 23, 2020	Knowledge Distillation, Multi-Person Pose Estimation	Unverified
Multi-perspective Contrastive Logit Distillation	Nov 16, 2024	Contrastive Learning, image-classification	Unverified
Multiple Degradation and Reconstruction Network for Single Image Denoising via Knowledge Distillation	Apr 29, 2022	Denoising, Image Denoising	Unverified
Multi scale Feature Extraction and Fusion for Online Knowledge Distillation	Jun 16, 2022	Knowledge Distillation, Transfer Learning	Unverified
Learning to Purification for Unsupervised Person Re-identification	Apr 21, 2022	Knowledge Distillation, Person Re-Identification	Unverified
Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching	Nov 16, 2021	Contrastive Learning, Knowledge Distillation	Unverified
Multi-stage Progressive Compression of Conformer Transducer for On-device Speech Recognition	Oct 1, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Multi-Strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension	Aug 1, 2021	Knowledge Distillation, Machine Reading Comprehension	Unverified
Multitask Emotion Recognition Model with Knowledge Distillation and Task Discriminator	Mar 24, 2022	Emotion Recognition, Knowledge Distillation	Unverified
Multi-Task Learning with Knowledge Distillation for Dense Prediction	Jan 1, 2023	Boundary Detection, Depth Estimation	Unverified
Multi-Teacher Knowledge Distillation for Incremental Implicitly-Refined Classification	Feb 23, 2022	Classification, Incremental Learning	Unverified
Multivariate Prototype Representation for Domain-Generalized Incremental Learning	Sep 24, 2023	class-incremental learning, Class Incremental Learning	Unverified
Multi-View Attention Transfer for Efficient Speech Enhancement	Aug 22, 2022	Knowledge Distillation, Speech Enhancement	Unverified
Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation	Feb 22, 2021	Dialogue Generation, General Knowledge	Unverified
Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization	Dec 19, 2022	Domain Generalization, Knowledge Distillation	Unverified
Multi-view knowledge distillation transformer for human action recognition	Mar 25, 2023	Action Recognition, Knowledge Distillation	Unverified
MUSE: Feature Self-Distillation with Mutual Information and Self-Information	Oct 25, 2021	image-classification, Image Classification	Unverified
MUST: A Multilingual Student-Teacher Learning approach for low-resource speech recognition	Oct 29, 2023	Knowledge Distillation, speech-recognition	Unverified
Mutual Adversarial Training: Learning together is better than going alone	Dec 9, 2021	Knowledge Distillation	Unverified
Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders	Jun 5, 2024	Knowledge Distillation, Self-Supervised Learning	Unverified
Mutual Learning for Finetuning Click-Through Rate Prediction Models	Jun 17, 2024	Click-Through Rate Prediction, Knowledge Distillation	Unverified
Mutual-Learning Improves End-to-End Speech Translation	Nov 1, 2021	Knowledge Distillation, Machine Translation	Unverified


Datasets (in the order of the benchmark tables below): ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified