
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, so a smaller student model trained to imitate the large teacher can often recover most of its performance at a fraction of the inference cost.
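
As a concrete illustration, here is a minimal sketch of the classic soft-target distillation loss (Hinton et al., 2015): the teacher's temperature-scaled probabilities supervise the student alongside the usual hard-label cross-entropy. The function name, tensor names, and the temperature/alpha defaults are illustrative assumptions, not taken from any paper listed on this page.

```python
# Minimal sketch of soft-target knowledge distillation.
# All names (distillation_loss, student_logits, ...) and the default
# temperature/alpha values are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term that pulls the
    student's softened output distribution toward the teacher's."""
    # Temperature-scaled ("softened") distributions.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

if __name__ == "__main__":
    # Toy usage: batch of 8 examples, 100 classes, random logits/labels.
    s = torch.randn(8, 100)
    t = torch.randn(8, 100)
    y = torch.randint(0, 100, (8,))
    print(distillation_loss(s, t, y).item())
```

In practice the teacher runs in evaluation mode with gradients disabled, and only the student's parameters are updated by this loss.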

Papers

Showing 3501–3525 of 4240 papers

Title | Status | Hype
Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation | Code | 0
Learning from Noisy Crowd Labels with Logics | Code | 0
Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition | Code | 0
Pre-trained Summarization Distillation | Code | 0
Efficient and Robust Jet Tagging at the LHC with Knowledge Distillation | Code | 0
SeqMIA: Sequential-Metric Based Membership Inference Attack | Code | 0
SeqNAS: Neural Architecture Search for Event Sequence Classification | Code | 0
Learning Lightweight Lane Detection CNNs by Self Attention Distillation | Code | 0
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data | Code | 0
Contrastive Learning in Distilled Models | Code | 0
Language Model Knowledge Distillation for Efficient Question Answering in Spanish | Code | 0
Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias | Code | 0
KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer | Code | 0
Knowledge Transfer Graph for Deep Collaborative Learning | Code | 0
Text Representation Distillation via Information Bottleneck Principle | Code | 0
KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server | Code | 0
Online Lifelong Generalized Zero-Shot Learning | Code | 0
Adaptive Temperature Based on Logits Correlation in Knowledge Distillation | Code | 0
Improving Sequential Recommendations via Bidirectional Temporal Data Augmentation with Pre-training | Code | 0
Continual Representation Learning for Biometric Identification | Code | 0
Continual Panoptic Perception: Towards Multi-modal Incremental Interpretation of Remote Sensing Images | Code | 0
Privacy Evaluation Benchmarks for NLP Models | Code | 0
Knowledge Grafting of Large Language Models | Code | 0
Knowledge Extraction with No Observable Data | Code | 0
Learning to Maximize Mutual Information for Chain-of-Thought Distillation | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified