Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
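As a rough illustration of what "transferring knowledge" can mean in practice, the sketch below shows one common formulation: soft-target distillation, where the student is trained to match the teacher's temperature-softened output distribution in addition to the true label. The page above does not prescribe this particular loss, so the function names, the `temperature` and `alpha` parameters, and the example logits are all illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label cross-entropy term.

    `temperature` and `alpha` are hypothetical hyperparameters chosen for
    illustration, not values taken from any paper listed on this page.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-target term on a comparable scale
    # when the temperature changes.
    soft_term = (temperature ** 2) * kl_divergence(soft_teacher, soft_student)
    # Standard cross-entropy against the ground-truth class at temperature 1.
    hard_term = -math.log(softmax(student_logits)[label])
    return alpha * soft_term + (1 - alpha) * hard_term

# Made-up logits for a 3-class problem.
teacher = [8.0, 2.0, 1.0]
student = [5.0, 3.0, 0.5]
loss = distillation_loss(teacher, student, label=0)
```

With `alpha=1.0` only the soft-target term remains, so a student whose logits equal the teacher's incurs zero loss. Many of the methods listed below replace or extend this logit-matching term, for example with feature-level or relation-level objectives.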

Papers

Showing 4076–4100 of 4240 papers

Title	Date	Tasks	Status
Bridging the Gap between Decision and Logits in Decision-based Knowledge Distillation for Pre-trained Language Models	Jun 15, 2023	Data Augmentation, Knowledge Distillation	Code Available
Faster gaze prediction with dense networks and Fisher pruning	Jan 17, 2018	Gaze Estimation, Gaze Prediction	Code Available
AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation	Oct 1, 2024	Code Generation, HumanEval	Code Available
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation	Jun 11, 2024	Audio Classification, Knowledge Distillation	Code Available
On Membership Inference Attacks in Knowledge Distillation	May 17, 2025	Knowledge Distillation, Privacy Preserving	Code Available
TAKE: Topic-shift Aware Knowledge sElection for Dialogue Generation	Oct 1, 2022	Dialogue Generation, Knowledge Distillation	Code Available
Towards Multi-Morphology Controllers with Diversity and Knowledge Distillation	Apr 22, 2024	Diversity, Knowledge Distillation	Code Available
VECT-GAN: A variationally encoded generative model for overcoming data scarcity in pharmaceutical science	Jan 15, 2025	Generative Adversarial Network, Knowledge Distillation	Code Available
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model	Oct 26, 2023	Data Augmentation, General Knowledge	Code Available
DCA: Dividing and Conquering Amnesia in Incremental Object Detection	Mar 19, 2025	Knowledge Distillation, object-detection	Code Available
SecFormer: Fast and Accurate Privacy-Preserving Inference for Transformer Models via SMPC	Jan 1, 2024	Knowledge Distillation, Privacy Preserving	Code Available
Bridging Modalities: Knowledge Distillation and Masked Training for Translating Multi-Modal Emotion Recognition to Uni-Modal, Speech-Only Emotion Recognition	Jan 4, 2024	Emotion Recognition, Knowledge Distillation	Code Available
On the Byzantine-Resilience of Distillation-Based Federated Learning	Feb 19, 2024	Federated Learning, Knowledge Distillation	Code Available
Multi-Teacher Language-Aware Knowledge Distillation for Multilingual Speech Emotion Recognition	Jun 10, 2025	Emotion Recognition, Knowledge Distillation	Code Available
Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study	Nov 8, 2022	Attribute, Data Augmentation	Code Available
Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation	May 16, 2025	Knowledge Distillation	Code Available
FANFOLD: Graph Normalizing Flows-driven Asymmetric Network for Unsupervised Graph-Level Anomaly Detection	Jun 29, 2024	Anomaly Detection, Knowledge Distillation	Code Available
Data Upcycling Knowledge Distillation for Image Super-Resolution	Sep 25, 2023	Image Super-Resolution, Knowledge Distillation	Code Available
On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals	Jul 30, 2021	Clustering, Contrastive Learning	Code Available
AMLNet: Adversarial Mutual Learning Neural Network for Non-AutoRegressive Multi-Horizon Time Series Forecasting	Oct 30, 2023	Decoder, Diversity	Code Available
On the Generalization vs Fidelity Paradox in Knowledge Distillation	May 21, 2025	Knowledge Distillation, Transfer Learning	Code Available
Segmenting the Future	Apr 24, 2019	Autonomous Driving, Decision Making	Code Available
SeizureNet: Multi-Spectral Deep Feature Learning for Seizure Type Classification	Mar 8, 2019	Classification, EEG	Code Available
Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection	Nov 30, 2022	3D Object Detection, Depth Estimation	Code Available
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers	Nov 8, 2023	Knowledge Distillation, OpenAI Gym	Code Available

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified