Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
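The transfer described above is often implemented by training the small model (the student) to match the large model's (the teacher's) softened output distribution. The sketch below is a minimal pure-Python illustration of that common soft-target formulation, not a method taken from this page: the function names, the example logits, and the temperature value of 2.0 are all assumptions chosen for illustration.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature (assumed here, not given in the source) flattens both
    distributions so the student also sees the teacher's relative scores
    for non-top classes. Scaling by temperature**2 is a common convention
    that keeps gradient magnitudes comparable across temperatures.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    return kl * temperature ** 2

# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
print(distillation_loss(student, teacher))
```

In practice this term is usually combined with an ordinary supervised loss on the ground-truth labels, and the loss is zero only when the student's softened distribution matches the teacher's exactly.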

Papers

Showing 1901–1950 of 4240 papers

Title	Date	Tasks	Status
IL-NeRF: Incremental Learning for Neural Radiance Fields with Camera Pose Alignment	Dec 10, 2023	Incremental Learning, Knowledge Distillation	Unverified
Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks	Apr 29, 2025	Knowledge Distillation, Transfer Learning	Unverified
Decoupled Transformer for Scalable Inference in Open-domain Question Answering	Sep 1, 2021	Knowledge Distillation, Machine Reading Comprehension	Unverified
Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer	Jan 21, 2022	Knowledge Distillation, Transfer Learning	Unverified
Headache to Overstock? Promoting Long-tail Items through Debiased Product Bundling	Nov 28, 2024	Knowledge Distillation, Navigate	Unverified
Decoupled Transformer for Scalable Inference in Open-domain Question Answering	Aug 5, 2021	Knowledge Distillation, Machine Reading Comprehension	Unverified
Biologically inspired structure learning with reverse knowledge distillation for spiking neural networks	Apr 19, 2023	Knowledge Distillation	Unverified
Impossible Triangle: What's Next for Pre-trained Language Models?	Apr 13, 2022	Data Augmentation, Few-Shot Learning	Unverified
AMD: Automatic Multi-step Distillation of Large-scale Vision Models	Jul 5, 2024	image-classification, Image Classification	Unverified
hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation	Jun 5, 2025	Code Generation, Code Translation	Unverified
Spectral Maps for Learning on Subgraphs	May 30, 2022	Graph Learning, Knowledge Distillation	Unverified
Harnessing Increased Client Participation with Cohort-Parallel Federated Learning	May 24, 2024	Federated Learning, image-classification	Unverified
Harmonizing knowledge Transfer in Neural Network with Unified Distillation	Sep 27, 2024	Knowledge Distillation, Transfer Learning	Unverified
Improved implicit diffusion model with knowledge distillation to estimate the spatial distribution density of carbon stock in remote sensing imagery	Nov 27, 2024	Knowledge Distillation	Unverified
HARD: Hard Augmentations for Robust Distillation	May 24, 2023	Data Augmentation, Domain Generalization	Unverified
Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model	Oct 22, 2022	Knowledge Distillation, Language Modeling	Unverified
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions	Jan 1, 2025	Knowledge Distillation, Motion Estimation	Unverified
Improved Knowledge Distillation via Adversarial Collaboration	Nov 29, 2021	Knowledge Distillation	Unverified
AMD: Adaptive Masked Distillation for Object Detection	Jan 31, 2023	Knowledge Distillation, Model Compression	Unverified
HanjaBridge: Resolving Semantic Ambiguity in Korean LLMs via Hanja-Augmented Pre-Training	Jul 15, 2025	Cross-Lingual Transfer, Knowledge Distillation	Unverified
Hands-on Guidance for Distilling Object Detectors	Mar 26, 2021	Knowledge Distillation, Object	Unverified
Decoupled Alignment for Robust Plug-and-Play Adaptation	Jun 3, 2024	Knowledge Distillation	Unverified
Handling Long-tailed Feature Distribution in AdderNets	Dec 1, 2021	Knowledge Distillation	Unverified
Improve Knowledge Distillation via Label Revision and Data Selection	Apr 3, 2024	Knowledge Distillation, Model Compression	Unverified
De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts	Mar 28, 2024	Causal Inference, Data-free Knowledge Distillation	Unverified
Improving Acoustic Scene Classification in Low-Resource Conditions	Dec 30, 2024	Acoustic Scene Classification, Classification	Unverified
GVP: Generative Volumetric Primitives	Mar 31, 2023	Image Generation, Knowledge Distillation	Unverified
Guiding Teacher Forcing with Seer Forcing for Neural Machine Translation	Jun 12, 2021	Decoder, Knowledge Distillation	Unverified
Improving Autoregressive NMT with Non-Autoregressive Model	Jul 1, 2020	Decoder, de-en	Unverified
Improving CLIP Robustness with Knowledge Distillation and Self-Training	Sep 19, 2023	Knowledge Distillation	Unverified
Bilateral Memory Consolidation for Continual Learning	Jan 1, 2023	Continual Learning, Knowledge Distillation	Unverified
Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation	Apr 17, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Guided Deep Metric Learning	Jun 4, 2022	Few-Shot Learning, Knowledge Distillation	Unverified
GTCOM Neural Machine Translation Systems for WMT19	Aug 1, 2019	Knowledge Distillation, Language Modeling	Unverified
Decision Boundary-aware Knowledge Consolidation Generates Better Instance-Incremental Learner	Jun 5, 2024	class-incremental learning, Class Incremental Learning	Unverified
Improving De-Raining Generalization via Neural Reorganization	Jan 1, 2021	Knowledge Distillation	Unverified
Growing Deep Neural Network Considering with Similarity between Neurons	Aug 23, 2024	Decision Making, Knowledge Distillation	Unverified
Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space	Apr 1, 2021	Federated Learning, Knowledge Distillation	Unverified
Debias the Black-box: A Fair Ranking Framework via Knowledge Distillation	Aug 24, 2022	Fairness, Information Retrieval	Unverified
Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation	Apr 9, 2024	Emotion Recognition, Facial Landmark Detection	Unverified
Improving Feature Generalizability with Multitask Learning in Class Incremental Learning	Apr 26, 2022	class-incremental learning, Class Incremental Learning	Unverified
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition	Jun 9, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Always Strengthen Your Strengths: A Drift-Aware Incremental Learning Framework for CTR Prediction	Apr 17, 2023	Click-Through Rate Prediction, Diversity	Unverified
Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning	Jan 18, 2023	Continual Learning, Knowledge Distillation	Unverified
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging	Dec 12, 2022	Knowledge Distillation, Question Answering	Unverified
Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning	Aug 26, 2023	Knowledge Distillation, Model Compression	Unverified
A Closer Look at Knowledge Distillation with Features, Logits, and Gradients	Mar 18, 2022	Incremental Learning, Knowledge Distillation	Unverified
Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation	Aug 1, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
AdvFunMatch: When Consistent Teaching Meets Adversarial Robustness	May 24, 2023	Adversarial Robustness, Knowledge Distillation	Unverified
Group-Mix SAM: Lightweight Solution for Industrial Assembly Line Applications	Mar 15, 2024	Knowledge Distillation	Unverified

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified