
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized. A smaller student model can therefore frequently be trained to recover much of the larger teacher's behavior, typically by mimicking the teacher's softened output distribution, at a fraction of the inference cost.
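
The standard recipe, due to Hinton et al. (2015), trains the student on a blend of the ordinary hard-label loss and a KL divergence between temperature-softened teacher and student outputs. Below is a minimal PyTorch sketch of that loss, assuming classification logits; the temperature T and mixing weight alpha are illustrative defaults, not values taken from any paper on this page.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
        # KL divergence between temperature-softened distributions.
        # Scaling by T*T keeps the soft-target gradients comparable in
        # magnitude across temperatures (Hinton et al., 2015).
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Ordinary cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, targets)
        return alpha * soft + (1.0 - alpha) * hard

In a training loop the teacher is kept in eval mode and its forward pass is wrapped in torch.no_grad(), so gradients flow only into the student.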

Papers

Showing 301–350 of 4240 papers

Title | Status | Hype
Dynamic Activation with Knowledge Distillation for Energy-Efficient Spiking NN Ensembles |  | 0
Capturing Rich Behavior Representations: A Dynamic Action Semantic-Aware Graph Transformer for Video Captioning |  | 0
MambaLiteSR: Image Super-Resolution with Low-Rank Mamba using Knowledge Distillation |  | 0
JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework | Code | 2
Enhancing Semi-supervised Learning with Zero-shot Pseudolabels |  | 0
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models |  | 0
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions |  | 0
Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models |  | 0
Does Training with Synthetic Data Truly Protect Privacy? | Code | 0
Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation | Code | 0
Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation? | Code | 1
Leave No One Behind: Enhancing Diversity While Maintaining Accuracy in Social Recommendation | Code | 0
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Code | 1
DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection | Code | 1
Leveraging Conditional Mutual Information to Improve Large Language Model Fine-Tuning For Classification |  | 0
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation |  | 0
CLoCKDistill: Consistent Location-and-Context-aware Knowledge Distillation for DETRs |  | 0
LLM-driven Knowledge Distillation for Dynamic Text-Attributed Graphs |  | 0
AIDE: Agentically Improve Visual Language Model with Domain Experts |  | 0
LLM Pretraining with Continuous Concepts |  | 0
Vision-Language Models for Edge Networks: A Comprehensive Survey |  | 0
Optimizing Knowledge Distillation in Transformers: Enabling Multi-Head Attention without Alignment Barriers |  | 0
Life-Code: Central Dogma Modeling with Multi-Omics Sequence Unification |  | 0
OpenGrok: Enhancing SNS Data Processing with Distilled Knowledge and Mask-like Mechanisms | Code | 0
Right Time to Learn: Promoting Generalization via Bio-inspired Spacing Effect in Knowledge Distillation | Code | 0
Progressive Collaborative and Semantic Knowledge Fusion for Generative Recommendation |  | 0
DROP: Poison Dilution via Knowledge Distillation for Federated Learning | Code | 0
Rationalization Models for Text-to-SQL |  | 0
Contrastive Representation Distillation via Multi-Scale Feature Decoupling |  | 0
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Code | 1
Synergistic Effects of Knowledge Distillation and Structured Pruning for Self-Supervised Speech Models |  | 0
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data |  | 0
Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector |  | 0
Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark | Code | 2
Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation | Code | 0
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation |  | 0
Revisiting Intermediate-Layer Matching in Knowledge Distillation: Layer-Selection Strategy Doesn't Matter (Much) |  | 0
Towards Unified Music Emotion Recognition across Dimensional and Categorical Models | Code | 1
A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems |  | 0
Training an LLM-as-a-Judge Model: Pipeline, Insights, and Practical Lessons |  | 0
MIND: Modality-Informed Knowledge Distillation Framework for Multimodal Clinical Prediction Tasks |  | 0
A Framework for Double-Blind Federated Adaptation of Foundation Models |  | 0
VLM-Assisted Continual learning for Visual Question Answering in Self-Driving |  | 0
A method for estimating forest carbon storage distribution density via artificial intelligence generated content model |  | 0
FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation | Code | 0
Role of Mixup in Topological Persistence Based Knowledge Distillation for Wearable Sensor Data |  | 0
Robust Knowledge Distillation in Federated Learning: Counteracting Backdoor Attacks | Code | 0
Rethinking the Upsampling Layer in Hyperspectral Image Super Resolution |  | 0
Mini-ResEmoteNet: Leveraging Knowledge Distillation for Human-Centered Design |  | 0
RL-based Query Rewriting with Distilled LLM for online E-Commerce Systems |  | 0
Page 7 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 |  | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 |  | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 |  | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 |  | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 |  | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 |  | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 |  | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 |  | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 |  | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 |  | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 |  | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 |  | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 |  | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 |  | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 |  | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 |  | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 |  | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 |  | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 |  | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 |  | Unverified