
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity may not be fully utilized; a well-trained student can often recover most of the teacher's accuracy at a fraction of the inference cost.
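
For reference, the recipe underlying most of the papers below is Hinton et al.'s (2015) soft-label distillation: the student is trained on a blend of the usual cross-entropy loss and a KL term that matches the teacher's temperature-softened output distribution. The following is a minimal PyTorch sketch; the temperature T = 4.0, the weight alpha = 0.5, and the random toy logits are illustrative assumptions, not values taken from any paper listed here.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between the temperature-softened student
    # and teacher distributions. Scaling by T**2 keeps gradient magnitudes
    # comparable across temperatures (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.log_softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
        log_target=True,
    ) * (T * T)
    return alpha * hard + (1.0 - alpha) * soft

# Toy usage with random logits standing in for real forward passes;
# the shapes (batch=8, classes=100) are arbitrary.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)  # in practice the teacher runs under no_grad
labels = torch.randint(0, 100, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()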

Papers

Showing 3851–3900 of 4240 papers

Title | Status | Hype
Distilled Neural Networks for Efficient Learning to Rank | Code | 0
MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation | Code | 0
MuGSI: Distilling GNNs with Multi-Granularity Structural Information for Graph Classification | Code | 0
Distilled Gradual Pruning with Pruned Fine-tuning | Code | 0
ST-MFNet Mini: Knowledge Distillation-Driven Frame Interpolation | Code | 0
Multi-aspect Knowledge Distillation with Large Language Model | Code | 0
Distilled GPT for Source Code Summarization | Code | 0
Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling | Code | 0
Distill-DBDGAN: Knowledge Distillation and Adversarial Learning Framework for Defocus Blur Detection | Code | 0
Stolen Subwords: Importance of Vocabularies for Machine Translation Model Stealing | Code | 0
Multi-fidelity Neural Architecture Search with Knowledge Distillation | Code | 0
StrassenNets: Deep Learning with a Multiplication Budget | Code | 0
DistillCSE: Distilled Contrastive Learning for Sentence Embeddings | Code | 0
Towards a Unified Conversational Recommendation System: Multi-task Learning via Contextualized Knowledge Distillation | Code | 0
Distillation Techniques for Pseudo-rehearsal Based Incremental Learning | Code | 0
GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision | Code | 0
Multi-granularity for knowledge distillation | Code | 0
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes | Code | 0
Multi-Granularity Structural Knowledge Distillation for Language Model Compression | Code | 0
WARLearn: Weather-Adaptive Representation Learning | Code | 0
Distillation Learning Guided by Image Reconstruction for One-Shot Medical Image Segmentation | Code | 0
UFIN: Universal Feature Interaction Network for Multi-Domain Click-Through Rate Prediction | Code | 0
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability | Code | 0
Spending Your Winning Lottery Better After Drawing It | Code | 0
Goldfish: An Efficient Federated Unlearning Framework | Code | 0
Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0
Curriculum-scheduled Knowledge Distillation from Multiple Pre-trained Teachers for Multi-domain Sequential Recommendation | Code | 0
GNN's Uncertainty Quantification using Self-Distillation | Code | 0
GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation | Code | 0
GLANCE: Global to Local Architecture-Neutral Concept-based Explanations | Code | 0
GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment | Code | 0
GKD: Semi-supervised Graph Knowledge Distillation for Graph-Independent Inference | Code | 0
Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor | Code | 0
Revisiting Cross-Modal Knowledge Distillation: A Disentanglement Approach for RGBD Semantic Segmentation | Code | 0
Multilingual Neural Machine Translation with Knowledge Distillation | Code | 0
Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation | Code | 0
Automated Knowledge Distillation via Monte Carlo Tree Search | Code | 0
Generative Denoise Distillation: Simple Stochastic Noises Induce Efficient Knowledge Transfer for Dense Prediction | Code | 0
Distillation Improves Visual Place Recognition for Low Quality Images | Code | 0
Revisiting Distillation and Incremental Classifier Learning | Code | 0
Generate, Annotate, and Learn: NLP with Synthetic Text | Code | 0
Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation | Code | 0
Multimodal Fusion SLAM with Fourier Attention | Code | 0
Multimodal Industrial Anomaly Detection by Crossmodal Reverse Distillation | Code | 0
Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective | Code | 0
Generalizing Teacher Networks for Effective Knowledge Distillation Across Student Architectures | Code | 0
Generalized Knowledge Distillation via Relationship Matching | Code | 0
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation | Code | 0
Why Skip If You Can Combine: A Simple Knowledge Distillation Technique for Intermediate Layers | Code | 0
Revisiting Knowledge Distillation: An Inheritance and Exploration Framework | Code | 0

Benchmark Results

Results are as claimed by the respective papers (T: teacher model, S: student model); none have yet been independently verified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified