Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1401–1450 of 4240 papers

Title	Date	Tasks	Status	Score
Improving Question Answering Performance Using Knowledge Distillation and Active Learning	Sep 26, 2021	Active LearningKnowledge Distillation	CodeCode Available	5
Improving Respiratory Sound Classification with Architecture-Agnostic Knowledge Distillation from Ensembles	May 28, 2025	Knowledge DistillationSound Classification	CodeCode Available	5
Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism	Oct 19, 2020	DecoderKnowledge Distillation	CodeCode Available	5
Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection	Nov 30, 2022	3D Object DetectionDepth Estimation	CodeCode Available	5
DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment	Jul 4, 2024	Data AugmentationImage Quality Assessment	CodeCode Available	5
DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models	Sep 23, 2024	Knowledge DistillationTransfer Learning	CodeCode Available	5
DS_FusionNet: Dynamic Dual-Stream Fusion with Bidirectional Knowledge Distillation for Plant Disease Recognition	Apr 29, 2025	Fine-Grained Image Classificationimage-classification	CodeCode Available	5
Improving generalizability of distilled self-supervised speech processing models under distorted settings	Oct 14, 2022	Knowledge Distillation	CodeCode Available	5
Improving Knowledge Distillation via Transferring Learning Ability	Apr 24, 2023	Knowledge Distillation	CodeCode Available	5
DROP: Poison Dilution via Knowledge Distillation for Federated Learning	Feb 10, 2025	Data PoisoningFederated Learning	CodeCode Available	5
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts	Jul 17, 2023	automatic-speech-translationImitation Learning	CodeCode Available	5
Improving Robustness by Enhancing Weak Subnets	Jan 30, 2022	Adversarial RobustnessData Augmentation	CodeCode Available	5
Ensemble diverse hypotheses and knowledge distillation for unsupervised cross-subject adaptation	Apr 15, 2022	Activity RecognitionDomain Adaptation	CodeCode Available	5
Ensemble Knowledge Distillation for Learning Improved and Efficient Networks	Sep 17, 2019	Ensemble LearningGeneral Classification	CodeCode Available	5
Improving Neural Architecture Search Image Classifiers via Ensemble Learning	Mar 14, 2019	Ensemble LearningImage Classification	CodeCode Available	5
Do You Remember . . . the Future? Weak-to-Strong generalization in 3D Object Detection	Aug 3, 2024	3D Object DetectionKnowledge Distillation	CodeCode Available	5
Improved Knowledge Distillation for Crowd Counting on IoT Device	Aug 2, 2023	Crowd CountingKnowledge Distillation	CodeCode Available	5
Improved Knowledge Distillation via Teacher Assistant	Feb 9, 2019	Knowledge Distillation	CodeCode Available	5
IE-GAN: An Improved Evolutionary Generative Adversarial Network Using a New Fitness Function and a Generic Crossover Operator	Jul 25, 2021	Evolutionary AlgorithmsGenerative Adversarial Network	CodeCode Available	5
Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation	Dec 9, 2023	Adversarial RobustnessFairness	CodeCode Available	5
Are All Linear Regions Created Equal?	Feb 23, 2022	AllKnowledge Distillation	CodeCode Available	5
Image Recognition with Online Lightweight Vision Transformer: A Survey	May 6, 2025	Knowledge DistillationSurvey	CodeCode Available	5
Hybrid Data-Free Knowledge Distillation	Dec 18, 2024	Data-free Knowledge DistillationGenerative Adversarial Network	CodeCode Available	5
Hybrid Attention Model Using Feature Decomposition and Knowledge Distillation for Glucose Forecasting	Nov 16, 2024	Knowledge Distillation	CodeCode Available	5
Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation	Dec 10, 2021	Extractive SummarizationKnowledge Distillation	CodeCode Available	5
HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation	Mar 18, 2024	Knowledge DistillationNER	CodeCode Available	5
Class Incremental Fault Diagnosis under Limited Fault Data via Supervised Contrastive Knowledge Distillation	Jan 16, 2025	Fault DiagnosisKnowledge Distillation	CodeCode Available	5
Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks	Nov 1, 2021	Dialogue State TrackingDiversity	CodeCode Available	5
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression	Oct 16, 2021	Few-Shot LearningKnowledge Distillation	CodeCode Available	5
Attention to detail: inter-resolution knowledge distillation	Jan 11, 2024	Knowledge Distillationwhole slide images	CodeCode Available	5
Domain Knowledge Transferring for Pre-trained Language Model via Calibrated Activation Boundary Distillation	May 1, 2022	Knowledge DistillationLanguage Modeling	CodeCode Available	5
How to Train the Teacher Model for Effective Knowledge Distillation	Jul 25, 2024	Knowledge Distillation	CodeCode Available	5
HTR-JAND: Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation	Dec 24, 2024	Computational EfficiencyHandwritten Text Recognition	CodeCode Available	5
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability	Jun 8, 2020	FairnessGeneral Classification	CodeCode Available	5
How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition	Aug 30, 2024	Face RecognitionFairness	CodeCode Available	5
Dynamic Data-Free Knowledge Distillation by Easy-to-Hard Learning Strategy	Aug 29, 2022	Data-free Knowledge DistillationKnowledge Distillation	CodeCode Available	5
Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation	Apr 3, 2023	Domain GeneralizationKnowledge Distillation	CodeCode Available	5
HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution	Aug 30, 2024	Image Super-ResolutionKnowledge Distillation	CodeCode Available	5
Highlight Every Step: Knowledge Distillation via Collaborative Teaching	Jul 23, 2019	Knowledge Distillation	CodeCode Available	5
Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance	Feb 10, 2024	Computational EfficiencyKnowledge Distillation	CodeCode Available	5
DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation	May 26, 2025	Knowledge Distillation	CodeCode Available	5
Does Training with Synthetic Data Truly Protect Privacy?	Feb 18, 2025	Data-free Knowledge DistillationDataset Distillation	CodeCode Available	5
Holistic White-light Polyp Classification via Alignment-free Dense Distillation of Auxiliary Optical Chromoendoscopy	May 25, 2025	DiagnosticKnowledge Distillation	CodeCode Available	5
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification	Jul 10, 2024	Computational Efficiencyimage-classification	CodeCode Available	5
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems	Jun 21, 2019	Dialogue EvaluationKnowledge Distillation	CodeCode Available	5
CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective	Apr 22, 2024	Contrastive Learningimage-classification	CodeCode Available	5
DMSSN: Distilled Mixed Spectral-Spatial Network for Hyperspectral Salient Object Detection	Mar 31, 2024	Dimensionality ReductionKnowledge Distillation	CodeCode Available	5
Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution	Aug 5, 2024	ClassificationDiversity	CodeCode Available	5
Handling Data Heterogeneity in Federated Learning via Knowledge Distillation and Fusion	Jul 23, 2022	Data-free Knowledge DistillationFairness	CodeCode Available	5
Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding	Dec 27, 2023	3D Classification3D Shape Recognition	CodeCode Available	5

Show:10 25 50

← PrevPage 29 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified