Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1351–1400 of 4240 papers

Title	Date	Tasks	Status	Score
Inter-Domain Alignment for Predicting High-Resolution Brain Networks Using Teacher-Student Learning	Oct 6, 2021	DecoderDomain Adaptation	CodeCode Available	5
Dynamic Rectification Knowledge Distillation	Jan 27, 2022	Edge-computingKnowledge Distillation	CodeCode Available	5
Interpretable Embedding Procedure Knowledge Transfer via Stacked Principal Component Analysis and Graph Neural Network	Apr 28, 2021	Graph Neural NetworkKnowledge Distillation	CodeCode Available	5
Induced Model Matching: How Restricted Models Can Help Larger Ones	Feb 19, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available	5
InDistill: Information flow-preserving knowledge distillation for model compression	May 20, 2022	Knowledge DistillationModel Compression	CodeCode Available	5
Induced Model Matching: Restricted Models Help Train Full-Featured Models	Jan 15, 2025	Knowledge DistillationLanguage Modeling	CodeCode Available	5
Asymmetric Masked Distillation for Pre-Training Small Foundation Models	Nov 6, 2023	Action ClassificationAction Recognition	CodeCode Available	5
Learning to "Segment Anything" in Thermal Infrared Images through Knowledge Distillation with a Large Scale Dataset SATIR	Apr 17, 2023	Image SegmentationKnowledge Distillation	CodeCode Available	5
Closest Neighbors are Harmful for Lightweight Masked Auto-encoders	Jan 1, 2025	Knowledge Distillation	CodeCode Available	5
3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection	Jul 12, 2024	Knowledge DistillationSocial Media Mental Health Detection	CodeCode Available	5
Complex Facial Expression Recognition Using Deep Knowledge Distillation of Basic Features	Aug 11, 2023	Continual LearningEmotion Recognition	CodeCode Available	5
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition	Jul 16, 2025	BenchmarkingKnowledge Distillation	CodeCode Available	5
Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data	Oct 10, 2023	Knowledge Distillation	CodeCode Available	5
Leveraging Foundation Models via Knowledge Distillation in Multi-Object Tracking: Distilling DINOv2 Features to FairMOT	Jul 25, 2024	Knowledge DistillationMulti-Object Tracking	CodeCode Available	5
Distilling Knowledge by Mimicking Features	Nov 3, 2020	Knowledge Distillationobject-detection	CodeCode Available	5
Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets	May 24, 2024	Knowledge DistillationMulti-Task Learning	CodeCode Available	5
Incorporating Graph Information in Transformer-based AMR Parsing	Jun 23, 2023	Abstract Meaning RepresentationAMR Parsing	CodeCode Available	5
Improving Stance Detection with Multi-Dataset Learning and Knowledge Distillation	Nov 1, 2021	Knowledge DistillationStance Detection	CodeCode Available	5
UNIKD: UNcertainty-filtered Incremental Knowledge Distillation for Neural Implicit Representation	Dec 21, 2022	3D ReconstructionIncremental Learning	CodeCode Available	5
Improving Question Answering Performance Using Knowledge Distillation and Active Learning	Sep 26, 2021	Active LearningKnowledge Distillation	CodeCode Available	5
Improving Respiratory Sound Classification with Architecture-Agnostic Knowledge Distillation from Ensembles	May 28, 2025	Knowledge DistillationSound Classification	CodeCode Available	5
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training	May 3, 2023	Knowledge DistillationText Generation	CodeCode Available	5
Adversarial Moment-Matching Distillation of Large Language Models	Jun 5, 2024	Imitation LearningInstruction Following	CodeCode Available	5
Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition	Nov 9, 2021	Continual LearningKnowledge Distillation	CodeCode Available	5
Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism	Oct 19, 2020	DecoderKnowledge Distillation	CodeCode Available	5
Improving Neural Architecture Search Image Classifiers via Ensemble Learning	Mar 14, 2019	Ensemble LearningImage Classification	CodeCode Available	5
CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation	Feb 24, 2025	3D Instance SegmentationContinual Learning	CodeCode Available	5
Improving generalizability of distilled self-supervised speech processing models under distorted settings	Oct 14, 2022	Knowledge Distillation	CodeCode Available	5
Improving Knowledge Distillation via Transferring Learning Ability	Apr 24, 2023	Knowledge Distillation	CodeCode Available	5
Improving Robustness by Enhancing Weak Subnets	Jan 30, 2022	Adversarial RobustnessData Augmentation	CodeCode Available	5
Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation	Dec 9, 2023	Adversarial RobustnessFairness	CodeCode Available	5
Dual Correction Strategy for Ranking Distillation in Top-N Recommender System	Sep 8, 2021	Knowledge DistillationRecommendation Systems	CodeCode Available	5
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts	Jul 17, 2023	automatic-speech-translationImitation Learning	CodeCode Available	5
Improved Knowledge Distillation via Teacher Assistant	Feb 9, 2019	Knowledge Distillation	CodeCode Available	5
Improved Knowledge Distillation for Crowd Counting on IoT Device	Aug 2, 2023	Crowd CountingKnowledge Distillation	CodeCode Available	5
IE-GAN: An Improved Evolutionary Generative Adversarial Network Using a New Fitness Function and a Generic Crossover Operator	Jul 25, 2021	Evolutionary AlgorithmsGenerative Adversarial Network	CodeCode Available	5
Improving Neural Topic Models with Wasserstein Knowledge Distillation	Mar 27, 2023	Knowledge DistillationTopic Models	CodeCode Available	5
Locally Differentially Private Distributed Deep Learning via Knowledge Distillation	Feb 7, 2022	Deep LearningKnowledge Distillation	CodeCode Available	5
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation	Sep 22, 2021	cross-modal alignmentKnowledge Distillation	CodeCode Available	5
DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment	Jul 4, 2024	Data AugmentationImage Quality Assessment	CodeCode Available	5
DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models	Sep 23, 2024	Knowledge DistillationTransfer Learning	CodeCode Available	5
DS_FusionNet: Dynamic Dual-Stream Fusion with Bidirectional Knowledge Distillation for Plant Disease Recognition	Apr 29, 2025	Fine-Grained Image Classificationimage-classification	CodeCode Available	5
Hybrid Attention Model Using Feature Decomposition and Knowledge Distillation for Glucose Forecasting	Nov 16, 2024	Knowledge Distillation	CodeCode Available	5
DROP: Poison Dilution via Knowledge Distillation for Federated Learning	Feb 10, 2025	Data PoisoningFederated Learning	CodeCode Available	5
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search	Jan 13, 2020	Knowledge DistillationNeural Architecture Search	CodeCode Available	5
Hybrid Data-Free Knowledge Distillation	Dec 18, 2024	Data-free Knowledge DistillationGenerative Adversarial Network	CodeCode Available	5
Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation	Dec 10, 2021	Extractive SummarizationKnowledge Distillation	CodeCode Available	5
Enhancing New-item Fairness in Dynamic Recommender Systems	Apr 30, 2025	FairnessKnowledge Distillation	CodeCode Available	5
HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation	Mar 18, 2024	Knowledge DistillationNER	CodeCode Available	5
Do You Remember . . . the Future? Weak-to-Strong generalization in 3D Object Detection	Aug 3, 2024	3D Object DetectionKnowledge Distillation	CodeCode Available	5

Show:10 25 50

← PrevPage 28 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified