SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
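
The most common instantiation trains the small "student" to match the softened output distribution of the large "teacher" alongside the usual hard-label loss. Below is a minimal PyTorch sketch of this response-based distillation loss (in the spirit of the classic soft-target formulation of Hinton et al.); the function name `distillation_loss`, the temperature `T`, and the mixing weight `alpha` are illustrative assumptions, not something defined on this page.

```python
# Minimal sketch of response-based knowledge distillation (soft targets + hard labels).
# Hypothetical names: distillation_loss, T (temperature), alpha (mixing weight).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soften both distributions with temperature T; the T*T factor keeps the
    # gradient magnitude of the soft term comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Typical usage: the teacher is frozen and only the student receives gradients.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```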

Papers

Showing 1301–1350 of 4240 papers

Title | Status | Hype
Collaborative Deep Reinforcement Learning | Code | 0
KDMOS:Knowledge Distillation for Motion Segmentation | Code | 0
Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation | Code | 0
Joint Pre-training and Local Re-training: Transferable Representation Learning on Multi-source Knowledge Graphs | Code | 0
Few Sample Knowledge Distillation for Efficient Network Compression | Code | 0
Improved Knowledge Distillation via Full Kernel Matrix Transfer | Code | 0
Leveraging Large Language Models for Active Merchant Non-player Characters | Code | 0
Cogni-Net: Cognitive Feature Learning through Deep Visual Perception | Code | 0
Invariant debiasing learning for recommendation via biased imputation | Code | 0
Knowledge Distillation For Wireless Edge Learning | Code | 0
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation | Code | 0
Adversarial Teacher-Student Representation Learning for Domain Generalization | Code | 0
Intra-class Patch Swap for Self-Distillation | Code | 0
Interpretable Embedding Procedure Knowledge Transfer via Stacked Principal Component Analysis and Graph Neural Network | Code | 0
Interpreting and Disentangling Feature Components of Various Complexity from DNNs | Code | 0
Efficient Multitask Dense Predictor via Binarization | Code | 0
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition | Code | 0
Inter-Domain Alignment for Predicting High-Resolution Brain Networks Using Teacher-Student Learning | Code | 0
Interpreting Microbiome Relative Abundance Data Using Symbolic Regression | Code | 0
Instance Temperature Knowledge Distillation | Code | 0
EaSyGuide : ESG Issue Identification Framework leveraging Abilities of Generative Large Language Models | Code | 0
CL-XABSA: Contrastive Learning for Cross-lingual Aspect-based Sentiment Analysis | Code | 0
Assessor-Guided Learning for Continual Environments | Code | 0
A Flexible Multi-Task Model for BERT Serving | Code | 0
Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism | Code | 0
DynaMMo: Dynamic Model Merging for Efficient Class Incremental Learning for Medical Images | Code | 0
Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering | Code | 0
Induced Model Matching: Restricted Models Help Train Full-Featured Models | Code | 0
InDistill: Information flow-preserving knowledge distillation for model compression | Code | 0
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance | Code | 0
Distilling Knowledge by Mimicking Features | Code | 0
Induced Model Matching: How Restricted Models Can Help Larger Ones | Code | 0
Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning | Code | 0
Knowledge Extraction with No Observable Data | Code | 0
Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition | Code | 0
PruMUX: Augmenting Data Multiplexing with Model Compression | Code | 0
Dynamic Rectification Knowledge Distillation | Code | 0
Incorporating Graph Information in Transformer-based AMR Parsing | Code | 0
UNIKD: UNcertainty-filtered Incremental Knowledge Distillation for Neural Implicit Representation | Code | 0
Improving Question Answering Performance Using Knowledge Distillation and Active Learning | Code | 0
Improving Neural Topic Models with Wasserstein Knowledge Distillation | Code | 0
Improving Respiratory Sound Classification with Architecture-Agnostic Knowledge Distillation from Ensembles | Code | 0
Closest Neighbors are Harmful for Lightweight Masked Auto-encoders | Code | 0
Improving Neural Architecture Search Image Classifiers via Ensemble Learning | Code | 0
3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection | Code | 0
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Code | 0
Improving Stance Detection with Multi-Dataset Learning and Knowledge Distillation | Code | 0
Improving generalizability of distilled self-supervised speech processing models under distorted settings | Code | 0
Improving Robustness by Enhancing Weak Subnets | Code | 0
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts | Code | 0
Page 27 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | — | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | — | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | — | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | — | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | — | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | — | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | — | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | — | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | — | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | — | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | — | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | — | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | — | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | — | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | — | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | — | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | — | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | — | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | — | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | — | Unverified