SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized; distilling that knowledge into a smaller student model can retain much of the teacher's accuracy at a fraction of the inference cost, making deployment on constrained hardware practical.
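For concreteness, below is a minimal sketch of the classic temperature-scaled soft-target loss of Hinton et al. (2015), assuming PyTorch. The names `distillation_loss`, the temperature `T`, and the mixing weight `alpha` are illustrative choices, not taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term that matches the
    student's softened predictions to the teacher's softened predictions."""
    # Soft targets: both distributions are softened with temperature T.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term's magnitude
    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Example usage with dummy logits: batch of 8 samples, 100 classes.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

Many of the papers listed below replace or augment this logit-matching term (e.g. with feature, relation, or attention distillation), but the teacher/student framing is the same.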

Papers

Showing 3801–3850 of 4240 papers

Title | Status | Hype
Modality-Independent Brain Lesion Segmentation with Privacy-aware Continual Learning | Code | 0
Squeezed Deep 6DoF Object Detection Using Knowledge Distillation | Code | 0
Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer | Code | 0
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection | Code | 0
Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks | Code | 0
Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA | Code | 0
Closest Neighbors are Harmful for Lightweight Masked Auto-encoders | Code | 0
CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation | Code | 0
RAIN: RegulArization on Input and Network for Black-Box Domain Adaptation | Code | 0
Reprogramming Distillation for Medical Foundation Models | Code | 0
Model Compression Techniques in Biometrics Applications: A Survey | Code | 0
Unsupervised Training of a Dynamic Context-Aware Deep Denoising Framework for Low-Dose Fluoroscopic Imaging | Code | 0
Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression | Code | 0
Class Incremental Fault Diagnosis under Limited Fault Data via Supervised Contrastive Knowledge Distillation | Code | 0
StableKD: Breaking Inter-block Optimization Entanglement for Stable Knowledge Distillation | Code | 0
Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer | Code | 0
Distilling Implicit Multimodal Knowledge into Large Language Models for Zero-Resource Dialogue Generation | Code | 0
Distilling Image Dehazing With Heterogeneous Task Imitation | Code | 0
Enhancing Heterogeneous Federated Learning with Knowledge Extraction and Multi-Model Fusion | Code | 0
Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs | Code | 0
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation | Code | 0
Data Efficient Stagewise Knowledge Distillation | Code | 0
Weight-Inherited Distillation for Task-Agnostic BERT Compression | Code | 0
StatsMerging: Statistics-Guided Model Merging via Task-Specific Teacher Distillation | Code | 0
Why Not Transform Chat Large Language Models to Non-English? | Code | 0
Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems | Code | 0
Distilling Global and Local Logits With Densely Connected Relations | Code | 0
UPFL: Unsupervised Personalized Federated Learning towards New Clients | Code | 0
GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning | Code | 0
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding | Code | 0
Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection | Code | 0
Distilling and Transferring Knowledge via cGAN-generated Samples for Image Classification and Regression | Code | 0
MoMA: Momentum Contrastive Learning with Multi-head Attention-based Knowledge Distillation for Histopathology Image Analysis | Code | 0
Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation | Code | 0
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples | Code | 0
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices | Code | 0
Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding | Code | 0
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing | Code | 0
Automatic Assignment of Radiology Examination Protocols Using Pre-trained Language Models with Knowledge Distillation | Code | 0
Graph Knowledge Distillation to Mixture of Experts | Code | 0
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments | Code | 0
Graph Entropy Minimization for Semi-supervised Node Classification | Code | 0
Rethinking Intermediate Layers design in Knowledge Distillation for Kidney and Liver Tumor Segmentation | Code | 0
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search | Code | 0
Graph-based Knowledge Distillation by Multi-head Attention Network | Code | 0
Gradient Knowledge Distillation for Pre-trained Language Models | Code | 0
MSE-Optimal Neural Network Initialization via Layer Fusion | Code | 0
Automatic adaptation of object detectors to new domains using self-training | Code | 0
MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition | Code | 0
STKDRec: Spatial-Temporal Knowledge Distillation for Takeaway Recommendation | Code | 0
Page 77 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified