
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized, so a compact student model trained to mimic the large teacher's outputs can recover much of the teacher's accuracy at a fraction of the inference cost.
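The classic recipe for this transfer (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output distribution alongside the ground-truth labels. A minimal PyTorch sketch, assuming both models are classifiers returning raw logits; the names distillation_loss, T, and alpha are illustrative, not taken from any particular paper's code:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target knowledge distillation loss: KL divergence between the
    temperature-softened teacher and student distributions, blended with
    ordinary hard-label cross-entropy."""
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps soft-target gradient magnitudes comparable across temperatures
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Typical training step: the teacher is frozen, only the student is updated.
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
# loss.backward()
```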

Papers

Showing 2001–2050 of 4240 papers

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Industry Scale Semi-Supervised Learning for Natural Language Understanding
Bi-CryptoNets: Leveraging Different-Level Privacy for Encrypted Inference
InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation
AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange
Graph Representation Learning via Multi-task Knowledge Distillation
Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation
DDK: Distilling Domain Knowledge for Efficient Large Language Models
DCSNet: A Lightweight Knowledge Distillation-Based Model with Explainable AI for Lung Cancer Diagnosis from Histopathological Images
Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer
ALP-KD: Attention-Based Layer Projection for Knowledge Distillation
Initial Classifier Weights Replay for Memoryless Class Incremental Learning
Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation
DC-CCL: Device-Cloud Collaborative Controlled Learning for Large Vision Models
Graph-Based Cross-Domain Knowledge Distillation for Cross-Dataset Text-to-Image Person Retrieval
Injecting Spatial Information for Monaural Speech Enhancement via Knowledge Distillation
Inplace knowledge distillation with teacher assistant for improved training of flexible deep neural networks
In-situ animal behavior classification using knowledge distillation and fixed-point quantization
Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks
Granite Embedding Models
Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models
Gradient Reweighting: Towards Imbalanced Class-Incremental Learning
Gradient-Guided Knowledge Distillation for Object Detectors
In Teacher We Trust: Learning Compressed Models for Pedestrian Detection
Data Techniques For Online End-to-end Speech Recognition
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models
Gradient Adversarial Training of Neural Networks
Integration of Pre-trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding
GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation
Mining Data Impressions from Deep Models as Substitute for the Unavailable Training Data
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics
Adaptive Label Smoothing with Self-Knowledge
Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition
Intermediate Distillation: Data-Efficient Distillation from Black-Box LLMs for Information Retrieval
Interpretable discovery of new semiconductors with machine learning
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
Interpretable Foreground Object Search As Knowledge Distillation
Interpretable Traces, Unexpected Outcomes: Investigating the Disconnect in Trace-Based Knowledge Distillation
Data-Free Knowledge Transfer: A Survey
GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation
Data-Free Knowledge Distillation with Soft Targeted Transfer Set Synthesis
Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images
Local-Global Knowledge Distillation in Heterogeneous Federated Learning with Non-IID Data
Global Intervention and Distillation for Federated Out-of-Distribution Generalization
Beyond Classification: Knowledge Distillation using Multi-Object Impressions
All You Need in Knowledge Distillation Is a Tailored Coordinate System
Adaptive Knowledge Distillation for Classification of Hand Images using Explainable Vision Transformers

Benchmark Results

In the benchmark tables below, "T:" identifies the teacher model and "S:" the student; a dash in the Verified column means no independently verified result has been recorded, hence the Unverified status.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP ViT-B/16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified
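"Top-1 accuracy (%)", the metric in the first two tables, is the fraction of evaluation samples whose highest-scoring predicted class matches the label. A minimal sketch of how a claimed number could be re-measured for verification, assuming a trained student checkpoint and a standard evaluation DataLoader (both hypothetical here):

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cuda"):
    """Percentage of samples whose argmax prediction matches the label."""
    model.eval().to(device)
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=-1)  # highest-scoring class per sample
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return 100.0 * correct / total
```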