Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1451–1500 of 4240 papers

Title	Date	Tasks	Status	Score
InDistill: Information flow-preserving knowledge distillation for model compression	May 20, 2022	Knowledge DistillationModel Compression	CodeCode Available	5
Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation	Apr 3, 2023	Domain GeneralizationKnowledge Distillation	CodeCode Available	5
Hybrid Attention Model Using Feature Decomposition and Knowledge Distillation for Glucose Forecasting	Nov 16, 2024	Knowledge Distillation	CodeCode Available	5
Hybrid Data-Free Knowledge Distillation	Dec 18, 2024	Data-free Knowledge DistillationGenerative Adversarial Network	CodeCode Available	5
Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation	Dec 10, 2021	Extractive SummarizationKnowledge Distillation	CodeCode Available	5
Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching	Sep 13, 2022	Contrastive LearningKnowledge Distillation	CodeCode Available	5
HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation	Mar 18, 2024	Knowledge DistillationNER	CodeCode Available	5
Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance	Feb 10, 2024	Computational EfficiencyKnowledge Distillation	CodeCode Available	5
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression	Oct 16, 2021	Few-Shot LearningKnowledge Distillation	CodeCode Available	5
DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation	May 26, 2025	Knowledge Distillation	CodeCode Available	5
Context Unaware Knowledge Distillation for Image Retrieval	Jul 19, 2022	Image RetrievalKnowledge Distillation	CodeCode Available	5
Does Training with Synthetic Data Truly Protect Privacy?	Feb 18, 2025	Data-free Knowledge DistillationDataset Distillation	CodeCode Available	5
HTR-JAND: Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation	Dec 24, 2024	Computational EfficiencyHandwritten Text Recognition	CodeCode Available	5
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems	Jun 21, 2019	Dialogue EvaluationKnowledge Distillation	CodeCode Available	5
Dynamic Data-Free Knowledge Distillation by Easy-to-Hard Learning Strategy	Aug 29, 2022	Data-free Knowledge DistillationKnowledge Distillation	CodeCode Available	5
CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective	Apr 22, 2024	Contrastive Learningimage-classification	CodeCode Available	5
DMSSN: Distilled Mixed Spectral-Spatial Network for Hyperspectral Salient Object Detection	Mar 31, 2024	Dimensionality ReductionKnowledge Distillation	CodeCode Available	5
Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels	Sep 10, 2023	Human-Object Interaction DetectionKnowledge Distillation	CodeCode Available	5
Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution	Aug 5, 2024	ClassificationDiversity	CodeCode Available	5
How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition	Aug 30, 2024	Face RecognitionFairness	CodeCode Available	5
How to Train the Teacher Model for Effective Knowledge Distillation	Jul 25, 2024	Knowledge Distillation	CodeCode Available	5
HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution	Aug 30, 2024	Image Super-ResolutionKnowledge Distillation	CodeCode Available	5
Highlight Every Step: Knowledge Distillation via Collaborative Teaching	Jul 23, 2019	Knowledge Distillation	CodeCode Available	5
Advancing Compressed Video Action Recognition through Progressive Knowledge Distillation	Jul 2, 2024	Action RecognitionKnowledge Distillation	CodeCode Available	5
Holistic White-light Polyp Classification via Alignment-free Dense Distillation of Auxiliary Optical Chromoendoscopy	May 25, 2025	DiagnosticKnowledge Distillation	CodeCode Available	5
Applying Knowledge Distillation to Improve Weed Mapping With Drones	Oct 8, 2023	Knowledge DistillationManagement	CodeCode Available	5
Chemical transformer compression for accelerating both training and inference of molecular modeling	May 16, 2022	Knowledge DistillationModel Compression	CodeCode Available	5
Distribution Aligned Semantics Adaption for Lifelong Person Re-Identification	May 30, 2024	Knowledge DistillationPerson Re-Identification	CodeCode Available	5
Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation	Nov 29, 2019	Knowledge Distillationreinforcement-learning	CodeCode Available	5
Exploring Hyperspectral Anomaly Detection with Human Vision: A Small Target Aware Detector	Jan 2, 2024	Anomaly DetectionKnowledge Distillation	CodeCode Available	5
TinyBERT: Distilling BERT for Natural Language Understanding	Sep 23, 2019	Knowledge DistillationLanguage Modelling	CodeCode Available	5
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification	Jul 10, 2024	Computational Efficiencyimage-classification	CodeCode Available	5
Induced Model Matching: How Restricted Models Can Help Larger Ones	Feb 19, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available	5
Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax	May 28, 2021	Data AugmentationKnowledge Distillation	CodeCode Available	5
Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding	Dec 27, 2023	3D Classification3D Shape Recognition	CodeCode Available	5
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples	May 13, 2023	BinarizationKnowledge Distillation	CodeCode Available	5
GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning	Oct 20, 2024	Image RetrievalImage-text Retrieval	CodeCode Available	5
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing	May 31, 2021	Knowledge DistillationUnsupervised Pre-training	CodeCode Available	5
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation	Jun 12, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	5
Exploring Social Media for Early Detection of Depression in COVID-19 Patients	Feb 23, 2023	Knowledge Distillation	CodeCode Available	5
Exploring Target Representations for Masked Autoencoders	Sep 8, 2022	Image ClassificationInstance Segmentation	CodeCode Available	5
Graph Knowledge Distillation to Mixture of Experts	Jun 17, 2024	Knowledge DistillationMixture-of-Experts	CodeCode Available	5
Graph-based Knowledge Distillation by Multi-head Attention Network	Jul 4, 2019	Inductive BiasKnowledge Distillation	CodeCode Available	5
Graph Entropy Minimization for Semi-supervised Node Classification	May 31, 2023	ClassificationKnowledge Distillation	CodeCode Available	5
Distill n' Explain: explaining graph neural networks using simple surrogates	Mar 17, 2023	Knowledge Distillation	CodeCode Available	5
GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision	Apr 7, 2025	Attributeclass-incremental learning	CodeCode Available	5
Distilling Virtual Examples for Long-tailed Recognition	Mar 28, 2021	Knowledge DistillationLong-tail Learning	CodeCode Available	5
Distilling Universal and Joint Knowledge for Cross-Domain Model Compression on Time Series Data	Jul 7, 2023	Knowledge DistillationModel Compression	CodeCode Available	5
Spending Your Winning Lottery Better After Drawing It	Jan 8, 2021	Knowledge Distillation	CodeCode Available	5
A Dual-Contrastive Framework for Low-Resource Cross-Lingual Named Entity Recognition	Apr 2, 2022	Contrastive LearningCross-Lingual NER	CodeCode Available	5

Show:10 25 50

← PrevPage 30 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified