SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, so a smaller student model trained to mimic the larger teacher can often recover most of its accuracy at a fraction of the inference cost.
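
The papers below cover many variants (feature-level, relational, contrastive, data-free), but most build on the same logit-matching objective popularized by Hinton et al. The following is a minimal, illustrative PyTorch sketch of that loss; the function name, temperature, and weighting are assumptions chosen for illustration and are not taken from any specific paper listed here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend a soft-target KL term with the usual hard-label cross-entropy."""
    # Soften both distributions with the temperature before matching them.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale the KL term by T^2 so its gradient magnitude stays comparable
    # to the cross-entropy term as the temperature changes.
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # The hard-label term keeps the student anchored to the ground truth.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Illustrative usage with random tensors (batch of 8, 100 classes).
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```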

Papers

Showing 351-400 of 4240 papers

Title | Status | Hype
Distilling Knowledge from Refinement in Multiple Instance Detection Networks | Code | 1
Distilling Knowledge via Knowledge Review | Code | 1
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance | Code | 1
Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation | Code | 1
Complementary Relation Contrastive Distillation | Code | 1
A semi-supervised Teacher-Student framework for surgical tool detection and localization | Code | 1
Attention Weighted Local Descriptors | Code | 1
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval | Code | 1
Dynamic Knowledge Distillation for Pre-trained Language Models | Code | 1
Domain Consistency Representation Learning for Lifelong Person Re-Identification | Code | 1
AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning | Code | 1
Audio Embeddings as Teachers for Music Classification | Code | 1
CaMEL: Mean Teacher Learning for Image Captioning | Code | 1
Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Code | 1
CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection | Code | 1
Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space | Code | 1
Action knowledge for video captioning with graph neural networks | Code | 1
Conformer and Blind Noisy Students for Improved Image Quality Assessment | Code | 1
AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation | Code | 1
CoNMix for Source-free Single and Multi-target Domain Adaptation | Code | 1
Camera clustering for scalable stream-based active distillation | Code | 1
Consistent Representation Learning for Continual Relation Extraction | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
Content-Aware GAN Compression | Code | 1
Channel Gating Neural Networks | Code | 1
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability | Code | 1
AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Code | 1
Continual All-in-One Adverse Weather Removal with Knowledge Replay on a Unified Network Structure | Code | 1
Continual Collaborative Distillation for Recommender System | Code | 1
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? | Code | 1
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks | Code | 1
Continual evaluation for lifelong learning: Identifying the stability gap | Code | 1
Distilling Visual Priors from Self-Supervised Learning | Code | 1
Distilling Dense Representations for Ranking using Tightly-Coupled Teachers | Code | 1
Contrastive Model Inversion for Data-Free Knowledge Distillation | Code | 1
Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification | Code | 1
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation | Code | 1
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation | Code | 1
Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos | Code | 1
Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation | Code | 1
CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade | Code | 1
Cross-Layer Distillation with Semantic Calibration | Code | 1
Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty | Code | 1
A Knowledge Distillation Framework For Enhancing Ear-EEG Based Sleep Staging With Scalp-EEG Data | Code | 1
Cross-category Video Highlight Detection via Set-based Learning | Code | 1
Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | Code | 1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | Code | 1
Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection | Code | 1
Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method | Code | 1
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection | Code | 1
Page 8 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | - | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified