
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models, such as very deep neural networks or ensembles of many models, have higher knowledge capacity than small models, this capacity may not be fully utilized. Distillation exploits this by training the student to reproduce the teacher's behavior, for example by matching its output distribution, so that much of the teacher's accuracy is retained at a far lower inference cost.
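
The classic response-based formulation (Hinton et al., 2015) trains the student on a weighted combination of the usual hard-label cross-entropy and a KL-divergence term between temperature-softened teacher and student outputs. Below is a minimal PyTorch sketch of that loss; the temperature T, the weight alpha, and the function and variable names are illustrative assumptions, not taken from any particular paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term: KL divergence between the temperature-softened
    # student and teacher distributions, scaled by T^2 so its gradient
    # magnitude stays comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha balances imitating the teacher against fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard

# Example usage (illustrative; the teacher is run without gradient tracking):
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```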

Papers

Showing 601–650 of 4240 papers

Title | Status | Hype
Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation | Code | 1
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection | Code | 1
IDEAL: Query-Efficient Data-Free Learning from Black-box Models | Code | 1
Knowledge Distillation via the Target-aware Transformer | Code | 1
Knowledge Distillation from A Stronger Teacher | Code | 1
Exploring Extreme Parameter Compression for Pre-trained Language Models | Code | 1
Directed Acyclic Transformer for Non-Autoregressive Machine Translation | Code | 1
Knowledge Distillation Meets Open-Set Semi-Supervised Learning | Code | 1
DistilProtBert: A distilled protein language model used to distinguish between real proteins and their randomly shuffled counterparts | Code | 1
Spot-adaptive Knowledge Distillation | Code | 1
Nearest Neighbor Knowledge Distillation for Neural Machine Translation | Code | 1
Curriculum Learning for Dense Retrieval Distillation | Code | 1
Conformer and Blind Noisy Students for Improved Image Quality Assessment | Code | 1
Proto2Proto: Can you recognize the car, the way I do? | Code | 1
On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation | Code | 1
Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation | Code | 1
Modeling Missing Annotations for Incremental Learning in Object Detection | Code | 1
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation | Code | 1
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Code | 1
LRH-Net: A Multi-Level Knowledge Distillation Approach for Low-Resource Heart Network | Code | 1
Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation | Code | 1
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation | Code | 1
End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation | Code | 1
Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings | Code | 1
Feature Structure Distillation with Centered Kernel Alignment in BERT Transferring | Code | 1
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher | Code | 1
Self-Distillation from the Last Mini-Batch for Consistency Regularization | Code | 1
Rainbow Keywords: Efficient Incremental Learning for Online Spoken Keyword Spotting | Code | 1
Monitored Distillation for Positive Congruent Depth Completion | Code | 1
Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection | Code | 1
Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation | Code | 1
Knowledge Distillation with the Reused Teacher Classifier | Code | 1
PCA-Based Knowledge Distillation Towards Lightweight and Content-Style Balanced Photorealistic Style Transfer Models | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Rich Feature Construction for the Optimization-Generalization Dilemma | Code | 1
Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction | Code | 1
R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning | Code | 1
SSD-KD: A Self-supervised Diverse Knowledge Distillation Method for Lightweight Skin Lesion Classification Using Dermoscopic Images | Code | 1
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization | Code | 1
Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation | Code | 1
Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation | Code | 1
Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning | Code | 1
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation | Code | 1
Graph Flow: Cross-layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation | Code | 1
SATS: Self-Attention Transfer for Continual Semantic Segmentation | Code | 1
Unified Visual Transformer Compression | Code | 1
Representation Compensation Networks for Continual Semantic Segmentation | Code | 1
Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability | Code | 1
Prediction-Guided Distillation for Dense Object Detection | Code | 1
Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation | Code | 1
Page 13 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified