SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Even then, evaluating the large model remains expensive, so distillation trains a compact student to reproduce the teacher's behavior at a fraction of the inference cost.
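
The most common instantiation is the soft-target loss of Hinton et al. (2015): the student is trained to match the teacher's temperature-softened output distribution alongside the ground-truth labels. Below is a minimal PyTorch sketch; the function name and the temperature/mixing values (T=4.0, alpha=0.5) are illustrative choices, not settings from any paper listed on this page.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend of soft-target KD loss and standard cross-entropy (a sketch)."""
    # Teacher distribution softened by temperature T (higher T = softer).
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    # Scale the KL term by T^2 so its gradient magnitude stays comparable
    # to the hard-label term as T varies (as in Hinton et al., 2015).
    kd = F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```

Many of the papers below replace or augment this logit-matching term with feature-level or relation-level objectives (e.g., cross-layer or cross-modal distillation).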

Papers

Showing 801–850 of 4240 papers

Title | Status | Hype
Cross-category Video Highlight Detection via Set-based Learning | Code | 1
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty | Code | 1
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher | Code | 1
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | Code | 1
Data-Free Knowledge Distillation for Heterogeneous Federated Learning | Code | 1
itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection | Code | 1
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer | Code | 1
Cross-Layer Distillation with Semantic Calibration | Code | 1
Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification | Code | 1
FreeKD: Knowledge Distillation via Semantic Frequency Prompt | Code | 1
Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation | Code | 1
Is Synthetic Data From Diffusion Models Ready for Knowledge Distillation? | Code | 1
From My View to Yours: Ego-Augmented Learning in Large Vision Language Models for Understanding Exocentric Daily Living Activities | Code | 1
It's All in the Head: Representation Knowledge Distillation through Classifier Sharing | Code | 1
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval | Code | 1
CrossMatch: Enhance Semi-Supervised Medical Image Segmentation with Perturbation Strategies and Knowledge Distillation | Code | 1
Balanced Knowledge Distillation for Long-tailed Learning | Code | 1
Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval | Code | 1
Cross-modality Data Augmentation for End-to-End Sign Language Translation | Code | 1
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation | Code | 1
Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection | Code | 1
A New Knowledge Distillation Network for Incremental Few-Shot Surface Defect Detection | Code | 1
Generative Adversarial Super-Resolution at the Edge with Knowledge Distillation | Code | 1
Generative Model-based Feature Knowledge Distillation for Action Recognition | Code | 1
RankFormer: Listwise Learning-to-Rank Using Listwide Labels | Code | 1
Generative Bias for Robust Visual Question Answering | Code | 1
Aligned Structured Sparsity Learning for Efficient Image Super-Resolution | Code | 1
Generic-to-Specific Distillation of Masked Autoencoders | Code | 1
GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets | Code | 1
CSAKD: Knowledge Distillation with Cross Self-Attention for Hyperspectral and Multispectral Image Fusion | Code | 1
Intra-class Feature Variation Distillation for Semantic Segmentation | Code | 1
CTC-based Non-autoregressive Textless Speech-to-Speech Translation | Code | 1
GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates | Code | 1
Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation | Code | 1
Cumulative Spatial Knowledge Distillation for Vision Transformers | Code | 1
Curriculum Learning for Dense Retrieval Distillation | Code | 1
Curriculum Temperature for Knowledge Distillation | Code | 1
BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration | Code | 1
Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks | Code | 1
Gradient-based Intra-attention Pruning on Pre-trained Language Models | Code | 1
Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly Detection | Code | 1
Representation Compensation Networks for Continual Semantic Segmentation | Code | 1
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks | Code | 1
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation | Code | 1
Rethinking Centered Kernel Alignment in Knowledge Distillation | Code | 1
Rethinking Data Augmentation for Robust Visual Question Answering | Code | 1
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model | Code | 1
Graph-based Knowledge Distillation: A survey and experimental evaluation | Code | 1
Dark Experience for General Continual Learning: a Strong, Simple Baseline | Code | 1
Data-Free Class-Incremental Hand Gesture Recognition | Code | 1

Benchmark Results

In the Model column, T: denotes the teacher and S: the student of each distillation setup; a dash in the Verified column means no verified value has been recorded.
# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified