
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, that capacity is often not fully utilized; a compact student trained to mimic the larger teacher can therefore approach the teacher's accuracy at a fraction of the inference cost.
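
In its most common formulation (Hinton et al.'s soft-target distillation), the student minimizes a weighted sum of the usual cross-entropy on hard labels and a KL-divergence term that pulls its temperature-softened predictions toward the teacher's. Below is a minimal PyTorch sketch of that loss; the function name and the hyperparameter values (temperature 4.0, alpha 0.9) are illustrative choices, not taken from any paper listed on this page.

```python
# Minimal sketch of classic soft-target knowledge distillation
# (Hinton et al., 2015). Helper name and hyperparameters are
# illustrative, not from any specific paper in the list below.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Blend soft teacher targets with hard ground-truth labels."""
    # Soften both distributions with the temperature so the teacher's
    # "dark knowledge" (relative probabilities of wrong classes) is exposed.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(student_log_probs, soft_targets,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Example: a batch of 8 samples over 10 classes.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)  # teacher output, no gradient needed
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```

Many of the papers below vary this recipe (feature-level, contrastive, cross-modal, or data-free distillation), but the teacher-to-student transfer objective above is the common starting point.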

Papers

Showing papers 151–200 of 4240 (page 4 of 85)

Title | Status | Hype
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning | Code | 1
Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval | Code | 1
Cross-modality Data Augmentation for End-to-End Sign Language Translation | Code | 1
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation | Code | 1
Black-box Few-shot Knowledge Distillation | Code | 1
Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation | Code | 1
Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection | Code | 1
Curriculum Learning for Dense Retrieval Distillation | Code | 1
Data-Free Class-Incremental Hand Gesture Recognition | Code | 1
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation | Code | 1
Cross-Layer Distillation with Semantic Calibration | Code | 1
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation | Code | 1
Bidirectional Distillation for Top-K Recommender System | Code | 1
Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones | Code | 1
Bi-directional Weakly Supervised Knowledge Distillation for Whole Slide Image Classification | Code | 1
BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation | Code | 1
CrossKD: Cross-Head Knowledge Distillation for Object Detection | Code | 1
Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification | Code | 1
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Code | 1
Contrastive Representation Distillation | Code | 1
BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for BEV 3D Object Detection | Code | 1
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection | Code | 1
SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection | Code | 1
Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | Code | 1
Contrastive Deep Supervision | Code | 1
Continual Learning for LiDAR Semantic Segmentation: Class-Incremental and Coarse-to-Fine strategies on Sparse Data | Code | 1
Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1
Continual evaluation for lifelong learning: Identifying the stability gap | Code | 1
BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration | Code | 1
Continual Learning for Image Segmentation with Dynamic Query | Code | 1
Contrastive Model Inversion for Data-Free Knowledge Distillation | Code | 1
Cross-category Video Highlight Detection via Set-based Learning | Code | 1
CrossMatch: Enhance Semi-Supervised Medical Image Segmentation with Perturbation Strategies and Knowledge Distillation | Code | 1
Data-Free Knowledge Distillation for Heterogeneous Federated Learning | Code | 1
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Code | 1
Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty | Code | 1
AdaptGuard: Defending Against Universal Attacks for Model Adaptation | Code | 1
Bayes Conditional Distribution Estimation for Knowledge Distillation Based on Conditional Mutual Information | Code | 1
Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression | Code | 1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1
CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade | Code | 1
A Knowledge Distillation Framework For Enhancing Ear-EEG Based Sleep Staging With Scalp-EEG Data | Code | 1
Better Estimation of the KL Divergence Between Language Models | Code | 1
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | Code | 1
Adjoined Networks: A Training Paradigm with Applications to Network Compression | Code | 1
Backdoor Attacks on Self-Supervised Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T:regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T:CLIP/ViT-B-16 S:resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T:resnet32x4 S:resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T:ResNet101 S:ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T:ResNet101 S:MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T:Adabins S:MobileNetV2) | RMSE | 2.43 | - | Unverified