SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Distillation exploits this by training a compact student model to reproduce the behavior of the larger teacher, typically by matching the teacher's softened output distribution in addition to the ground-truth labels.

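As a concrete illustration, the following is a minimal sketch of the classic soft-target distillation loss (Hinton et al., 2015), assuming PyTorch; the temperature, loss weight, and the teacher/student models referenced in the usage comment are illustrative placeholders, not part of any specific method listed below.

```python
# Minimal sketch of soft-target knowledge distillation (Hinton et al., 2015).
# Assumes PyTorch; temperature and alpha values are illustrative defaults.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term that pushes the
    student's softened distribution toward the teacher's."""
    # Soften both output distributions with the same temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between softened outputs; the T^2 factor keeps its
    # gradient magnitude comparable to the hard-label term.
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2

    # Standard supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    return alpha * kd + (1.0 - alpha) * ce

# Usage inside a training step (teacher frozen, student being trained):
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```
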
Papers

Showing 451–500 of 4240 papers

Title | Status | Hype
Improved Techniques for Training Adaptive Deep Networks | Code | 1
ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian Splatting | Code | 1
Improving Continual Relation Extraction by Distinguishing Analogous Semantics | Code | 1
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation | Code | 1
Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems | Code | 1
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval | Code | 1
Enhancing Low-Density EEG-Based Brain-Computer Interfaces with Similarity-Keeping Knowledge Distillation | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection | Code | 1
BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation | Code | 1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Coaching a Teachable Student | Code | 1
Improving Simultaneous Machine Translation with Monolingual Data | Code | 1
Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method | Code | 1
Understanding the Role of the Projector in Knowledge Distillation | Code | 1
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation | Code | 1
Blockwisely Supervised Neural Architecture Search with Knowledge Distillation | Code | 1
Distilling Holistic Knowledge with Graph Neural Networks | Code | 1
Distilling Knowledge from Refinement in Multiple Instance Detection Networks | Code | 1
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability | Code | 1
Distilling Knowledge from Reader to Retriever for Question Answering | Code | 1
Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition | Code | 1
Distilling Knowledge from Self-Supervised Teacher by Embedding Graph Alignment | Code | 1
Distilling Knowledge via Knowledge Review | Code | 1
Boosting Light-Weight Depth Estimation Via Knowledge Distillation | Code | 1
Distilling Knowledge via Intermediate Classifiers | Code | 1
CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation | Code | 1
Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models | Code | 1
Distilling Meta Knowledge on Heterogeneous Graph for Illicit Drug Trafficker Detection on Social Media | Code | 1
Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking | Code | 1
Distilling Object Detectors via Decoupled Features | Code | 1
Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models | Code | 1
Distilling Object Detectors with Feature Richness | Code | 1
itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection | Code | 1
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings | Code | 1
Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning | Code | 1
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving | Code | 1
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | Code | 1
BPKD: Boundary Privileged Knowledge Distillation For Semantic Segmentation | Code | 1
Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation | Code | 1
KDAS: Knowledge Distillation via Attention Supervision Framework for Polyp Segmentation | Code | 1
KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization | Code | 1
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection | Code | 1
Distilling Visual Priors from Self-Supervised Learning | Code | 1
Advantage-Guided Distillation for Preference Alignment in Small Language Models | Code | 1
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection | Code | 1
Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation | Code | 1
APSNet: Attention Based Point Cloud Sampling | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | – | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | – | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | – | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | – | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | – | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | – | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | – | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | – | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | – | Unverified