SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, that capacity is often not fully utilized, so a compact student model can frequently be trained to recover much of the teacher's accuracy at a fraction of the inference cost.
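
For orientation, most of the methods listed below build on the standard response-based recipe: the student is trained to match the teacher's temperature-softened output distribution in addition to the ground-truth labels. A minimal PyTorch sketch of that baseline loss is shown below; the temperature and weighting values are illustrative defaults, not taken from any paper on this page.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style response-based distillation loss (baseline sketch).

    Soft term: KL divergence between the temperature-softened teacher and
    student distributions, scaled by T^2 so gradient magnitudes stay
    comparable across temperatures. Hard term: ordinary cross-entropy on
    the ground-truth labels. T and alpha are illustrative defaults.
    """
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

Feature-based and relation-based variants, which include many of the papers below, replace or supplement this soft-target term with losses on intermediate activations or on relations between examples.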

Papers

Showing 451–500 of 4240 papers

Title | Status | Hype
Graph-less Collaborative Filtering | Code | 1
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1
Graph Relation Distillation for Efficient Biomedical Instance Segmentation | Code | 1
Data-Free Network Quantization With Adversarial Knowledge Distillation | Code | 1
HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images | Code | 1
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval | Code | 1
Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection | Code | 1
BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation | Code | 1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning | Code | 1
Black-box Few-shot Knowledge Distillation | Code | 1
Adaptive Multi-Teacher Multi-level Knowledge Distillation | Code | 1
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models | Code | 1
Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs | Code | 1
Understanding the Role of the Projector in Knowledge Distillation | Code | 1
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation | Code | 1
Data-Free Knowledge Distillation for Heterogeneous Federated Learning | Code | 1
Human-Inspired Multi-Agent Navigation using Knowledge Distillation | Code | 1
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation | Code | 1
I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval | Code | 1
iDAT: inverse Distillation Adapter-Tuning | Code | 1
Implicit Chain of Thought Reasoning via Knowledge Distillation | Code | 1
Improved Techniques for Training Adaptive Deep Networks | Code | 1
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors | Code | 1
Boosting Light-Weight Depth Estimation Via Knowledge Distillation | Code | 1
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation | Code | 1
DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection | Code | 1
Data Diversification: A Simple Strategy For Neural Machine Translation | Code | 1
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving | Code | 1
Improving Simultaneous Machine Translation with Monolingual Data | Code | 1
Data Efficient Language-supervised Zero-shot Recognition with Optimal Transport Distillation | Code | 1
Incremental Multi-Target Domain Adaptation for Object Detection with Efficient Domain Transfer | Code | 1
Advantage-Guided Distillation for Preference Alignment in Small Language Models | Code | 1
Information Theoretic Representation Distillation | Code | 1
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings | Code | 1
Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning | Code | 1
Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection | Code | 1
Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model | Code | 1
BPKD: Boundary Privileged Knowledge Distillation For Semantic Segmentation | Code | 1
Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation | Code | 1
Discriminative and Consistent Representation Distillation | Code | 1
Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation | Code | 1
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection | Code | 1
CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers | Code | 1
APSNet: Attention Based Point Cloud Sampling | Code | 1
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection | Code | 1
KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation | Code | 1
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners | Code | 1
Page 10 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified