
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. Large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, but that capacity may not be fully utilized, so a compact student can often recover much of a teacher's accuracy at a fraction of the inference cost. In the standard recipe, the student is trained to match the teacher's temperature-softened output distribution in addition to the ground-truth labels, as sketched below.
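A minimal sketch of the classic soft-target loss (Hinton et al., 2015), assuming standard PyTorch; the temperature T and mixing weight alpha here are illustrative defaults, not values taken from any paper listed below:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend temperature-softened KL against the teacher with hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),  # student's softened log-probs
        F.softmax(teacher_logits / T, dim=-1),      # teacher's softened probs
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps soft-target gradients comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```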

Papers

Showing 3501–3550 of 4240 papers

Title | Status | Hype
Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation | Code | 0
Learning from Noisy Crowd Labels with Logics | Code | 0
Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition | Code | 0
Pre-trained Summarization Distillation | Code | 0
Efficient and Robust Jet Tagging at the LHC with Knowledge Distillation | Code | 0
SeqMIA: Sequential-Metric Based Membership Inference Attack | Code | 0
SeqNAS: Neural Architecture Search for Event Sequence Classification | Code | 0
Learning Lightweight Lane Detection CNNs by Self Attention Distillation | Code | 0
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data | Code | 0
Contrastive Learning in Distilled Models | Code | 0
Language Model Knowledge Distillation for Efficient Question Answering in Spanish | Code | 0
Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias | Code | 0
KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer | Code | 0
Knowledge Transfer Graph for Deep Collaborative Learning | Code | 0
Text Representation Distillation via Information Bottleneck Principle | Code | 0
KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server | Code | 0
Online Lifelong Generalized Zero-Shot Learning | Code | 0
Adaptive Temperature Based on Logits Correlation in Knowledge Distillation | Code | 0
Improving Sequential Recommendations via Bidirectional Temporal Data Augmentation with Pre-training | Code | 0
Continual Representation Learning for Biometric Identification | Code | 0
Continual Panoptic Perception: Towards Multi-modal Incremental Interpretation of Remote Sensing Images | Code | 0
Privacy Evaluation Benchmarks for NLP Models | Code | 0
Knowledge Grafting of Large Language Models | Code | 0
Knowledge Extraction with No Observable Data | Code | 0
Learning to Maximize Mutual Information for Chain-of-Thought Distillation | Code | 0
Knowledge Distillation with Reptile Meta-Learning for Pretrained Language Model Compression | Code | 0
Privacy in Practice: Private COVID-19 Detection in X-Ray Images (Extended Version) | Code | 0
Privacy-preserving Early Detection of Epileptic Seizures in Videos | Code | 0
The Curious Case of Hallucinations in Neural Machine Translation | Code | 0
Knowledge Distillation with Adversarial Samples Supporting Decision Boundary | Code | 0
Learning to "Segment Anything" in Thermal Infrared Images through Knowledge Distillation with a Large Scale Dataset SATIR | Code | 0
The Devil is in the Data: Learning Fair Graph Neural Networks via Partial Knowledge Distillation | Code | 0
SFT-KD-Recon: Learning a Student-friendly Teacher for Knowledge Distillation in Magnetic Resonance Image Reconstruction | Code | 0
SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification | Code | 0
Shape-intensity knowledge distillation for robust medical image segmentation | Code | 0
Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective | Code | 0
Assessor-Guided Learning for Continual Environments | Code | 0
Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices | Code | 0
Learning without Forgetting for 3D Point Cloud Objects | Code | 0
Are All Linear Regions Created Equal? | Code | 0
Vision Transformers for Small Histological Datasets Learned through Knowledge Distillation | Code | 0
Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games | Code | 0
Continual Knowledge Distillation for Neural Machine Translation | Code | 0
Leave No One Behind: Enhancing Diversity While Maintaining Accuracy in Social Recommendation | Code | 0
Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation | Code | 0
EaSyGuide: ESG Issue Identification Framework leveraging Abilities of Generative Large Language Models | Code | 0
LENAS: Learning-based Neural Architecture Search and Ensemble for 3D Radiotherapy Dose Prediction | Code | 0
Knowledge Distillation via Instance Relationship Graph | Code | 0
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems | Code | 0
Knowledge distillation to effectively attain both region-of-interest and global semantics from an image where multiple objects appear | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified
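
In the tables above, "T:" names the frozen teacher and "S:" the student being trained. A minimal training-step sketch of that setup, reusing the `distillation_loss` defined earlier; the ResNet-50/ResNet-18 pair and the optimizer settings are placeholders for illustration, not a recipe from any listed entry:

```python
import torch
from torchvision.models import resnet50, resnet18

teacher = resnet50(weights="IMAGENET1K_V2").eval()  # frozen, pretrained teacher
student = resnet18()                                # smaller student to train
optimizer = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)

def train_step(images, labels):
    with torch.no_grad():                # teacher only supplies soft targets
        teacher_logits = teacher(images)
    student_logits = student(images)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```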