
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models, such as very deep neural networks or ensembles of many models, have higher knowledge capacity than small models, this capacity may not be fully utilized; distillation trains the student to reproduce the teacher's behavior, often retaining most of its accuracy at a much lower inference cost.
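
To make the mechanism concrete, here is a minimal sketch of the classic logit-distillation loss (temperature-softened teacher targets mixed with the usual hard-label loss), written in PyTorch. The temperature T and mixing weight alpha below are illustrative defaults, not values taken from any paper listed on this page:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style knowledge distillation loss (illustrative sketch).

    Combines a soft term (KL divergence between temperature-softened
    teacher and student distributions) with a hard cross-entropy term
    on the ground-truth labels.
    """
    # Soft targets: KL(teacher || student) at temperature T, scaled by T^2
    # so its gradient magnitude stays comparable to the hard loss.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example: one batch of a 10-class problem, teacher logits precomputed.
student_logits = torch.randn(32, 10, requires_grad=True)
teacher_logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```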

Papers

Showing 3401–3450 of 4240 papers

NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation
No Forgetting Learning: Memory-free Continual Learning
Noise-Tolerant Few-Shot Unsupervised Adapter for Vision-Language Models
Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation
Noisy Neural Network Compression for Analog Storage Devices
Non-Autoregressive Sign Language Production via Knowledge Distillation
Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation
No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices
Normalized Feature Distillation for Semantic Segmentation
Not All Knowledge Is Created Equal: Mutual Distillation of Confident Knowledge
Not All Regions are Worthy to be Distilled: Region-aware Knowledge Distillation Towards Efficient Image-to-Image Translation
Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering
NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation
Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation
NVIDIA NeMo Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21
NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM
NYCU-TWO at Memotion 3: Good Foundation, Good Teacher, then you have Good Meme Analysis
oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes
Object-centric Cross-modal Feature Distillation for Event-based Object Detection
Object-Centric Diffusion for Efficient Video Editing
OccludeNeRF: Geometric-aware 3D Scene Inpainting with Collaborative Score Distillation in NeRF
Occlusion-Robust FAU Recognition by Mining Latent Space of Masked Autoencoders
Offline-to-Online Knowledge Distillation for Video Instance Segmentation
Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models
OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
On Accelerating Edge AI: Optimizing Resource-Constrained Environments
On Compressing U-net Using Knowledge Distillation
Deakin RF-Sensing: Experiments on Correlated Knowledge Distillation for Monitoring Human Postures with Radios
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
On Distilling the Displacement Knowledge for Few-Shot Class-Incremental Learning
One Category One Prompt: Dataset Distillation using Diffusion Models
One-Class Knowledge Distillation for Spoofing Speech Detection
On effects of Knowledge Distillation on Transfer Learning
One General Teacher for Multi-Data Multi-Task: A New Knowledge Distillation Framework for Discourse Relation Analysis
On Elastic Language Models
One-Shot Federated Learning for LEO Constellations that Reduces Convergence Time from Days to 90 Minutes
On Estimating the Training Cost of Conversational Recommendation Systems
One-stop Training of Multiple Capacity Models
One Student Knows All Experts Know: From Sparse to Dense
One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers
On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process
On Generalizing Beyond Domains in Cross-Domain Continual Learning
On Good Practices for Task-Specific Distillation of Large Pretrained Visual Models
On Knowledge Distillation for Direct Speech Translation
On Knowledge Distillation for Translating Erroneous Speech Transcriptions
On Knowledge distillation from complex networks for response prediction
Online Continual Learning For Visual Food Classification
Online Continual Learning via the Meta-learning Update with Multi-scale Knowledge Distillation and Data Augmentation
Online Cross-Layer Knowledge Distillation on Graph Neural Networks with Deep Supervision
Page 69 of 85

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified |
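
For reference, the Top-1 accuracy claimed above is the percentage of samples whose highest-scoring class prediction matches the ground-truth label; a minimal sketch, assuming PyTorch tensors of logits and integer labels:

```python
import torch

def top1_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Percentage of samples whose argmax prediction equals the label."""
    preds = logits.argmax(dim=-1)  # predicted class per sample
    return 100.0 * (preds == labels).float().mean().item()
```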
| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified |
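
The depth-estimation entry above reports RMSE (root-mean-square error, lower is better) between predicted and ground-truth depth maps; a minimal sketch, assuming PyTorch tensors of the same shape:

```python
import torch

def rmse(pred_depth: torch.Tensor, gt_depth: torch.Tensor) -> float:
    """Root-mean-square error between prediction and ground truth."""
    return torch.sqrt(torch.mean((pred_depth - gt_depth) ** 2)).item()
```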