SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity may not be fully utilized, so a well-trained student can often recover much of the teacher's accuracy at a fraction of the inference cost.
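
As a concrete illustration, below is a minimal sketch of the classic logit-matching objective (Hinton et al., 2015) in PyTorch. It is not the method of any particular paper listed on this page, and the temperature and weighting values are illustrative placeholders, not recommended settings.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and the usual hard-label
    cross-entropy, following the classic logit-matching formulation."""
    # Soften both output distributions with the same temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

During training, the teacher runs in eval mode with gradients disabled, and only the student's parameters are updated against this loss.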

Papers

Showing 1001–1050 of 4240 papers (page 21 of 85)

All entries on this page currently show a Hype score of 0 and no verification status.

EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models
Addressing Bias Through Ensemble Learning and Regularized Fine-Tuning
DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser
Towards Complementary Knowledge Distillation for Efficient Dense Image Prediction
Diffusion-Augmented Coreset Expansion for Scalable Dataset Distillation
Disentanglement, Visualization and Analysis of Complex Features in DNNs
Improving Neural Ranking via Lossless Knowledge Distillation
An Efficient Federated Distillation Learning System for Multi-task Time Series Classification
DualDE: Dually Distilling Knowledge Graph Embedding for Faster and Cheaper Reasoning
Bridging the gap between Human Action Recognition and Online Action Detection
Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation
Distill and De-bias: Mitigating Bias in Face Verification using Knowledge Distillation
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
A Cohesive Distillation Architecture for Neural Language Models
Differentiable Feature Aggregation Search for Knowledge Distillation
DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech
Knowledge Distillation Decision Tree for Unravelling Black-box Machine Learning Models
Distillation-Enabled Knowledge Alignment for Generative Semantic Communications in AIGC Provisioning Tasks
Distillation-Enhanced Physical Adversarial Attacks
ECAT: A Entire space Continual and Adaptive Transfer Learning Framework for Cross-Domain Recommendation
StableMamba: Distillation-free Scaling of Large SSMs for Images and Videos
Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
DFM: Dialogue Foundation Model for Universal Large-Scale Dialogue-Oriented Task Learning
Bootstrapped Representation Learning for Skeleton-Based Action Recognition
An Efficient Detection and Control System for Underwater Docking using Machine Learning and Realistic Simulation: A Comprehensive Approach
Dialect Identification through Adversarial Learning and Knowledge Distillation on Romanian BERT
DiagrammaticLearning: A Graphical Language for Compositional Training Regimes
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping
DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
An Efficient Active Learning Pipeline for Legal Text Classification
DistillGrasp: Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects
DFMSD: Dual Feature Masking Stage-wise Knowledge Distillation for Object Detection
An Effective Deep Network for Head Pose Estimation without Keypoints
DeViT: Decomposing Vision Transformers for Collaborative Inference in Edge Devices
Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models
Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation
Deep Face Recognition Model Compression via Knowledge Transfer and Distillation
Developing Multi-Task Recommendations with Long-Term Rewards via Policy Distilled Reinforcement Learning
DETRDistill: A Universal Knowledge Distillation Framework for DETR-families
Detecting Optimism in Tweets using Knowledge Distillation and Linguistic Analysis of Optimism
Analyzing the Importance of Blank for CTC-Based Knowledge Distillation
Dynamic Y-KD: A Hybrid Approach to Continual Instance Segmentation
EasyDistill: A Comprehensive Toolkit for Effective Knowledge Distillation of Large Language Models
EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing
EchoLM: Accelerating LLM Serving with Real-time Knowledge Distillation
Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation
Designing Parameter and Compute Efficient Diffusion Transformers using Distillation
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement
Designing an Improved Deep Learning-based Model for COVID-19 Recognition in Chest X-ray Images: A Knowledge Distillation Approach

Benchmark Results
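
In the tables below, "T:" and "S:" identify the teacher and student models of each distillation pairing. "Claimed" is the metric value reported by the authors; the "Verified" column remains empty while a result's status is Unverified.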

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |
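
For reference, re-measuring a claimed Top-1 accuracy (the kind of independent check the Verified column tracks) is a standard evaluation pass. Here is a minimal sketch, assuming a pretrained student `model` and a labeled evaluation `loader`; both names are placeholders, not part of any paper's code.

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cuda"):
    """Percentage of samples whose argmax prediction matches the label."""
    model.eval().to(device)
    correct = total = 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return 100.0 * correct / total
```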