SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity may not be fully utilized.
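
For context, the most common training objective behind many of the entries below is the soft-label formulation of Hinton et al. (2015): the student is trained to match the teacher's temperature-softened output distribution in addition to the usual hard-label loss. The snippet that follows is a minimal PyTorch sketch of that objective, not the exact loss of any paper listed on this page; the function name, the temperature of 4.0, and the 50/50 blending weight are illustrative choices.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-label distillation: blend hard-label cross-entropy with a
    KL term that matches the teacher's softened output distribution."""
    # Soften both output distributions with the temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between student and teacher; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature**2
    # Ordinary supervised loss on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Toy usage: 8 examples, 100 classes. In practice the teacher logits come from
# a frozen pretrained model and the student logits from the model being trained.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```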

Papers

Showing 2901–2950 of 4240 papers

Title | Status | Hype
Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | – | 0
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models | – | 0
Towards Long-Tailed Recognition for Graph Classification via Collaborative Experts | – | 0
Towards Making Deep Transfer Learning Never Hurt | – | 0
Towards Model Agnostic Federated Learning Using Knowledge Distillation | – | 0
Towards Non-task-specific Distillation of BERT via Sentence Representation Approximation | – | 0
Towards On-Board Panoptic Segmentation of Multispectral Satellite Images | – | 0
Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge | – | 0
Towards Oracle Knowledge Distillation with Neural Architecture Search | – | 0
Towards Personalized Federated Learning via Comprehensive Knowledge Distillation | – | 0
Towards Robust Classification with Image Quality Assessment | – | 0
Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach | – | 0
Towards Scalable and Generalizable Earth Observation Data Mining via Foundation Model Composition | – | 0
Towards Scalable & Efficient Interaction-Aware Planning in Autonomous Vehicles using Knowledge Distillation | – | 0
Towards Streaming Egocentric Action Anticipation | – | 0
SOCRATES: Text-based Human Search and Approach using a Robot Dog | – | 0
Towards Unconstrained 2D Pose Estimation of the Human Spine | – | 0
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning | – | 0
Towards Understanding Knowledge Distillation | – | 0
Do we need Label Regularization to Fine-tune Pre-trained Language Models? | – | 0
Towards Unsupervised Crowd Counting via Regression-Detection Bi-knowledge Transfer | – | 0
Towards Vector Optimization on Low-Dimensional Vector Symbolic Architecture | – | 0
Towards Zero-Shot Knowledge Distillation for Natural Language Processing | – | 0
Toxicity Detection can be Sensitive to the Conversational Context | – | 0
Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition | – | 0
Training an LLM-as-a-Judge Model: Pipeline, Insights, and Practical Lessons | – | 0
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights | – | 0
Training Self-localization Models for Unseen Unfamiliar Places via Teacher-to-Student Data-Free Knowledge Transfer | – | 0
Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks | – | 0
Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification | – | 0
TransFair: Transferring Fairness from Ocular Disease Classification to Progression Prediction | – | 0
Transferable Deployment of Semantic Edge Inference Systems via Unsupervised Domain Adaption | – | 0
Transfer Learning with Pre-trained Conditional Generative Models | – | 0
Transferring Knowledge from Structure-aware Self-attention Language Model to Sequence-to-Sequence Semantic Parsing | – | 0
Transferring Learning Trajectories of Neural Networks | – | 0
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation | – | 0
Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation | – | 0
Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI | – | 0
TransformMix: Learning Transformation and Mixing Strategies from Data | – | 0
Translate-Distill: Learning Cross-Language Dense Retrieval by Translation and Distillation | – | 0
Tree Knowledge Distillation for Compressing Transformer-Based Language Models | – | 0
Tree-Like Decision Distillation | – | 0
TriDeNT: Triple Deep Network Training for Privileged Knowledge Distillation in Histopathology | – | 0
Trigger is Not Sufficient: Exploiting Frame-aware Knowledge for Implicit Event Argument Extraction | – | 0
TRILLsson: Distilled Universal Paralinguistic Speech Representations | – | 0
Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning | – | 0
TripLe: Revisiting Pretrained Model Reuse and Progressive Learning for Efficient Vision Transformer Scaling and Searching | – | 0
Triplet Knowledge Distillation | – | 0
Triple-View Knowledge Distillation for Semi-Supervised Semantic Segmentation | – | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | – | Unverified