SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity may not be fully utilized; a compact student trained to mimic the teacher's outputs can therefore often recover much of the teacher's accuracy at a fraction of the computational cost.
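
In the classic soft-target formulation (Hinton et al., 2015), the student is trained on a weighted sum of the usual cross-entropy loss on ground-truth labels and a KL-divergence term that pulls its temperature-softened class probabilities toward the teacher's. A minimal PyTorch sketch of that loss follows; the temperature T, the weight alpha, and the random tensors in the usage stub are illustrative assumptions, not values drawn from any paper or benchmark listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target knowledge distillation loss (Hinton-style sketch).

    student_logits, teacher_logits: (batch, num_classes) raw logits
    labels: (batch,) ground-truth class indices
    T: temperature used to soften both distributions (illustrative value)
    alpha: weight on the distillation term vs. the hard-label term (illustrative)
    """
    # KL divergence between temperature-softened teacher and student distributions.
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    soft_student = F.log_softmax(student_logits / T, dim=1)
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    kd_term = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)

    # Ordinary cross-entropy against the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term


if __name__ == "__main__":
    # Random tensors stand in for real teacher/student outputs on a batch of 8 inputs.
    student_logits = torch.randn(8, 100, requires_grad=True)
    teacher_logits = torch.randn(8, 100)
    labels = torch.randint(0, 100, (8,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(f"combined KD loss: {loss.item():.4f}")
```

Many of the feature-based and relational variants in the paper list below replace or augment the KL term with distances between intermediate representations, but the overall teacher/student training loop has the same shape.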

Papers

Showing 2301–2350 of 4240 papers

Title | Hype
Toward Efficient Deep Spiking Neuron Networks: A Survey On Compression | 0
Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation | 0
Toward Model-centric Heterogeneous Federated Graph Learning: A Knowledge-driven Approach | 0
Toward Multiple Specialty Learners for Explaining GNNs via Online Knowledge Distillation | 0
Towards a better understanding of Vector Quantized Autoencoders | 0
Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need | 0
Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval | 0
Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text | 0
Towards a Unified View of Affinity-Based Knowledge Distillation | 0
Towards a Universal Continuous Knowledge Base | 0
Towards Better Query Classification with Multi-Expert Knowledge Condensation in JD Ads Search | 0
Reconsidering Learning Objectives in Unbiased Recommendation with Unobserved Confounders | 0
Towards Building Secure UAV Navigation with FHE-aware Knowledge Distillation | 0
Towards Collaborative Fairness in Federated Learning Under Imbalanced Covariate Shift | 0
Towards Comparable Knowledge Distillation in Semantic Image Segmentation | 0
Towards Cross-modality Medical Image Segmentation with Online Mutual Knowledge Distillation | 0
Towards Developing a Multilingual and Code-Mixed Visual Question Answering System by Knowledge Distillation | 0
Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation | 0
Towards Efficient Task-Driven Model Reprogramming with Foundation Models | 0
Towards Explaining Autonomy with Verbalised Decision Tree States | 0
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis | 0
Towards Few-Call Model Stealing via Active Self-Paced Knowledge Distillation and Diffusion-Based Image Generation | 0
Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation | 0
Towards Full Utilization on Mask Task for Distilling PLMs into NMT | 0
Towards General and Fast Video Derain via Knowledge Distillation | 0
CAM-loss: Towards Learning Spatially Discriminative Feature Representations | 0
Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | 0
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models | 0
Towards Long-Tailed Recognition for Graph Classification via Collaborative Experts | 0
Towards Making Deep Transfer Learning Never Hurt | 0
Towards Model Agnostic Federated Learning Using Knowledge Distillation | 0
Towards Non-task-specific Distillation of BERT via Sentence Representation Approximation | 0
Towards On-Board Panoptic Segmentation of Multispectral Satellite Images | 0
Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge | 0
Towards Oracle Knowledge Distillation with Neural Architecture Search | 0
Towards Personalized Federated Learning via Comprehensive Knowledge Distillation | 0
Towards Robust Classification with Image Quality Assessment | 0
Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach | 0
Towards Scalable and Generalizable Earth Observation Data Mining via Foundation Model Composition | 0
Towards Scalable & Efficient Interaction-Aware Planning in Autonomous Vehicles using Knowledge Distillation | 0
Towards Streaming Egocentric Action Anticipation | 0
SOCRATES: Text-based Human Search and Approach using a Robot Dog | 0
Towards Unconstrained 2D Pose Estimation of the Human Spine | 0
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning | 0
Towards Understanding Knowledge Distillation | 0
Do we need Label Regularization to Fine-tune Pre-trained Language Models? | 0
Towards Unsupervised Crowd Counting via Regression-Detection Bi-knowledge Transfer | 0
Towards Vector Optimization on Low-Dimensional Vector Symbolic Architecture | 0
Towards Zero-Shot Knowledge Distillation for Natural Language Processing | 0
Toxicity Detection can be Sensitive to the Conversational Context | 0
Page 47 of 85

Benchmark Results

In each table below, T: denotes the teacher model and S: the student model used for distillation.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified