
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity may not be fully utilized, so a distilled student can often retain much of the teacher's accuracy at a fraction of the inference cost.
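
The classic soft-target formulation (the "T:" teacher / "S:" student pairs in the benchmark tables below follow this convention) trains the student to match the teacher's temperature-softened output distribution alongside the ordinary hard-label loss. Below is a minimal PyTorch sketch of that objective; the function name and the hyperparameters T and alpha are illustrative assumptions, not taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend the soft-target KL term with the usual cross-entropy.

    T is the softening temperature; alpha weights the distillation
    term against the hard-label term. Both are illustrative defaults.
    """
    # Soften both distributions with temperature T; the T**2 factor
    # keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    # Toy check with random logits; in practice teacher_logits come
    # from a frozen teacher and are detached from the graph.
    s = torch.randn(8, 100)            # student logits
    t = torch.randn(8, 100).detach()   # stand-in teacher logits
    y = torch.randint(0, 100, (8,))    # hard labels
    print(distillation_loss(s, t, y).item())
```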

Papers

Showing 901–950 of 4240 papers

Title | Status | Hype
Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study | Code | 0
Reprogramming Distillation for Medical Foundation Models | Code | 0
Less is More: Efficient Brain-Inspired Learning for Autonomous Driving Trajectory Prediction | - | 0
DεpS: Delayed ε-Shrinking for Faster Once-For-All Training | - | 0
Leveraging Topological Guidance for Improved Knowledge Distillation | Code | 0
Topological Persistence Guided Knowledge Distillation for Wearable Sensor Data | - | 0
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models | Code | 2
Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients | - | 0
Improving Knowledge Distillation in Transfer Learning with Layer-wise Learning Rates | - | 0
Understanding the Gains from Repeated Self-Distillation | - | 0
AMD: Automatic Multi-step Distillation of Large-scale Vision Models | - | 0
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners | Code | 1
DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment | Code | 0
Relative Difficulty Distillation for Semantic Segmentation | Code | 0
Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners | - | 0
Edge AI-Enabled Chicken Health Detection Based on Enhanced FCOS-Lite and Knowledge Distillation | - | 0
Supporting Cross-language Cross-project Bug Localization Using Pre-trained Language Models | - | 0
MLKD-BERT: Multi-level Knowledge Distillation for Pre-trained Language Models | - | 0
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment | - | 0
Unified Anomaly Detection methods on Edge Device using Knowledge Distillation and Quantization | - | 0
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation | Code | 2
Accelerated Proton Resonance Frequency-based Magnetic Resonance Thermometry by Optimized Deep Learning Method | Code | 0
A Unified Framework for 3D Scene Understanding | Code | 2
Advancing Compressed Video Action Recognition through Progressive Knowledge Distillation | Code | 0
ECAT: A Entire space Continual and Adaptive Transfer Learning Framework for Cross-Domain Recommendation | - | 0
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection | Code | 0
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application | - | 0
Self-Cooperation Knowledge Distillation for Novel Class Discovery | - | 0
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes | Code | 0
AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition | Code | 1
BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models Personalization | - | 0
FANFOLD: Graph Normalizing Flows-driven Asymmetric Network for Unsupervised Graph-Level Anomaly Detection | Code | 0
Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization | - | 0
CSAKD: Knowledge Distillation with Cross Self-Attention for Hyperspectral and Multispectral Image Fusion | Code | 1
MuGSI: Distilling GNNs with Multi-Granularity Structural Information for Graph Classification | Code | 0
Direct Preference Knowledge Distillation for Large Language Models | - | 0
Instance Temperature Knowledge Distillation | Code | 0
Aligning Teacher with Student Preferences for Tailored Training Data Generation | - | 0
On Reducing Activity with Distillation and Regularization for Energy Efficient Spiking Neural Networks | - | 0
ConStyle v2: A Strong Prompter for All-in-One Image Restoration | Code | 1
Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge | - | 0
Highly Constrained Coded Aperture Imaging Systems Design Via a Knowledge Distillation Approach | - | 0
Sequential Editing for Lifelong Training of Speech Recognition Models | - | 0
Preserving Node Distinctness in Graph Autoencoders via Similarity Distillation | - | 0
Knowledge Distillation in Automated Annotation: Supervised Text Classification with LLM-Generated Training Labels | - | 0
MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation | Code | 1
Dual-Space Knowledge Distillation for Large Language Models | Code | 2
InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation | - | 0
Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition | Code | 1
WAVE: Weight Template for Adaptive Initialization of Variable-sized Models | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified