
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized, and evaluating a large model is costly regardless of how much of that capacity it actually uses. Distillation therefore trains a compact "student" model to reproduce the behavior of the large "teacher", typically by matching its softened output distribution, so that much of the teacher's accuracy is retained at a fraction of the inference cost.
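
The classic recipe (Hinton et al., 2015) optimizes a weighted sum of the usual cross-entropy on ground-truth labels and a KL-divergence term between temperature-softened teacher and student outputs. Below is a minimal PyTorch-style sketch of that loss; the temperature `T`, weight `alpha`, and the `teacher`/`student` modules are illustrative placeholders, not settings taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend the soft-target (teacher-matching) term with ordinary cross-entropy."""
    # Soften both output distributions with temperature T. The KL term is scaled
    # by T^2 so its gradients stay comparable in magnitude to the hard-label term.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    soft_loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# A training step would forward the same batch through both models, keep the
# teacher frozen, and backpropagate only through the student, e.g.:
#   with torch.no_grad():
#       teacher_logits = teacher(images)
#   loss = distillation_loss(student(images), teacher_logits, labels)
```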

Papers

Showing 4201–4240 of 4240 papers

Title | Status | Hype
Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning | Code | 0
Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification | Code | 0
Asymmetric Masked Distillation for Pre-Training Small Foundation Models | Code | 0
BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation | Code | 0
Enhance Language Identification using Dual-mode Model with Knowledge Distillation | Code | 0
Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation | Code | 0
Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data | Code | 0
Emulating Quantum Dynamics with Neural Networks via Knowledge Distillation | Code | 0
Co-Teaching for Unsupervised Domain Adaptation and Expansion | Code | 0
Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision | Code | 0
EGAD: Evolving Graph Representation Learning with Self-Attention and Knowledge Distillation for Live Video Streaming Events | Code | 0
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance | Code | 0
Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation | Code | 0
Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation | Code | 0
Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks | Code | 0
Efficient Sub-structured Knowledge Distillation | Code | 0
Semi-Online Knowledge Distillation | Code | 0
Temporal Action Proposal Generation With Action Frequency Adaptive Network | Code | 0
POS-Constrained Parallel Decoding for Non-autoregressive Generation | Code | 0
Active Object Detection with Knowledge Aggregation and Distillation from Large Models | Code | 0
Efficient Speech Translation through Model Compression and Knowledge Distillation | Code | 0
Training convolutional neural networks with cheap convolutions and online distillation | Code | 0
Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification | Code | 0
Correlation Congruence for Knowledge Distillation | Code | 0
Biomed-DPT: Dual Modality Prompt Tuning for Biomedical Vision-Language Models | Code | 0
PowerSkel: A Device-Free Framework Using CSI Signal for Human Skeleton Estimation in Power Station | Code | 0
A Diffusion Model and Knowledge Distillation Framework for Robust Coral Detection in Complex Underwater Environments | Code | 0
PP-ShiTu: A Practical Lightweight Image Recognition System | Code | 0
CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation | Code | 0
Cooperative Retriever and Ranker in Deep Recommenders | Code | 0
Cooperative Knowledge Distillation: A Learner Agnostic Approach | Code | 0
Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation | Code | 0
Asymmetrical Reciprocity-based Federated Learning for Resolving Disparities in Medical Diagnosis | Code | 0
Efficient Multitask Dense Predictor via Binarization | Code | 0
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition | Code | 0
Sentence Embedder Guided Utterance Encoder (SEGUE) for Spoken Language Understanding | Code | 0
Evolving Knowledge Mining for Class Incremental Segmentation | Code | 0
PreFallKD: Pre-Impact Fall Detection via CNN-ViT Knowledge Distillation | Code | 0
Efficient Lung Ultrasound Severity Scoring Using Dedicated Feature Extractor | Code | 0
Cooperative Classification and Rationalization for Graph Generalization | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | | Unverified