SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity is often not fully utilized, so a compact student trained to mimic the teacher's outputs can frequently recover much of the teacher's accuracy at a fraction of the inference cost.
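As a concrete illustration, here is a minimal sketch of the classic soft-target formulation: the student is trained on a temperature-softened KL term against the teacher's logits plus the usual hard-label loss. This assumes PyTorch; the temperature, loss weighting, and the commented training step are illustrative and not taken from any specific paper listed below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-target distillation loss combined with hard-label cross-entropy.

    T (temperature) and alpha (loss weight) are illustrative hyperparameters.
    """
    # Soften both distributions with temperature T; the T**2 factor keeps the
    # gradient scale of the soft term comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Hypothetical training step: the teacher is frozen, only the student is updated.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
# loss.backward(); optimizer.step()
```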

Papers

Showing 1301–1350 of 4240 papers

Title | Status | Hype
GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation | Code | 1
On Good Practices for Task-Specific Distillation of Large Pretrained Visual Models | - | 0
Knowledge Distillation Based on Transformed Teacher Matching | Code | 1
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation | Code | 4
Incremental Sequence Labeling: A Tale of Two Shifts | Code | 2
Cultural Commonsense Knowledge for Intercultural Dialogues | - | 0
FedD2S: Personalized Data-Free Federated Knowledge Distillation | - | 0
Distilled Gradual Pruning with Pruned Fine-tuning | Code | 0
Walsh-domain Neural Network for Power Amplifier Behavioral Modelling and Digital Predistortion | - | 0
Model Compression and Efficient Inference for Large Language Models: A Survey | - | 0
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models | Code | 0
Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies | - | 0
FedSiKD: Clients Similarity and Knowledge Distillation: Addressing Non-i.i.d. and Constraints in Federated Learning | Code | 0
Integrating ChatGPT into Secure Hospital Networks: A Case Study on Improving Radiology Report Analysis | - | 0
APALU: A Trainable, Adaptive Activation Function for Deep Learning Networks | - | 0
Two-Stage Multi-task Self-Supervised Learning for Medical Image Segmentation | - | 0
Training Heterogeneous Client Models using Knowledge Distillation in Serverless Federated Learning | Code | 1
Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance | Code | 0
Embedding Compression for Teacher-to-Student Knowledge Transfer | - | 0
Multi-source-free Domain Adaptation via Uncertainty-aware Adaptive Distillation | Code | 0
Large Language Model Meets Graph Neural Network in Knowledge Distillation | - | 0
Knowledge Distillation for Road Detection based on cross-model Semi-Supervised Learning | - | 0
Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation | Code | 0
EfficientViT-SAM: Accelerated Segment Anything Model Without Accuracy Loss | - | 0
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models | Code | 1
DistiLLM: Towards Streamlined Distillation for Large Language Models | Code | 3
A Survey on Transformer Compression | - | 0
Large Language Model Distilling Medication Recommendation Model | Code | 1
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation | Code | 1
Dual Knowledge Distillation for Efficient Sound Event Detection | - | 0
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation | Code | 1
LQER: Low-Rank Quantization Error Reconstruction for LLMs | Code | 1
Cooperative Knowledge Distillation: A Learner Agnostic Approach | Code | 0
Bi-CryptoNets: Leveraging Different-Level Privacy for Encrypted Inference | - | 0
Spiking CenterNet: A Distillation-boosted Spiking Neural Network for Object Detection | - | 0
Class incremental learning with probability dampening and cascaded gated classifier | Code | 0
Faster Inference of Integer SWIN Transformer by Removing the GELU Activation | - | 0
Addressing Bias Through Ensemble Learning and Regularized Fine-Tuning | - | 0
Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly Detection | - | 0
Augmenting Offline Reinforcement Learning with State-only Interactions | - | 0
Scavenging Hyena: Distilling Transformers into Long Convolution Models | - | 0
EPSD: Early Pruning with Self-Distillation for Efficient Model Compression | - | 0
LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation | Code | 2
Stolen Subwords: Importance of Vocabularies for Machine Translation Model Stealing | Code | 0
TQCompressor: improving tensor decomposition methods in neural networks via permutations | Code | 0
Face to Cartoon Incremental Super-Resolution using Knowledge Distillation | - | 0
Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks | - | 0
Distilling Privileged Multimodal Information for Expression Recognition using Optimal Transport | - | 0
A Comprehensive Survey of Compression Algorithms for Language Models | - | 0
Large Language Model Guided Knowledge Distillation for Time Series Anomaly Detection | - | 0
Page 27 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified