Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
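
The text above does not say how the transfer is done. As a minimal sketch of one common formulation (an assumption here, not something this page specifies), the student can be trained to match the teacher's temperature-softened output distribution alongside the true label. The function names and the `temperature` and `alpha` parameters are illustrative choices.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    temperature and alpha are hypothetical defaults chosen for illustration.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # Cross-entropy of the student's unsoftened prediction against the true label
    ce = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable when the temperature changes
    return alpha * (temperature ** 2) * kl + (1 - alpha) * ce

# Example: a student whose logits disagree with the teacher's
loss = distillation_loss([0.5, 1.0, 0.2], [0.1, 3.0, 0.3], label=1)
```

A higher temperature flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to incorrect classes. `alpha` trades that signal off against the ground-truth label.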

Papers

Showing 2251–2275 of 4240 papers

Title	Date	Tasks	Status
Boosting Summarization with Normalizing Flows and Aggressive Training	Nov 1, 2023	Decoder, Knowledge Distillation	Code Available
Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision	Oct 31, 2023	Informativeness, Knowledge Distillation	Unverified
AMLNet: Adversarial Mutual Learning Neural Network for Non-AutoRegressive Multi-Horizon Time Series Forecasting	Oct 30, 2023	Decoder, Diversity	Code Available
MUST: A Multilingual Student-Teacher Learning approach for low-resource speech recognition	Oct 29, 2023	Knowledge Distillation, speech-recognition	Unverified
RCKD: Response-Based Cross-Task Knowledge Distillation for Pathological Image Analysis	Oct 29, 2023	Image Classification, Knowledge Distillation	Unverified
Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation	Oct 29, 2023	Diversity, Evolutionary Algorithms	Unverified
ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection	Oct 28, 2023	3D Object Detection, Autonomous Driving	Code Available
Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation	Oct 28, 2023	Knowledge Distillation, Object	Unverified
Discourse Structures Guided Fine-grained Propaganda Identification	Oct 28, 2023	Attribute, Knowledge Distillation	Code Available
Towards a Unified Conversational Recommendation System: Multi-task Learning via Contextualized Knowledge Distillation	Oct 27, 2023	Conversational Recommendation, Diversity	Code Available
Multi-label Emotion Analysis in Conversation via Multimodal Knowledge Distillation	Oct 27, 2023	Emotion Recognition, Knowledge Distillation	Unverified
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP	Oct 26, 2023	image-classification, Image Classification	Unverified
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model	Oct 26, 2023	Data Augmentation, General Knowledge	Code Available
SonoSAMTrack -- Segment and Track Anything on Ultrasound Images	Oct 25, 2023	Knowledge Distillation	Unverified
TOP-Training: Target-Oriented Pretraining for Medical Extractive Question Answering	Oct 25, 2023	Domain Adaptation, Extractive Question-Answering	Code Available
Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data	Oct 24, 2023	Data-free Knowledge Distillation, Knowledge Distillation	Code Available
Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos	Oct 24, 2023	class-incremental learning, Class Incremental Learning	Unverified
ABKD: Graph Neural Network Compression with Attention-Based Knowledge Distillation	Oct 24, 2023	Drug Discovery, Fake News Detection	Unverified
MCC-KD: Multi-CoT Consistent Knowledge Distillation	Oct 23, 2023	Diversity, Knowledge Distillation	Code Available
Leveraging Complementary Attention maps in vision transformers for OCT image analysis	Oct 21, 2023	Knowledge Distillation	Unverified
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images	Oct 20, 2023	Data Augmentation, Data-free Knowledge Distillation	Unverified
DistillCSE: Distilled Contrastive Learning for Sentence Embeddings	Oct 20, 2023	Contrastive Learning, Knowledge Distillation	Code Available
GenDistiller: Distilling Pre-trained Language Models based on Generative Models	Oct 20, 2023	Knowledge Distillation, Language Modeling	Unverified
Enhancing Abstractiveness of Summarization Models through Calibrated Distillation	Oct 20, 2023	Abstractive Text Summarization, Informativeness	Unverified
Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments	Oct 16, 2023	Decision Making, Deep Reinforcement Learning	Code Available


Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified