Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
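
As a rough illustration of how such a transfer can be set up, the sketch below computes one common formulation of a distillation objective: a KL-divergence term between temperature-softened teacher and student outputs, mixed with the usual cross-entropy on the true label. The page itself does not specify a loss, so the soft-target formulation, the function names, and the `temperature` and `alpha` parameters are all assumptions made for illustration.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits, after dividing by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_sum = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum for z in scaled]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Illustrative soft-target distillation loss (hypothetical helper, not from the source).

    alpha weights the teacher-matching term; (1 - alpha) weights the hard-label term.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))
    # Standard cross-entropy against the ground-truth label (temperature 1)
    ce = -log_softmax(student_logits)[label]
    # temperature**2 keeps the soft-term gradients on a comparable scale
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# A student that disagrees with the teacher incurs a larger loss
# than one whose logits match the teacher exactly.
teacher = [0.1, 0.2, 5.0]
print(distillation_loss([3.0, 1.0, 0.2], teacher, label=2))
print(distillation_loss(teacher, teacher, label=2))
```

A higher `temperature` flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to incorrect classes; in practice both values are tuned per task.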

Papers

Showing 2251–2300 of 4240 papers

Title	Date	Tasks	Status
Boosting Summarization with Normalizing Flows and Aggressive Training	Nov 1, 2023	Decoder, Knowledge Distillation	Code Available
Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision	Oct 31, 2023	Informativeness, Knowledge Distillation	Unverified
AMLNet: Adversarial Mutual Learning Neural Network for Non-AutoRegressive Multi-Horizon Time Series Forecasting	Oct 30, 2023	Decoder, Diversity	Code Available
MUST: A Multilingual Student-Teacher Learning approach for low-resource speech recognition	Oct 29, 2023	Knowledge Distillation, speech-recognition	Unverified
RCKD: Response-Based Cross-Task Knowledge Distillation for Pathological Image Analysis	Oct 29, 2023	Image Classification, Knowledge Distillation	Unverified
Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation	Oct 29, 2023	Diversity, Evolutionary Algorithms	Unverified
ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection	Oct 28, 2023	3D Object Detection, Autonomous Driving	Code Available
Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation	Oct 28, 2023	Knowledge Distillation, Object	Unverified
Discourse Structures Guided Fine-grained Propaganda Identification	Oct 28, 2023	Attribute, Knowledge Distillation	Code Available
Towards a Unified Conversational Recommendation System: Multi-task Learning via Contextualized Knowledge Distillation	Oct 27, 2023	Conversational Recommendation, Diversity	Code Available
Multi-label Emotion Analysis in Conversation via Multimodal Knowledge Distillation	Oct 27, 2023	Emotion Recognition, Knowledge Distillation	Unverified
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP	Oct 26, 2023	image-classification, Image Classification	Unverified
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model	Oct 26, 2023	Data Augmentation, General Knowledge	Code Available
SonoSAMTrack -- Segment and Track Anything on Ultrasound Images	Oct 25, 2023	Knowledge Distillation	Unverified
TOP-Training: Target-Oriented Pretraining for Medical Extractive Question Answering	Oct 25, 2023	Domain Adaptation, Extractive Question-Answering	Code Available
Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data	Oct 24, 2023	Data-free Knowledge Distillation, Knowledge Distillation	Code Available
Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos	Oct 24, 2023	class-incremental learning, Class Incremental Learning	Unverified
ABKD: Graph Neural Network Compression with Attention-Based Knowledge Distillation	Oct 24, 2023	Drug Discovery, Fake News Detection	Unverified
MCC-KD: Multi-CoT Consistent Knowledge Distillation	Oct 23, 2023	Diversity, Knowledge Distillation	Code Available
Leveraging Complementary Attention maps in vision transformers for OCT image analysis	Oct 21, 2023	Knowledge Distillation	Unverified
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images	Oct 20, 2023	Data Augmentation, Data-free Knowledge Distillation	Unverified
DistillCSE: Distilled Contrastive Learning for Sentence Embeddings	Oct 20, 2023	Contrastive Learning, Knowledge Distillation	Code Available
GenDistiller: Distilling Pre-trained Language Models based on Generative Models	Oct 20, 2023	Knowledge Distillation, Language Modeling	Unverified
Enhancing Abstractiveness of Summarization Models through Calibrated Distillation	Oct 20, 2023	Abstractive Text Summarization, Informativeness	Unverified
Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments	Oct 16, 2023	Decision Making, Deep Reinforcement Learning	Code Available
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models	Oct 13, 2023	Knowledge Distillation	Unverified
Revisiting Multi-modal 3D Semantic Segmentation in Real-world Autonomous Driving	Oct 13, 2023	3D Semantic Segmentation, Autonomous Driving	Unverified
DistillSpec: Improving Speculative Decoding via Knowledge Distillation	Oct 12, 2023	Knowledge Distillation, Language Modelling	Unverified
Retrieve Anything To Augment Large Language Models	Oct 11, 2023	Knowledge Distillation, Retrieval	Unverified
Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation	Oct 11, 2023	Knowledge Distillation, Semantic Segmentation	Unverified
Distillation Improves Visual Place Recognition for Low Quality Images	Oct 10, 2023	Knowledge Distillation, Quantization	Code Available
Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data	Oct 10, 2023	Knowledge Distillation	Code Available
Knowledge Distillation for Anomaly Detection	Oct 9, 2023	Anomaly Detection, Knowledge Distillation	Unverified
What do larger image classifiers memorise?	Oct 9, 2023	image-classification, Image Classification	Unverified
Applying Knowledge Distillation to Improve Weed Mapping With Drones	Oct 8, 2023	Knowledge Distillation, Management	Code Available
Fair Feature Importance Scores for Interpreting Tree-Based Methods and Surrogates	Oct 6, 2023	Fairness, Feature Importance	Unverified
DED: Diagnostic Evidence Distillation for acne severity grading on face images	Oct 5, 2023	Acne Severity Grading, Diagnostic	Code Available
Improving Knowledge Distillation with Teacher's Explanation	Oct 4, 2023	Knowledge Distillation	Unverified
Talking Models: Distill Pre-trained Knowledge to Downstream Models via Interactive Communication	Oct 4, 2023	Decoder, Knowledge Distillation	Unverified
I^2KD-SLU: An Intra-Inter Knowledge Distillation Framework for Zero-Shot Cross-Lingual Spoken Language Understanding	Oct 4, 2023	Intent Detection, Knowledge Distillation	Unverified
Heterogeneous Federated Learning Using Knowledge Codistillation	Oct 4, 2023	Federated Learning, image-classification	Unverified
Can a student Large Language Model perform as well as it's teacher?	Oct 3, 2023	Knowledge Distillation, Language Modeling	Unverified
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models	Oct 2, 2023	Knowledge Distillation, Language Modelling	Unverified
KGEx: Explaining Knowledge Graph Embeddings via Subgraph Sampling and Knowledge Distillation	Oct 2, 2023	Knowledge Distillation, Knowledge Graph Embeddings	Unverified
Learnable Cross-modal Knowledge Distillation for Multi-modal Learning with Missing Modality	Oct 2, 2023	Knowledge Distillation	Unverified
Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation	Oct 2, 2023	counterfactual, Knowledge Distillation	Unverified
Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks	Oct 2, 2023	Knowledge Distillation, Node Classification	Code Available
Adaptive Decoupled Pose Knowledge Distillation	Oct 1, 2023	Knowledge Distillation, Pose Estimation	Code Available
Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression	Sep 30, 2023	Inductive Bias, Knowledge Distillation	Unverified
Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation	Sep 29, 2023	Cross-Lingual Question Answering, Cross-Lingual Transfer	Code Available

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified