SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
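In its classic form (Hinton et al., 2015), distillation trains the student on a weighted mix of the usual hard-label cross-entropy and a "soft target" term that matches the student's temperature-softened output distribution to the teacher's. Below is a minimal PyTorch sketch of that loss; the temperature `T` and mixing weight `alpha` are illustrative placeholders, not values taken from any paper on this page:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target distillation loss (Hinton et al., 2015) -- a sketch."""
    # KL divergence between temperature-softened teacher and student
    # distributions; the T**2 factor keeps gradient magnitudes roughly
    # constant as the temperature changes.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Ordinary cross-entropy against the ground-truth hard labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random tensors (batch of 8, 100 classes):
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```

A higher temperature spreads probability mass over the incorrect classes, exposing the inter-class similarity structure the teacher has learned ("dark knowledge") that one-hot labels discard.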

Papers

Showing 1851–1900 of 4240 papers

| Title | Status | Hype |
| --- | --- | --- |
| Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning | Code | 1 |
| GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model | Code | 1 |
| EaSyGuide : ESG Issue Identification Framework leveraging Abilities of Generative Large Language Models | Code | 0 |
| Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method | Code | 1 |
| Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition |  | 0 |
| RankFormer: Listwise Learning-to-Rank Using Listwide Labels | Code | 1 |
| The economic trade-offs of large language models: A case study |  | 0 |
| BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping |  | 0 |
| Population-Based Evolutionary Gaming for Unsupervised Person Re-identification |  | 0 |
| Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks | Code | 1 |
| Faithful Knowledge Distillation |  | 0 |
| Model-Based Reinforcement Learning with Multi-Task Offline Pretraining | Code | 0 |
| Orca: Progressive Learning from Complex Explanation Traces of GPT-4 | Code | 1 |
| Zero shot framework for satellite image restoration |  | 0 |
| Joint Pre-training and Local Re-training: Transferable Representation Learning on Multi-source Knowledge Graphs | Code | 0 |
| I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval | Code | 1 |
| Revisiting Data-Free Knowledge Distillation with Poisoned Teachers | Code | 1 |
| Modular Transformers: Compressing Transformers into Modularized Layers for Flexible Efficient Inference |  | 0 |
| Evolving Knowledge Mining for Class Incremental Segmentation | Code | 0 |
| Deep Classifier Mimicry without Data Access | Code | 0 |
| Group channel pruning and spatial attention distilling for object detection |  | 0 |
| Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models |  | 0 |
| Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 |  | 0 |
| Improved Cross-Lingual Transfer Learning For Automatic Speech Translation |  | 0 |
| Teacher Agent: A Knowledge Distillation-Free Framework for Rehearsal-based Video Incremental Learning | Code | 0 |
| Accurate and Structured Pruning for Efficient Automatic Speech Recognition |  | 0 |
| Graph Entropy Minimization for Semi-supervised Node Classification | Code | 0 |
| PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning | Code | 1 |
| KEYword based Sampling (KEYS) for Large Language Models |  | 0 |
| Are Large Kernels Better Teachers than Transformers for ConvNets? | Code | 2 |
| Research on Multilingual News Clustering Based on Cross-Language Word Embeddings |  | 0 |
| Semi-supervised Pathological Image Segmentation via Cross Distillation of Multiple Attentions | Code | 1 |
| A Recipe for Efficient SBIR Models: Combining Relative Triplet Loss with Batch Normalization and Knowledge Distillation |  | 0 |
| Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective | Code | 0 |
| GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking |  | 0 |
| Learning to Learn from APIs: Black-Box Data-Free Meta-Learning | Code | 1 |
| DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models | Code | 1 |
| ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval |  | 0 |
| Towards Better Entity Linking with Multi-View Enhanced Distillation | Code | 1 |
| FoPro-KD: Fourier Prompted Effective Knowledge Distillation for Long-Tailed Medical Image Recognition | Code | 1 |
| One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification | Code | 1 |
| Knowledge Distillation Performs Partial Variance Reduction | Code | 0 |
| Vision Transformers for Small Histological Datasets Learned through Knowledge Distillation | Code | 0 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | Code | 1 |
| ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression |  | 0 |
| A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models |  | 0 |
| Knowledge Diffusion for Distillation | Code | 1 |
| Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages |  | 0 |
| OVO: Open-Vocabulary Occupancy | Code | 1 |
| On the Impact of Knowledge Distillation for Model Interpretability |  | 0 |
Page 38 of 85

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 |  | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 |  | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 |  | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 |  | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 |  | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 |  | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 |  | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 |  | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 |  | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 |  | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 |  | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 |  | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 |  | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 |  | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 |  | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 |  | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 |  | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 |  | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 |  | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 |  | Unverified |