
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity is often not fully utilized, so a compact student trained to mimic the large teacher can recover much of its accuracy at a fraction of the inference cost.
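
As a concrete illustration, the classic response-based (soft-target) formulation trains the student to match the teacher's temperature-softened output distribution while also fitting the hard labels. The sketch below is a minimal PyTorch version of that loss; the function name and the example values of T and alpha are illustrative assumptions, not taken from any specific paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target knowledge distillation loss (Hinton-style), illustrative sketch.

    Blends the KL divergence between temperature-softened teacher and student
    distributions with the ordinary cross-entropy on the ground-truth labels.
    """
    # Soften both output distributions with temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)

    # The KL term is scaled by T^2 so its gradient magnitude stays comparable
    # to the cross-entropy term as the temperature changes.
    kd_term = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)

    # Hard-label cross-entropy keeps the student anchored to the true labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```

In a typical training loop, the teacher's logits are computed under torch.no_grad() and only the student's parameters are updated with this combined loss.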

Papers

Showing 2451–2500 of 4240 papers (page 50 of 85)

Title | Status | Hype
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation | Code | 1
SaiT: Sparse Vision Transformers through Adaptive Token Pruning | Code | 0
Linkless Link Prediction via Relational Distillation | - | 0
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR | - | 0
Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data | Code | 0
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval | Code | 1
APSNet: Attention Based Point Cloud Sampling | Code | 1
PP-StructureV2: A Stronger Document Analysis System | - | 0
Meta-Learning with Self-Improving Momentum Target | Code | 1
ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning | Code | 1
The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes | - | 0
Patch-based Knowledge Distillation for Lifelong Person Re-Identification | Code | 1
Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again | - | 0
Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation | Code | 1
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks | - | 0
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis | Code | 2
Students taught by multimodal teachers are superior action recognizers | - | 0
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization | - | 0
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval | Code | 1
Bi-directional Weakly Supervised Knowledge Distillation for Whole Slide Image Classification | Code | 1
IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors | Code | 1
CLIP model is an Efficient Continual Learner | Code | 1
Effective Self-supervised Pre-training on Low-compute Networks without Distillation | Code | 1
AlphaFold Distillation for Protein Design | Code | 1
Meta-Ensemble Parameter Learning | - | 0
Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation | - | 0
Domain Discrepancy Aware Distillation for Model Aggregation in Federated Learning | - | 0
Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification | Code | 0
Knowledge Distillation based Contextual Relevance Matching for E-commerce Product Search | - | 0
A Study on the Efficiency and Generalization of Light Hybrid Retrievers | - | 0
Robust Active Distillation | - | 0
Attention Distillation: self-supervised vision transformer students need more guidance | Code | 1
Knowledge Transfer with Visual Prompt in multi-modal Dialogue Understanding and Generation | - | 0
Knowledge Distillation with Reptile Meta-Learning for Pretrained Language Model Compression | Code | 0
Improving Zero-Shot Multilingual Text Generation via Iterative Distillation | - | 0
TAKE: Topic-shift Aware Knowledge sElection for Dialogue Generation | Code | 0
Transferring Knowledge from Structure-aware Self-attention Language Model to Sequence-to-Sequence Semantic Parsing | - | 0
One-Teacher and Multiple-Student Knowledge Distillation on Sentiment Classification | Code | 0
Sentiment Interpretable Logic Tensor Network for Aspect-Term Sentiment Analysis | - | 0
Multi-stage Progressive Compression of Conformer Transducer for On-device Speech Recognition | - | 0
Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning | Code | 1
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models | Code | 0
Designing and Training of Lightweight Neural Networks on Edge Devices using Early Halting in Knowledge Distillation | - | 0
Slimmable Networks for Contrastive Self-supervised Learning | Code | 0
Using Knowledge Distillation to improve interpretable models in a retail banking context | - | 0
Towards a Unified View of Affinity-Based Knowledge Distillation | - | 0
Label driven Knowledge Distillation for Federated Learning with non-IID Data | - | 0
Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition | Code | 1
Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified