SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, and running the large model at inference time is just as expensive whether or not it is. Distilling the large teacher's knowledge into a compact student model therefore aims to retain most of the teacher's accuracy at a fraction of the inference cost.
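
The most common formulation (Hinton et al., 2015) trains the student on a weighted combination of the ordinary cross-entropy loss and a soft-target loss that pushes the student's temperature-scaled output distribution toward the teacher's. The following is a minimal PyTorch sketch of that loss; the temperature and weighting defaults are illustrative assumptions, not settings reported by any paper listed below.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
        """Hinton-style KD loss: soft-target KL term blended with hard-label cross-entropy.

        T (temperature) and alpha (soft-loss weight) are placeholder defaults for illustration.
        """
        # Soft targets: student log-probabilities vs. teacher probabilities at temperature T.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)  # rescale so gradients stay comparable to the hard-label term
        # Hard targets: ordinary cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    # Usage sketch (teacher and student are assumed classifiers over the same label set):
    #   with torch.no_grad():
    #       teacher_logits = teacher(images)
    #   loss = distillation_loss(student(images), teacher_logits, labels)
    #   loss.backward()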

Papers

Showing 3051–3100 of 4240 papers

Title | Status | Hype
Knowledge Distillation via Weighted Ensemble of Teaching Assistants | - | 0
Conformer with dual-mode chunked attention for joint online and offline ASR | - | 0
Knowledge Distillation for Oriented Object Detection on Aerial Images | - | 0
Revisiting Self-Distillation | - | 0
Multi scale Feature Extraction and Fusion for Online Knowledge Distillation | - | 0
FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks | - | 0
Toward Student-Oriented Teacher Network Training For Knowledge Distillation | - | 0
FreeTransfer-X: Safe and Label-Free Cross-Lingual Transfer from Off-the-Shelf Models | - | 0
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation | Code | 0
Robust Distillation for Worst-class Performance | - | 0
Federated Bayesian Neural Regression: A Scalable Global Federated Gaussian Process | - | 0
Reducing Capacity Gap in Knowledge Distillation with Review Mechanism for Crowd Counting | Code | 0
SDQ: Stochastic Differentiable Quantization with Mixed Precision | - | 0
Knowledge Distillation Decision Tree for Unravelling Black-box Machine Learning Models | - | 0
Narrowing the Coordinate-frame Gap in Behavior Prediction Models: Distillation for Efficient and Accurate Scene-centric Motion Forecasting | - | 0
cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation | Code | 0
Self-Knowledge Distillation based Self-Supervised Learning for Covid-19 Detection from Chest X-Ray Images | - | 0
Reconsidering Learning Objectives in Unbiased Recommendation with Unobserved Confounders | - | 0
Confidence-aware Self-Semantic Distillation on Knowledge Graph Embedding | - | 0
Evaluation-oriented Knowledge Distillation for Deep Face Recognition | - | 0
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models | - | 0
Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation | Code | 0
Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training | - | 0
Extreme Compression for Pre-trained Transformers Made Simple and Efficient | - | 0
Guided Deep Metric Learning | - | 0
3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation | - | 0
ORC: Network Group-based Knowledge Distillation using Online Role Change | Code | 0
Generalized Supervised Contrastive Learning | - | 0
Detecting Optimism in Tweets using Knowledge Distillation and Linguistic Analysis of Optimism | - | 0
Searching for COMETINHO: The Little Metric That Could | - | 0
What Knowledge Gets Distilled in Knowledge Distillation? | - | 0
VFed-SSD: Towards Practical Vertical Federated Advertising | - | 0
Spectral Maps for Learning on Subgraphs | - | 0
Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions | - | 0
A General Multiple Data Augmentation Based Framework for Training Deep Neural Networks | - | 0
MiniDisc: Minimal Distillation Schedule for Language Model Compression | Code | 0
One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation | Code | 0
Parameter-Efficient and Student-Friendly Knowledge Distillation | - | 0
Region-aware Knowledge Distillation for Efficient Image-to-Image Translation | - | 0
Do we need Label Regularization to Fine-tune Pre-trained Language Models? | - | 0
DFM: Dialogue Foundation Model for Universal Large-Scale Dialogue-Oriented Task Learning | - | 0
CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing | Code | 0
LILA-BOTI : Leveraging Isolated Letter Accumulations By Ordering Teacher Insights for Bangla Handwriting Recognition | Code | 0
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation | Code | 0
InDistill: Information flow-preserving knowledge distillation for model compression | Code | 0
Simple Regularisation for Uncertainty-Aware Knowledge Distillation | - | 0
ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval | - | 0
Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt | - | 0
Chemical transformer compression for accelerating both training and inference of molecular modeling | Code | 0
Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering | - | 0
Page 62 of 85

Benchmark Results

In each leaderboard below, T: denotes the teacher model and S: the student model; all listed results are claimed values that have not yet been independently verified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | - | Unverified
5 | KD++ (T:regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 82.5 | - | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy % | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T:CLIP/ViT-B-16 S:resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T:resnet32x4 S:resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T:ResNet101 S:ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T:ResNet101 S:MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T:Adabins S:MobileNetV2) | RMSE | 2.43 | - | Unverified
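
Top-1 accuracy, the metric used in the first two tables above, is the percentage of test samples for which the model's highest-scoring class matches the ground-truth label. A minimal evaluation loop, assuming a PyTorch classifier and a labeled DataLoader (all names here are illustrative), might look like:

    import torch

    @torch.no_grad()
    def top1_accuracy(model, loader, device="cuda"):
        """Percentage of samples whose argmax prediction matches the label."""
        model.eval()
        correct, total = 0, 0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=-1)  # highest-scoring class per sample
            correct += (preds == labels).sum().item()
            total += labels.numel()
        return 100.0 * correct / total  # reported as a percentage, as in the tables above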