Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized; a compact "student" model can therefore often be trained to reproduce the behavior of a large "teacher" model, typically by matching its soft output distribution, while being much cheaper to run.
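
As a concrete illustration of the mechanism most of the papers below build on, here is a minimal sketch of the classic soft-label distillation objective in PyTorch. The `teacher`/`student` modules, temperature `T`, and mixing weight `alpha` are illustrative placeholders, not the setup of any particular paper in this list.

```python
# Minimal sketch of soft-label knowledge distillation (Hinton-style).
# `teacher` and `student` are assumed to be classifiers that return logits.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend a temperature-scaled KL term (soft targets) with cross-entropy (hard targets)."""
    # Soft targets: match the student's tempered distribution to the teacher's.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so the soft-target gradients stay comparable to the hard-label term
    # Hard targets: ordinary supervised loss on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage inside a training step (teacher frozen, student being optimized):
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
# loss.backward()
```

The temperature softens both distributions so that the teacher's relative confidences over incorrect classes carry gradient signal to the student; many of the methods listed below replace or augment this logit-matching term with feature-, attention-, or relation-based objectives.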

Papers

Showing 2651–2700 of 4240 papers

Title | Status | Hype
Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices | - | 0
Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing? | Code | 1
Knowledge Distillation of Transformer-based Language Models Revisited | - | 0
QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design | - | 0
Cooperative Retriever and Ranker in Deep Recommenders | Code | 0
Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search | - | 0
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification | - | 0
Feature Representation Learning for Robust Retinal Disease Detection from Optical Coherence Tomography Images | Code | 0
Mixed Sample Augmentation for Online Distillation | - | 0
Knowledge Distillation via Weighted Ensemble of Teaching Assistants | - | 0
Conformer with dual-mode chunked attention for joint online and offline ASR | - | 0
Knowledge Distillation for Oriented Object Detection on Aerial Images | - | 0
MetaFed: Federated Learning among Federations with Cyclic Knowledge Distillation for Personalized Healthcare | Code | 2
Revisiting Self-Distillation | - | 0
Multi scale Feature Extraction and Fusion for Online Knowledge Distillation | - | 0
FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks | - | 0
FreeTransfer-X: Safe and Label-Free Cross-Lingual Transfer from Off-the-Shelf Models | - | 0
Toward Student-Oriented Teacher Network Training For Knowledge Distillation | - | 0
Robust Distillation for Worst-class Performance | - | 0
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation | Code | 0
Federated Bayesian Neural Regression: A Scalable Global Federated Gaussian Process | - | 0
The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation | Code | 1
Reducing Capacity Gap in Knowledge Distillation with Review Mechanism for Crowd Counting | Code | 0
Knowledge Distillation Decision Tree for Unravelling Black-box Machine Learning Models | - | 0
SDQ: Stochastic Differentiable Quantization with Mixed Precision | - | 0
Narrowing the Coordinate-frame Gap in Behavior Prediction Models: Distillation for Efficient and Accurate Scene-centric Motion Forecasting | - | 0
Reconsidering Learning Objectives in Unbiased Recommendation with Unobserved Confounders | - | 0
cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation | Code | 0
Confidence-aware Self-Semantic Distillation on Knowledge Graph Embedding | - | 0
Self-Knowledge Distillation based Self-Supervised Learning for Covid-19 Detection from Chest X-Ray Images | - | 0
Evaluation-oriented Knowledge Distillation for Deep Face Recognition | - | 0
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models | - | 0
Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation | Code | 0
Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training | - | 0
Guided Deep Metric Learning | - | 0
Extreme Compression for Pre-trained Transformers Made Simple and Efficient | - | 0
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers | Code | 2
3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation | - | 0
Detecting Optimism in Tweets using Knowledge Distillation and Linguistic Analysis of Optimism | - | 0
ORC: Network Group-based Knowledge Distillation using Online Role Change | Code | 0
Generalized Supervised Contrastive Learning | - | 0
Searching for COMETINHO: The Little Metric That Could | - | 0
VFed-SSD: Towards Practical Vertical Federated Advertising | - | 0
What Knowledge Gets Distilled in Knowledge Distillation? | - | 0
itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection | Code | 1
Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions | - | 0
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch | Code | 1
Spectral Maps for Learning on Subgraphs | - | 0
Towards Efficient 3D Object Detection with Knowledge Distillation | Code | 1
A General Multiple Data Augmentation Based Framework for Training Deep Neural Networks | - | 0
Page 54 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified