Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2876–2900 of 4240 papers

Title	Date	Tasks	Status
BEBERT: Efficient and Robust Binary Ensemble BERT	Oct 28, 2022	BinarizationComputational Efficiency	CodeCode Available
Semi-UFormer: Semi-supervised Uncertainty-aware Transformer for Image Dehazing	Oct 28, 2022	Image DehazingKnowledge Distillation	—Unverified
Li3DeTr: A LiDAR based 3D Detection Transformer	Oct 27, 2022	Autonomous DrivingDecoder	—Unverified
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition	Oct 27, 2022	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified
Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks	Oct 27, 2022	Knowledge DistillationQuantization	—Unverified
Fast DistilBERT on CPUs	Oct 27, 2022	Knowledge DistillationModel Compression	—Unverified
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation	Oct 27, 2022	Feature EngineeringKnowledge Distillation	—Unverified
Long-tailed Food Classification	Oct 26, 2022	ClassificationData Augmentation	—Unverified
Online Cross-Layer Knowledge Distillation on Graph Neural Networks with Deep Supervision	Oct 25, 2022	Knowledge DistillationModel Compression	—Unverified
An Effective Deep Network for Head Pose Estimation without Keypoints	Oct 25, 2022	Gaze EstimationHead Pose Estimation	—Unverified
Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation	Oct 25, 2022	Knowledge DistillationSentence	—Unverified
Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models	Oct 24, 2022	Knowledge DistillationModel Compression	—Unverified
Respecting Transfer Gap in Knowledge Distillation	Oct 23, 2022	Knowledge Distillation	—Unverified
Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation	Oct 22, 2022	Knowledge DistillationText Generation	—Unverified
Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model	Oct 22, 2022	Knowledge DistillationLanguage Modeling	—Unverified
Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks	Oct 21, 2022	Knowledge Distillationtext-classification	—Unverified
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation	Oct 21, 2022	Data AugmentationDiversity	—Unverified
Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs	Oct 21, 2022	Knowledge DistillationSentence	CodeCode Available
Distilling the Undistillable: Learning from a Nasty Teacher	Oct 21, 2022	Knowledge Distillation	CodeCode Available
Semi-supervised object detection based on single-stage detector for thighbone fracture localization	Oct 20, 2022	Fracture detectionImage Augmentation	—Unverified
Toward Multiple Specialty Learners for Explaining GNNs via Online Knowledge Distillation	Oct 20, 2022	Knowledge Distillation	—Unverified
Similarity of Neural Architectures using Adversarial Attack Transferability	Oct 20, 2022	Adversarial AttackDiversity	—Unverified
ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly Detection	Oct 19, 2022	Anomaly DetectionAnomaly Localization	—Unverified
A baseline revisited: Pushing the limits of multi-segment models for context-aware translation	Oct 19, 2022	Knowledge DistillationTranslation	—Unverified
On effects of Knowledge Distillation on Transfer Learning	Oct 18, 2022	image-classificationImage Classification	—Unverified

Show:10 25 50

← PrevPage 116 of 170Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified