SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Distillation compresses that knowledge into a smaller model by training the student to mimic the teacher's behaviour, typically by matching its temperature-softened output distribution in addition to the ground-truth labels, so the student can approach the teacher's accuracy at a fraction of the inference cost.
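As a concrete illustration of that idea, here is a minimal sketch of the classic soft-target distillation loss (Hinton et al., 2015) in PyTorch. The temperature T, the weighting alpha, the function name distillation_loss, and the toy tensors are illustrative assumptions, not settings taken from any paper or benchmark entry on this page.

```python
# Minimal sketch of soft-target knowledge distillation.
# All hyperparameters and tensor shapes below are illustrative, not from any listed paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend a KL term on temperature-softened logits with the usual cross-entropy."""
    # Soften both distributions with temperature T; scale by T^2 so the gradient
    # magnitude of the soft term stays comparable to the hard-label term.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Usage example with random tensors standing in for a real teacher/student pair
# (batch of 8 examples, 100 classes).
if __name__ == "__main__":
    student_logits = torch.randn(8, 100, requires_grad=True)
    teacher_logits = torch.randn(8, 100)
    labels = torch.randint(0, 100, (8,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(float(loss))
```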

Papers

Showing 2851–2900 of 4240 papers

Title | Status | Hype
Knowledge Distillation from Cross Teaching Teachers for Efficient Semi-Supervised Abdominal Organ Segmentation in CT | Code | 0
PILE: Pairwise Iterative Logits Ensemble for Multi-Teacher Labeled Distillation | - | 0
FAN-Trans: Online Knowledge Distillation for Facial Action Unit Detection | - | 0
Knowledge Distillation for Federated Learning: a Practical Guide | - | 0
Bridging Fairness and Environmental Sustainability in Natural Language Processing | - | 0
Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study | Code | 0
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization | - | 0
Closing the Gap between Client and Global Model Performance in Heterogeneous Federated Learning | - | 0
Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation | - | 0
LightVessel: Exploring Lightweight Coronary Artery Vessel Segmentation via Similarity Knowledge Distillation | - | 0
Multi-level Distillation of Semantic Knowledge for Pre-training Multilingual Language Model | - | 0
Gradient Knowledge Distillation for Pre-trained Language Models | Code | 0
ARDIR: Improving Robustness using Knowledge Distillation of Internal Representation | - | 0
Fairness without Demographics through Knowledge Distillation | Code | 0
Lightweight Sound Event Detection Model with RepVGG Architecture | - | 0
Enhancing Chinese Multi-Label Text Classification Performance with Response-based Knowledge Distillation | - | 0
Maximum Likelihood Distillation for Robust Modulation Classification | - | 0
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation | - | 0
Lightweight Neural Network with Knowledge Distillation for CSI Feedback | - | 0
Generative Negative Text Replay for Continual Vision-Language Pretraining | - | 0
QuaLA-MiniLM: a Quantized Length Adaptive MiniLM | - | 0
Application of Knowledge Distillation to Multi-task Speech Representation Learning | - | 0
Completely Heterogeneous Federated Learning | - | 0
Teacher-Student Architecture for Knowledge Learning: A Survey | - | 0
Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes? | - | 0
BEBERT: Efficient and Robust Binary Ensemble BERT | Code | 0
Semi-UFormer: Semi-supervised Uncertainty-aware Transformer for Image Dehazing | - | 0
Li3DeTr: A LiDAR based 3D Detection Transformer | - | 0
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition | - | 0
Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks | - | 0
Fast DistilBERT on CPUs | - | 0
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation | - | 0
Long-tailed Food Classification | - | 0
Online Cross-Layer Knowledge Distillation on Graph Neural Networks with Deep Supervision | - | 0
An Effective Deep Network for Head Pose Estimation without Keypoints | - | 0
Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation | - | 0
Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models | - | 0
Respecting Transfer Gap in Knowledge Distillation | - | 0
Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation | - | 0
Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model | - | 0
Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks | - | 0
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation | - | 0
Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs | Code | 0
Distilling the Undistillable: Learning from a Nasty Teacher | Code | 0
Semi-supervised object detection based on single-stage detector for thighbone fracture localization | - | 0
Toward Multiple Specialty Learners for Explaining GNNs via Online Knowledge Distillation | - | 0
Similarity of Neural Architectures using Adversarial Attack Transferability | - | 0
ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly Detection | - | 0
A baseline revisited: Pushing the limits of multi-segment models for context-aware translation | - | 0
On effects of Knowledge Distillation on Transfer Learning | - | 0
Page 58 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified