SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, which is why a much smaller student model can often be trained to recover most of the larger teacher's behavior.
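
In its most common form (the soft-target formulation of Hinton et al., 2015), the student is trained to match the teacher's temperature-softened output distribution alongside the ground-truth labels. The sketch below is a minimal PyTorch-style illustration of that combined loss; the function name `distillation_loss`, the temperature `T`, and the mixing weight `alpha` are illustrative choices, not the method of any particular paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-target knowledge distillation loss (after Hinton et al., 2015).

    Mixes a KL-divergence term between temperature-softened teacher and
    student distributions with the usual cross-entropy on hard labels.
    T and alpha are illustrative defaults, not values from any cited paper.
    """
    # Soften both output distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)

    # Scale the KL term by T^2 so its gradient magnitude stays comparable
    # as the temperature changes (as recommended in the original paper).
    kd_term = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)

    # Standard supervised loss on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term

if __name__ == "__main__":
    # Toy usage with random logits for a 100-class problem.
    student_logits = torch.randn(8, 100)
    teacher_logits = torch.randn(8, 100)
    labels = torch.randint(0, 100, (8,))
    print(distillation_loss(student_logits, teacher_logits, labels).item())
```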

Papers

Showing 3201–3250 of 4240 papers

Title | Status | Hype
Better Supervisory Signals by Observing Learning Paths | Code | 0
MIAShield: Defending Membership Inference Attacks via Preemptive Exclusion of Members | – | 0
Dual Embodied-Symbolic Concept Representations for Deep Learning | – | 0
TRILLsson: Distilled Universal Paralinguistic Speech Representations | – | 0
Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation | – | 0
Joint Answering and Explanation for Visual Commonsense Reasoning | Code | 0
Learn From the Past: Experience Ensemble Knowledge Distillation | – | 0
Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation | – | 0
Efficient Video Segmentation Models with Per-frame Inference | – | 0
Are All Linear Regions Created Equal? | Code | 0
Multi-Teacher Knowledge Distillation for Incremental Implicitly-Refined Classification | – | 0
Distilled Neural Networks for Efficient Learning to Rank | Code | 0
Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning | – | 0
A Novel Architecture Slimming Method for Network Pruning and Knowledge Distillation | – | 0
Cross-Task Knowledge Distillation in Multi-Task Recommendation | – | 0
Knowledge Distillation with Deep Supervision | Code | 0
No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices | – | 0
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation | – | 0
Meta Knowledge Distillation | – | 0
Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search | – | 0
AI can evolve without labels: self-evolving vision transformer for chest X-ray diagnosis through knowledge distillation | – | 0
Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning | – | 0
Locally Differentially Private Distributed Deep Learning via Knowledge Distillation | Code | 0
Adaptive Mixing of Auxiliary Losses in Supervised Learning | Code | 0
Measuring and Reducing Model Update Regression in Structured Prediction for NLP | – | 0
Cross domain knowledge compression in realtime optical flow prediction on ultrasound sequences | – | 0
Iterative Self Knowledge Distillation -- From Pothole Classification to Fine-Grained and COVID Recognition | – | 0
Bootstrapped Representation Learning for Skeleton-Based Action Recognition | – | 0
Deep-Disaster: Unsupervised Disaster Detection and Localization Using Visual Data | Code | 0
Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning | – | 0
Improving Robustness by Enhancing Weak Subnets | Code | 0
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models | – | 0
Dynamic Rectification Knowledge Distillation | Code | 0
Adaptive Instance Distillation for Object Detection in Autonomous Driving | – | 0
TrustAL: Trustworthy Active Learning using Knowledge Distillation | – | 0
One Student Knows All Experts Know: From Sparse to Dense | – | 0
Jointly Learning Knowledge Embedding and Neighborhood Consensus with Relational Knowledge Distillation for Entity Alignment | – | 0
Attentive Task Interaction Network for Multi-Task Learning | Code | 0
Federated Unlearning with Knowledge Distillation | – | 0
Can Model Compression Improve NLP Fairness | – | 0
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models | – | 0
Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer | – | 0
UKD: Debiasing Conversion Rate Estimation via Uncertainty-regularized Knowledge Distillation | – | 0
Improving Neural Machine Translation by Denoising Training | – | 0
Continual Coarse-to-Fine Domain Adaptation in Semantic Segmentation | Code | 0
Cross-modal Contrastive Distillation for Instructional Activity Anticipation | – | 0
Knowledge Distillation as Self-Supervised Learning | – | 0
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation | – | 0
Re2G: Retrieve, Rerank, Generate | – | 0
Learning Cross-Lingual IR from an English Retriever | – | 0
Page 65 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | – | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | – | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | – | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | – | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | – | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | – | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | – | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | – | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | – | Unverified