SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, this capacity may not be fully utilized, so a compact student trained to mimic the larger teacher can often retain most of its accuracy at a fraction of the inference cost.
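
As a rough illustration of the mechanics, the sketch below implements the classic soft-target distillation loss of Hinton et al. (2015) in PyTorch: the student is trained to match the teacher's temperature-smoothed output distribution in addition to the ground-truth labels. The temperature T, mixing weight alpha, and the teacher/student models are illustrative assumptions, not the setup of any particular paper listed below.

```python
# Minimal sketch of soft-target knowledge distillation (Hinton et al., 2015).
# Assumes PyTorch; "teacher" and "student" stand in for any large pretrained
# model and smaller model -- hyperparameters here are illustrative defaults.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend the teacher's soft targets with the ordinary hard-label loss."""
    # KL divergence between temperature-smoothed student and teacher outputs.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term's magnitude
    # Standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


# Usage inside a training step: the teacher is frozen, only the student learns.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# student_logits = student(images)
# loss = distillation_loss(student_logits, teacher_logits, labels)
```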

Papers

Showing 2901–2950 of 4240 papers

Title | Status | Hype
Meta Knowledge Distillation | - | 0
Knowledge Distillation with Deep Supervision | Code | 0
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation | - | 0
FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction | Code | 1
No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices | - | 0
ZeroGen: Efficient Zero-shot Learning via Dataset Generation | Code | 1
Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search | - | 0
AI can evolve without labels: self-evolving vision transformer for chest X-ray diagnosis through knowledge distillation | - | 0
Tiny Object Tracking: A Large-scale Dataset and A Baseline | Code | 2
Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning | - | 0
Point-Level Region Contrast for Object Detection Pre-Training | Code | 1
Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation | Code | 1
Adaptive Mixing of Auxiliary Losses in Supervised Learning | Code | 0
Locally Differentially Private Distributed Deep Learning via Knowledge Distillation | Code | 0
Measuring and Reducing Model Update Regression in Structured Prediction for NLP | - | 0
Cross domain knowledge compression in realtime optical flow prediction on ultrasound sequences | - | 0
Bootstrapped Representation Learning for Skeleton-Based Action Recognition | - | 0
Iterative Self Knowledge Distillation -- From Pothole Classification to Fine-Grained and COVID Recognition | - | 0
Local Feature Matching with Transformers for low-end devices | Code | 1
Deep-Disaster: Unsupervised Disaster Detection and Localization Using Visual Data | Code | 0
Improving Robustness by Enhancing Weak Subnets | Code | 0
Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning | - | 0
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models | - | 0
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding | Code | 1
Dynamic Rectification Knowledge Distillation | Code | 0
Anomaly Detection via Reverse Distillation from One-Class Embedding | Code | 2
Adaptive Instance Distillation for Object Detection in Autonomous Driving | - | 0
TrustAL: Trustworthy Active Learning using Knowledge Distillation | - | 0
One Student Knows All Experts Know: From Sparse to Dense | - | 0
Attentive Task Interaction Network for Multi-Task Learning | Code | 0
Jointly Learning Knowledge Embedding and Neighborhood Consensus with Relational Knowledge Distillation for Entity Alignment | - | 0
Federated Unlearning with Knowledge Distillation | - | 0
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models | - | 0
Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer | - | 0
Can Model Compression Improve NLP Fairness | - | 0
UKD: Debiasing Conversion Rate Estimation via Uncertainty-regularized Knowledge Distillation | - | 0
Improving Neural Machine Translation by Denoising Training | - | 0
Continual Coarse-to-Fine Domain Adaptation in Semantic Segmentation | Code | 0
It's All in the Head: Representation Knowledge Distillation through Classifier Sharing | Code | 1
Cross-modal Contrastive Distillation for Instructional Activity Anticipation | - | 0
Knowledge Distillation as Self-Supervised Learning | - | 0
Tree Knowledge Distillation for Compressing Transformer-Based Language Models | - | 0
Learning Cross-Lingual IR from an English Retriever | - | 0
Nearest Neighbor Knowledge Distillation for Neural Machine Translation | - | 0
Transferring Knowledge from Structure-aware Self-attention Language Model to Sequence-to-Sequence Semantic Parsing | - | 0
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation | - | 0
Re2G: Retrieve, Rerank, Generate | - | 0
CL-ReKD: Cross-lingual Knowledge Distillation for Multilingual Retrieval Question Answering | - | 0
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | - | 0
SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation | Code | 1
Page 59 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified