
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. A compact "student" model can therefore often be trained to mimic the outputs of a large "teacher" model, retaining most of its accuracy at a fraction of the inference cost.
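
At its simplest, distillation adds to the student's ordinary supervised loss a term that pulls its output distribution toward the teacher's temperature-softened predictions (the classic soft-target formulation of Hinton et al., 2015). The sketch below illustrates that combined loss, assuming PyTorch; the function name, the temperature and alpha weights, and the commented training-step snippet are illustrative assumptions rather than a reference implementation from any paper listed on this page.

```python
# Minimal sketch of soft-target knowledge distillation, assuming PyTorch.
# All names (distillation_loss, temperature, alpha) are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    # Soften teacher and student outputs with the same temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures (Hinton et al., 2015).
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # Ordinary cross-entropy against the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Typical training step (teacher frozen, student being trained):
#   with torch.no_grad():
#       teacher_logits = teacher(images)
#   student_logits = student(images)
#   loss = distillation_loss(student_logits, teacher_logits, labels)
#   loss.backward()
```

Many of the papers below replace or augment this logit-matching term (e.g., with feature-, attention-, or contrastive-based objectives), but the teacher-student setup is the common starting point.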

Papers

Showing 3801–3850 of 4240 papers

Title | Status | Hype
Unsupervised Multi-Target Domain Adaptation Through Knowledge Distillation | Code | 1
Knowledge Distillation for Multi-task Learning | Code | 1
Learning to Learn Parameterized Classification Networks for Scalable Input Images | Code | 1
Towards Practical Lipreading with Distilled and Efficient Models | Code | 1
RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning | Code | 1
Representation Transfer by Optimal Transport | – | 0
Dual-Teacher: Integrating Intra-domain and Inter-domain Teachers for Annotation-efficient Cardiac Segmentation | – | 0
Temporal Self-Ensembling Teacher for Semi-Supervised Object Detection | Code | 1
Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer | – | 0
Data-Efficient Ranking Distillation for Image Retrieval | – | 0
Robust Re-Identification by Multiple Views Knowledge Distillation | Code | 1
Tracking-by-Trackers with a Distilled and Reinforced Model | Code | 1
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation | Code | 1
Knowledge Distillation Beyond Model Compression | – | 0
Interactive Knowledge Distillation | – | 0
Improving Autoregressive NMT with Non-Autoregressive Model | – | 0
Xiaomi's Submissions for IWSLT 2020 Open Domain Translation Task | – | 0
CASIA's System for IWSLT 2020 Open Domain Translation | – | 0
Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT | – | 0
SimulSpeech: End-to-End Simultaneous Speech to Text Translation | – | 0
Improving Event Detection via Open-domain Trigger Knowledge | Code | 1
On the Demystification of Knowledge Distillation: A Residual Network Perspective | – | 0
Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution | – | 0
Interpreting and Disentangling Feature Components of Various Complexity from DNNs | Code | 0
Motion Pyramid Networks for Accurate and Efficient Cardiac Motion Estimation | – | 0
Diverse Knowledge Distillation (DKD): A Solution for Improving The Robustness of Ensemble Models Against Adversarial Attacks | – | 0
Streaming Transformer ASR with Blockwise Synchronous Inference | – | 0
Distilling Object Detectors with Task Adaptive Regularization | – | 0
Self-Knowledge Distillation with Progressive Refinement of Targets | Code | 1
Paying more attention to snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation | Code | 1
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation | Code | 1
Self-supervised Knowledge Distillation for Few-shot Learning | Code | 1
Prior knowledge distillation based on financial time series | – | 0
Multi-fidelity Neural Architecture Search with Knowledge Distillation | Code | 0
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks | Code | 1
Pixel Invisibility: Detecting Objects Invisible in Color Images | – | 0
Knowledge Distillation Meets Self-Supervision | Code | 1
Ensemble Distillation for Robust Model Fusion in Federated Learning | Code | 0
Real-Time Video Inference on Edge Devices via Adaptive Model Streaming | Code | 1
Adjoined Networks: A Training Paradigm with Applications to Network Compression | Code | 1
Knowledge Distillation: A Survey | – | 0
Continual Representation Learning for Biometric Identification | Code | 0
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability | Code | 0
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Code | 1
ResKD: Residual-Guided Knowledge Distillation | – | 0
Multi-view Contrastive Learning for Online Knowledge Distillation | Code | 1
ADMP: An Adversarial Double Masks Based Pruning Framework For Unsupervised Cross-Domain Compression | – | 0
Peer Collaborative Learning for Online Knowledge Distillation | Code | 1
An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation | – | 0
An Overview of Neural Network Compression | – | 0
Page 77 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | – | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | – | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | – | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | – | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | – | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | – | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | – | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | – | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | – | Unverified