Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2801–2850 of 4240 papers

Title	Date	Tasks	Status
Multi-adversarial Faster-RCNN with Paradigm Teacher for Unrestricted Object Detection	Dec 11, 2022	Domain AdaptationKnowledge Distillation	—Unverified
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning	Dec 10, 2022	Knowledge DistillationRepresentation Learning	—Unverified
LEAD: Liberal Feature-based Distillation for Dense Retrieval	Dec 10, 2022	Document RankingKnowledge Distillation	—Unverified
Knowledge Distillation Applied to Optical Channel Equalization: Solving the Parallelization Problem of Recurrent Connection	Dec 8, 2022	Knowledge Distillation	—Unverified
Occlusion-Robust FAU Recognition by Mining Latent Space of Masked Autoencoders	Dec 8, 2022	Knowledge Distillation	—Unverified
Life-long Learning for Multilingual Neural Machine Translation with Knowledge Distillation	Dec 6, 2022	Knowledge DistillationMachine Translation	—Unverified
Open World DETR: Transformer based Open World Object Detection	Dec 6, 2022	Knowledge DistillationObject	—Unverified
Leveraging Different Learning Styles for Improved Knowledge Distillation in Biomedical Imaging	Dec 6, 2022	Knowledge DistillationModel Compression	—Unverified
DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detection	Dec 5, 2022	3D Object Detectionclass-incremental learning	—Unverified
Single image calibration using knowledge distillation approaches	Dec 5, 2022	Camera CalibrationIncremental Learning	—Unverified
The RoyalFlush System for the WMT 2022 Efficiency Task	Dec 3, 2022	DecoderGPU	—Unverified
StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition	Dec 2, 2022	Image RetrievalKnowledge Distillation	—Unverified
Injecting Spatial Information for Monaural Speech Enhancement via Knowledge Distillation	Dec 2, 2022	Knowledge DistillationSpeech Enhancement	—Unverified
Distilling Reasoning Capabilities into Smaller Language Models	Dec 1, 2022	GSM8KKnowledge Distillation	CodeCode Available
Coordinating Cross-modal Distillation for Molecular Property Prediction	Nov 30, 2022	Graph RegressionGraph Representation Learning	—Unverified
Explicit Knowledge Transfer for Weakly-Supervised Code Generation	Nov 30, 2022	Code GenerationFew-Shot Learning	—Unverified
HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression	Nov 30, 2022	Efficient ExplorationKnowledge Distillation	—Unverified
Hint-dynamic Knowledge Distillation	Nov 30, 2022	Knowledge Distillation	—Unverified
Random Copolymer inverse design system orienting on Accurate discovering of Antimicrobial peptide-mimetic copolymers	Nov 30, 2022	Activity PredictionKnowledge Distillation	—Unverified
Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection	Nov 30, 2022	3D Object DetectionDepth Estimation	CodeCode Available
Feature-domain Adaptive Contrastive Distillation for Efficient Single Image Super-Resolution	Nov 29, 2022	Image Super-ResolutionKnowledge Distillation	—Unverified
SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification	Nov 28, 2022	Few-Shot Image ClassificationFew-Shot Learning	CodeCode Available
Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition	Nov 28, 2022	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified
BJTU-WeChat's Systems for the WMT22 Chat Translation Task	Nov 28, 2022	DenoisingKnowledge Distillation	—Unverified
Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation	Nov 28, 2022	Anomaly DetectionKnowledge Distillation	CodeCode Available
Class-aware Information for Logit-based Knowledge Distillation	Nov 27, 2022	Knowledge Distillation	—Unverified
EPIK: Eliminating multi-model Pipelines with Knowledge-distillation	Nov 27, 2022	Knowledge DistillationTransliteration	—Unverified
SKDBERT: Compressing BERT via Stochastic Knowledge Distillation	Nov 26, 2022	Knowledge DistillationLanguage Modeling	—Unverified
Structural Knowledge Distillation for Object Detection	Nov 23, 2022	Feature ImportanceKnowledge Distillation	—Unverified
On the Transferability of Visual Features in Generalized Zero-Shot Learning	Nov 22, 2022	Generalized Zero-Shot LearningKnowledge Distillation	CodeCode Available
Blind Knowledge Distillation for Robust Image Classification	Nov 21, 2022	Classificationimage-classification	CodeCode Available
Privacy in Practice: Private COVID-19 Detection in X-Ray Images (Extended Version)	Nov 21, 2022	Knowledge DistillationMembership Inference Attack	CodeCode Available
Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders	Nov 20, 2022	Knowledge DistillationModel Compression	CodeCode Available
AI-KD: Adversarial learning and Implicit regularization for self-Knowledge Distillation	Nov 20, 2022	Knowledge DistillationSelf-Knowledge Distillation	—Unverified
Scalable Collaborative Learning via Representation Sharing	Nov 20, 2022	Federated LearningKnowledge Distillation	—Unverified
DASECount: Domain-Agnostic Sample-Efficient Wireless Indoor Crowd Counting via Few-shot Learning	Nov 18, 2022	Crowd CountingFew-Shot Learning	—Unverified
DETRDistill: A Universal Knowledge Distillation Framework for DETR-families	Nov 17, 2022	Knowledge Distillationobject-detection	—Unverified
Knowledge distillation for fast and accurate DNA sequence correction	Nov 17, 2022	Knowledge Distillation	—Unverified
D^3ETR: Decoder Distillation for Detection Transformer	Nov 17, 2022	DecoderKnowledge Distillation	—Unverified
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers	Nov 17, 2022	Knowledge DistillationModel Compression	CodeCode Available
Sub-Graph Learning for Spatiotemporal Forecasting via Knowledge Distillation	Nov 17, 2022	DiversityGraph Learning	—Unverified
Yield Evaluation of Citrus Fruits based on the YoloV5 compressed by Knowledge Distillation	Nov 16, 2022	Knowledge Distillation	—Unverified
An Investigation of the Combination of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding	Nov 15, 2022	class-incremental learningClass Incremental Learning	CodeCode Available
Instance-aware Model Ensemble With Distillation For Unsupervised Domain Adaptation	Nov 15, 2022	Domain AdaptationKnowledge Distillation	—Unverified
An Efficient Active Learning Pipeline for Legal Text Classification	Nov 15, 2022	Active LearningClassification	—Unverified
Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling	Nov 15, 2022	General KnowledgeKnowledge Distillation	CodeCode Available
Feature Correlation-guided Knowledge Transfer for Federated Self-supervised Learning	Nov 14, 2022	Feature CorrelationFederated Learning	—Unverified
Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection	Nov 14, 2022	Knowledge Distillation	—Unverified
An Interpretable Neuron Embedding for Static Knowledge Distillation	Nov 14, 2022	Knowledge Distillation	—Unverified
Long-Range Zero-Shot Generative Deep Network Quantization	Nov 13, 2022	Knowledge DistillationQuantization	—Unverified

Show:10 25 50

← PrevPage 57 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified