SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
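
In its most common form, response-based distillation (Hinton et al., 2015), the student is trained to match the teacher's temperature-softened class probabilities alongside the ground-truth labels. The sketch below shows this standard soft-target loss in PyTorch; the function name and the default temperature and weighting are illustrative choices, not values taken from any particular paper listed here.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target knowledge distillation loss (illustrative defaults for T and alpha)."""
    # KL divergence between the temperature-softened student and teacher distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients stay comparable across temperatures
    # Ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In practice the teacher's logits are computed under `torch.no_grad()` and only the student's parameters are updated with this loss.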

Papers

Showing 2876–2900 of 4240 papers

Title | Status | Hype
Student Becomes Decathlon Master in Retinal Vessel Segmentation via Dual-teacher Multi-target Domain Adaptation | Code | 0
Enhance Language Identification using Dual-mode Model with Knowledge Distillation | Code | 0
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning | Code | 1
Consistent Representation Learning for Continual Relation Extraction | Code | 1
Better Supervisory Signals by Observing Learning Paths | Code | 0
MIAShield: Defending Membership Inference Attacks via Preemptive Exclusion of Members | | 0
X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning | Code | 1
TRILLsson: Distilled Universal Paralinguistic Speech Representations | | 0
Dual Embodied-Symbolic Concept Representations for Deep Learning | | 0
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology | Code | 1
Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation | | 0
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation | Code | 1
Content-Variant Reference Image Quality Assessment via Knowledge Distillation | Code | 1
Joint Answering and Explanation for Visual Commonsense Reasoning | Code | 0
Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation | | 0
Learn From the Past: Experience Ensemble Knowledge Distillation | | 0
Efficient Video Segmentation Models with Per-frame Inference | | 0
Are All Linear Regions Created Equal? | Code | 0
Multi-Teacher Knowledge Distillation for Incremental Implicitly-Refined Classification | | 0
Distilled Neural Networks for Efficient Learning to Rank | Code | 0
A Novel Architecture Slimming Method for Network Pruning and Knowledge Distillation | | 0
Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning | | 0
CaMEL: Mean Teacher Learning for Image Captioning | Code | 1
Cross-Task Knowledge Distillation in Multi-Task Recommendation | | 0
General Cyclical Training of Neural Networks | Code | 1
Page 116 of 170

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified