SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Distillation trains a small "student" model to reproduce the behavior of a large "teacher" model, typically by matching the teacher's softened output distribution, so the student retains much of the teacher's accuracy at a fraction of the inference cost.
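As an illustration, the classic soft-target formulation blends a temperature-scaled KL term against the teacher's outputs with the usual hard-label cross-entropy. A minimal NumPy sketch of this idea follows; the function names and the `T`/`alpha` defaults are illustrative choices, not taken from this page:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target KD loss: alpha * KL(teacher || student) + (1 - alpha) * CE."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL divergence between softened distributions, rescaled by T^2 so its
    # gradient magnitude stays comparable to the hard-label term.
    soft = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * T * T
    # Standard cross-entropy against the ground-truth labels.
    labels = np.asarray(labels)
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    return alpha * soft + (1 - alpha) * hard
```

In practice the same loss is usually written against framework primitives (e.g. a KL-divergence loss on log-probabilities), but the arithmetic is identical: when the student's logits match the teacher's, the soft term vanishes and only the hard-label term remains.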

Papers

Showing 2851–2875 of 4240 papers

| Title | Status | Hype |
|---|---|---|
| SC2 Benchmark: Supervised Compression for Split Computing | | 0 |
| Graph Flow: Cross-layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation | Code | 1 |
| Unified Visual Transformer Compression | Code | 1 |
| SATS: Self-Attention Transfer for Continual Semantic Segmentation | Code | 1 |
| On the benefits of knowledge distillation for adversarial robustness | | 0 |
| DS3-Net: Difficulty-perceived Common-to-T1ce Semi-Supervised Multimodal MRI Synthesis Network | | 0 |
| CEKD:Cross Ensemble Knowledge Distillation for Augmented Fine-grained Data | | 0 |
| CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification | Code | 3 |
| Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation | | 0 |
| Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation | | 0 |
| Medical Image Segmentation on MRI Images with Missing Modalities: A Review | | 0 |
| Deep Class Incremental Learning from Decentralized Data | Code | 0 |
| Improving Neural ODEs via Knowledge Distillation | | 0 |
| Look Backward and Forward: Self-Knowledge Distillation with Bidirectional Decoder for Neural Machine Translation | | 0 |
| Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA | Code | 0 |
| Prediction-Guided Distillation for Dense Object Detection | Code | 1 |
| Membership Privacy Protection for Image Translation Models via Adversarial Knowledge Distillation | | 0 |
| Representation Compensation Networks for Continual Semantic Segmentation | Code | 1 |
| Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability | Code | 1 |
| Efficient Sub-structured Knowledge Distillation | Code | 0 |
| How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting | | 0 |
| PyNET-QxQ: An Efficient PyNET Variant for QxQ Bayer Pattern Demosaicing in CMOS Image Sensors | Code | 0 |
| On Generalizing Beyond Domains in Cross-Domain Continual Learning | | 0 |
| Multi-trial Neural Architecture Search with Lottery Tickets | | 0 |
| Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation | Code | 1 |
Page 115 of 170

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | | Unverified |
| 2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | | Unverified |
| 3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | | Unverified |
| 4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | | Unverified |
| 5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | | Unverified |
| 6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | | Unverified |
| 7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | | Unverified |
| 8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | | Unverified |
| 9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | | Unverified |