
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. A compact student network can therefore often be trained to match the teacher's softened output distribution (and sometimes its intermediate features), retaining much of the teacher's accuracy at a fraction of the inference cost.
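For a concrete reference point, here is a minimal sketch of the classic soft-target formulation (Hinton et al., 2015) in PyTorch. The function name, the temperature, and the blending weight `alpha` are illustrative choices, not values taken from any paper listed below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Soft-target distillation loss: a KL term between the
    temperature-softened teacher and student distributions, blended
    with the usual cross-entropy on the hard labels."""
    # Soften both distributions with the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence, scaled by T^2 so gradient magnitudes stay
    # comparable across temperature choices.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised loss on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```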

Papers

Showing 2551–2575 of 4240 papers

Bidirectional Distillation: A Mixed-Play Framework for Multi-Agent Generalizable Behaviors
Ground Reaction Force Estimation via Time-aware Knowledge Distillation
3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation
3D Denoisers are Good 2D Teachers: Molecular Pretraining via Denoising and Cross-Modal Distillation
3D Face Alignment Through Fusion of Head Pose Information and Features
3D Point Cloud Pre-training with Knowledge Distillation from 2D Images
A baseline revisited: Pushing the limits of multi-segment models for context-aware translation
A Bayesian Optimization Framework for Neural Network Compression
ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression
ABKD: Graph Neural Network Compression with Attention-Based Knowledge Distillation
ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation
Accelerating Diffusion Models with One-to-Many Knowledge Distillation
Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling
Accelerating Molecular Graph Neural Networks via Knowledge Distillation
Accelerating Transformer Decoding via a Hybrid of Self-attention and Recurrent Neural Network
Accurate and Structured Pruning for Efficient Automatic Speech Recognition
Accurate Knowledge Distillation with n-best Reranking
A Classifier-Free Incremental Learning Framework for Scalable Medical Image Segmentation
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
A Closer Look at Knowledge Distillation with Features, Logits, and Gradients
A Closer Look at Rehearsal-Free Continual Learning
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement
A Cohesive Distillation Architecture for Neural Language Models
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models
Supervised domain adaptation for building extraction from off-nadir aerial images
Page 103 of 170

Benchmark Results

In the tables below, "T:" denotes the teacher model and "S:" the student; the Verified column is empty for entries whose claimed numbers have not yet been reproduced.

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified |
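
An "Unverified" status means the claimed number has not yet been independently reproduced. For the classification entries, verification would typically amount to re-running the released student checkpoint on the benchmark's validation split and comparing Top-1 accuracy. A minimal sketch, assuming a standard PyTorch model and a labelled dataloader (both hypothetical here):

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cuda"):
    """Top-1 accuracy of a (distilled) student over a labelled
    validation loader, for comparison against a claimed number."""
    model.eval().to(device)
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=-1)  # predicted class per image
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return 100.0 * correct / total  # percentage, e.g. 82.3
```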