
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized; a compact student trained to mimic the large teacher's outputs can therefore recover much of the teacher's accuracy at a fraction of the inference cost.
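As a concrete illustration, the classic formulation (Hinton et al., 2015) trains the student on a weighted sum of the usual cross-entropy against ground-truth labels and a KL divergence between temperature-softened teacher and student logits. The sketch below is a minimal PyTorch version of that loss; the function name, the temperature T = 4.0, and the weight alpha = 0.5 are illustrative assumptions, not values taken from this page or from any of the papers listed below.

```python
# Minimal knowledge-distillation loss sketch (PyTorch), assuming the
# classic Hinton et al. (2015) formulation. Temperature and alpha are
# illustrative choices, not values from the benchmarks on this page.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Weighted sum of a hard-label term and a soft-label term."""
    # Hard-label term: ordinary cross-entropy against ground truth.
    hard = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-softened
    # teacher and student distributions. The T^2 factor keeps gradient
    # magnitudes comparable as the temperature changes.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1.0 - alpha) * soft

# Usage: the teacher is frozen and only the student receives gradients.
# teacher.eval()
# with torch.no_grad():
#     t_logits = teacher(x)
# loss = distillation_loss(student(x), t_logits, y)
```

Many of the papers below replace or augment this logit-matching term (with feature, graph, or attention matching, for example), but the teacher-student structure is the same.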

Papers

Showing 4001–4050 of 4240 papers

Title | Status | Hype
SalNAS: Efficient Saliency-prediction Neural Architecture Search with self-knowledge distillation | Code | 0
Deep geometric knowledge distillation with graphs | Code | 0
Class incremental learning with probability dampening and cascaded gated classifier | Code | 0
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models | Code | 0
Attentive Task Interaction Network for Multi-Task Learning | Code | 0
CaPriDe Learning: Confidential and Private Decentralized Learning Based on Encryption-Friendly Distillation Loss | Code | 0
SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP | Code | 0
Deep-Disaster: Unsupervised Disaster Detection and Localization Using Visual Data | Code | 0
FedSDAF: Leveraging Source Domain Awareness for Enhanced Federated Domain Generalization | Code | 0
Enhancing OOD Detection Using Latent Diffusion | Code | 0
3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection | Code | 0
Object Attribute Matters in Visual Question Answering | Code | 0
Advancing Compressed Video Action Recognition through Progressive Knowledge Distillation | Code | 0
Towards Enabling Meta-Learning from Target Models | Code | 0
Fed-RAC: Resource-Aware Clustering for Tackling Heterogeneity of Participants in Federated Learning | Code | 0
FedKD-hybrid: Federated Hybrid Knowledge Distillation for Lithography Hotspot Detection | Code | 0
Deep Clustering with Diffused Sampling and Hardness-aware Self-distillation | Code | 0
CAPEEN: Image Captioning with Early Exits and Knowledge Distillation | Code | 0
ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection | Code | 0
Scaffolding a Student to Instill Knowledge | Code | 0
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding | Code | 0
Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation | Code | 0
VarGFaceNet: An Efficient Variable Group Convolutional Neural Network for Lightweight Face Recognition | Code | 0
Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders | Code | 0
Swapped Logit Distillation via Bi-level Teacher Alignment | Code | 0
Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection | Code | 0
Deep Class Incremental Learning from Decentralized Data | Code | 0
FedICT: Federated Multi-task Distillation for Multi-access Edge Computing | Code | 0
On-Device Language Models: A Comprehensive Review | Code | 0
FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation | Code | 0
Deep Classifier Mimicry without Data Access | Code | 0
Attention to detail: inter-resolution knowledge distillation | Code | 0
Can Self-Supervised Representation Learning Methods Withstand Distribution Shifts and Corruptions? | Code | 0
DED: Diagnostic Evidence Distillation for acne severity grading on face images | Code | 0
FedHe: Heterogeneous Models and Communication-Efficient Federated Learning | Code | 0
Adaptive Decoupled Pose Knowledge Distillation | Code | 0
Decoupled Knowledge with Ensemble Learning for Online Distillation | Code | 0
On enhancing the robustness of Vision Transformers: Defensive Diffusion | Code | 0
One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation | Code | 0
Camera-Incremental Object Re-Identification with Identity Knowledge Evolution | Code | 0
Federated Learning for Time-Series Healthcare Sensing with Incomplete Modalities | Code | 0
One-shot Federated Learning without Server-side Training | Code | 0
Federated Learning with a Single Shared Image | Code | 0
Federated Incremental Named Entity Recognition | Code | 0
FedDW: Distilling Weights through Consistency Optimization in Heterogeneous Federated Learning | Code | 0
CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization | Code | 0
Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval | Code | 0
Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs | Code | 0
One-Teacher and Multiple-Student Knowledge Distillation on Sentiment Classification | Code | 0
Decoding visual brain representations from electroencephalography through Knowledge Distillation and latent diffusion models | Code | 0

Benchmark Results

In the tables below, T: denotes the teacher model and S: the student model; the Verified column is empty for entries whose status is Unverified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified