SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, this capacity might not be fully utilized, so a compact student model trained to mimic a large teacher can often recover most of the teacher's accuracy at a fraction of the inference cost.
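For orientation, the sketch below shows the classic logit-matching formulation of distillation (a KL term on temperature-softened outputs blended with the usual hard-label loss) in PyTorch. It is a minimal illustration, assuming hypothetical `teacher` and `student` classifiers that return logits; the `temperature` and `alpha` values are conventional defaults, not taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend hard-label cross-entropy with a KL term that pulls the
    student toward the teacher's temperature-softened outputs."""
    # Soften both output distributions with the same temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence on the softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Typical use in a training step (teacher frozen, student being trained):
#   with torch.no_grad():
#       t_logits = teacher(x)
#   loss = distillation_loss(student(x), t_logits, y)
```

Many of the papers below replace or augment this logit-matching term with feature, relation, or contrastive objectives, but the teacher-student setup is the common core.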

Papers

Showing 2801–2850 of 4240 papers

| Title | Status | Hype |
|---|---|---|
| Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation | Code | 1 |
| A Dual-Contrastive Framework for Low-Resource Cross-Lingual Named Entity Recognition | Code | 0 |
| Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation | | 0 |
| Feature Structure Distillation with Centered Kernel Alignment in BERT Transferring | Code | 1 |
| End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation | Code | 1 |
| Knowledge distillation with error-correcting transfer learning for wind power prediction | | 0 |
| Unified and Effective Ensemble Knowledge Distillation | | 0 |
| Rethinking Position Bias Modeling with Knowledge Distillation for CTR Prediction | | 0 |
| Preventing Distillation-based Attacks on Neural Network IP | | 0 |
| Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings | Code | 1 |
| Conditional Autoregressors are Interpretable Classifiers | | 0 |
| A Closer Look at Rehearsal-Free Continual Learning | | 0 |
| It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher | Code | 1 |
| Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification | | 0 |
| Rainbow Keywords: Efficient Incremental Learning for Online Spoken Keyword Spotting | Code | 1 |
| Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models | | 0 |
| Monitored Distillation for Positive Congruent Depth Completion | Code | 1 |
| Self-Distillation from the Last Mini-Batch for Consistency Regularization | Code | 1 |
| Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation | Code | 2 |
| Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection | Code | 1 |
| Knowledge Distillation: Bad Models Can Be Good Role Models | | 0 |
| RAVIR: A Dataset and Methodology for the Semantic Segmentation and Quantitative Analysis of Retinal Arteries and Veins in Infrared Reflectance Imaging | | 0 |
| Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches | | 0 |
| Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation | Code | 1 |
| Knowledge Distillation with the Reused Teacher Classifier | Code | 1 |
| Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1 |
| PCA-Based Knowledge Distillation Towards Lightweight and Content-Style Balanced Photorealistic Style Transfer Models | Code | 1 |
| A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals | | 0 |
| Class-Incremental Learning for Action Recognition in Videos | | 0 |
| Rich Feature Construction for the Optimization-Generalization Dilemma | Code | 1 |
| Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction | Code | 1 |
| R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning | Code | 1 |
| Multitask Emotion Recognition Model with Knowledge Distillation and Task Discriminator | | 0 |
| Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal | | 0 |
| Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis | | 0 |
| Scale-Equivalent Distillation for Semi-Supervised Object Detection | | 0 |
| On Neural Network Equivalence Checking using SMT Solvers | | 0 |
| Channel Self-Supervision for Online Knowledge Distillation | | 0 |
| SSD-KD: A Self-supervised Diverse Knowledge Distillation Method for Lightweight Skin Lesion Classification Using Dermoscopic Images | Code | 1 |
| DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization | Code | 1 |
| Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation | Code | 1 |
| Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation | Code | 1 |
| Emulating Quantum Dynamics with Neural Networks via Knowledge Distillation | Code | 0 |
| A Closer Look at Knowledge Distillation with Features, Logits, and Gradients | | 0 |
| Delta Distillation for Efficient Video Processing | Code | 0 |
| When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation | Code | 1 |
| Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning | Code | 1 |
| Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation | | 0 |
| Domain Adaptive Hand Keypoint and Pixel Localization in the Wild | | 0 |
| Decoupled Knowledge Distillation | Code | 2 |
Page 57 of 85

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified |