SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models, such as very deep neural networks or ensembles of many models, have higher knowledge capacity than small models, this capacity may not be fully utilized; a student trained to mimic the teacher's outputs can often recover much of the teacher's accuracy at a fraction of the inference cost.
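A minimal sketch of the classic soft-target objective (Hinton et al.) may make this concrete. This is plain illustrative Python, not any particular library's API; the function names and the example logits are made up for demonstration:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # "dark knowledge" about how similar the classes are to each other.
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # Cross-entropy between the softened teacher and student distributions.
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q)) * temperature ** 2

# Example: the student ranks the classes like the teacher, but less confidently.
teacher = [8.0, 2.0, -1.0]
student = [3.0, 1.0, 0.0]
loss = distillation_loss(student, teacher)
```

In practice this soft-target term is usually combined with an ordinary cross-entropy term on the ground-truth labels, weighted by a mixing coefficient.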

Papers

Showing papers 3151–3200 of 4240

Papers with released code are marked (Code).

- Using Explainable Boosting Machine to Compare Idiographic and Nomothetic Approaches for Ecological Momentary Assessment Data
- Co-Teaching for Unsupervised Domain Adaptation and Expansion (Code)
- CDKT-FL: Cross-Device Knowledge Transfer using Proxy Dataset in Federated Learning
- DST: Dynamic Substitute Training for Data-free Black-box Attack
- A Dual-Contrastive Framework for Low-Resource Cross-Lingual Named Entity Recognition (Code)
- CL-XABSA: Contrastive Learning for Cross-lingual Aspect-based Sentiment Analysis (Code)
- Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation
- Rethinking Position Bias Modeling with Knowledge Distillation for CTR Prediction
- Preventing Distillation-based Attacks on Neural Network IP
- Knowledge distillation with error-correcting transfer learning for wind power prediction
- Unified and Effective Ensemble Knowledge Distillation
- Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification
- A Closer Look at Rehearsal-Free Continual Learning
- Conditional Autoregressors are Interpretable Classifiers
- Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models
- RAVIR: A Dataset and Methodology for the Semantic Segmentation and Quantitative Analysis of Retinal Arteries and Veins in Infrared Reflectance Imaging
- Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches
- Knowledge Distillation: Bad Models Can Be Good Role Models
- A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals
- Class-Incremental Learning for Action Recognition in Videos
- Multitask Emotion Recognition Model with Knowledge Distillation and Task Discriminator
- Scale-Equivalent Distillation for Semi-Supervised Object Detection
- Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
- Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal
- On Neural Network Equivalence Checking using SMT Solvers
- Channel Self-Supervision for Online Knowledge Distillation
- Emulating Quantum Dynamics with Neural Networks via Knowledge Distillation (Code)
- A Closer Look at Knowledge Distillation with Features, Logits, and Gradients
- Delta Distillation for Efficient Video Processing (Code)
- SC2 Benchmark: Supervised Compression for Split Computing
- Domain Adaptive Hand Keypoint and Pixel Localization in the Wild
- Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation
- On the benefits of knowledge distillation for adversarial robustness
- DS3-Net: Difficulty-perceived Common-to-T1ce Semi-Supervised Multimodal MRI Synthesis Network
- CEKD: Cross Ensemble Knowledge Distillation for Augmented Fine-grained Data
- Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
- Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation
- Medical Image Segmentation on MRI Images with Missing Modalities: A Review
- Deep Class Incremental Learning from Decentralized Data (Code)
- Look Backward and Forward: Self-Knowledge Distillation with Bidirectional Decoder for Neural Machine Translation
- Improving Neural ODEs via Knowledge Distillation
- Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA (Code)
- Membership Privacy Protection for Image Translation Models via Adversarial Knowledge Distillation
- How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting
- Efficient Sub-structured Knowledge Distillation (Code)
- PyNET-QxQ: An Efficient PyNET Variant for QxQ Bayer Pattern Demosaicing in CMOS Image Sensors (Code)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning
- Multi-trial Neural Architecture Search with Lottery Tickets
- Enhance Language Identification using Dual-mode Model with Knowledge Distillation (Code)
- Student Becomes Decathlon Master in Retinal Vessel Segmentation via Dual-teacher Multi-target Domain Adaptation (Code)
Page 64 of 85

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified |
| 5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified |
| 6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | — | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | — | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | — | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | — | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | — | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | — | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | — | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | — | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | — | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified |