
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, this capacity may not be fully utilized. Distillation exploits this gap: a compact student model is trained to reproduce the behavior of a large teacher, typically by matching the teacher's softened output distribution in addition to the ground-truth labels, and can often recover much of the teacher's accuracy at a fraction of the inference cost.
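The canonical formulation is the softmax-temperature loss of Hinton et al. (2015), in which the student minimizes a weighted sum of the usual cross-entropy and a KL-divergence term toward the teacher's temperature-softened predictions. The PyTorch sketch below is a minimal, generic version of that idea, not the implementation of any particular paper listed on this page; the function name and the hyperparameters T (temperature) and alpha (mixing weight) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss: alpha * soft (KL) term + (1 - alpha) * hard (CE) term."""
    # Soft targets: both sets of logits are smoothed by the temperature T.
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    # The T**2 factor rescales gradients so the soft term keeps a
    # comparable magnitude as T grows (Hinton et al., 2015).
    soft_loss = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * T**2
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Toy usage with random logits for a batch of 8 examples and 100 classes.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)   # would come from a frozen teacher
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()                        # gradients flow only to the student
```

In a real training loop the teacher is frozen (eval mode, with its forward pass under torch.no_grad()) and only the student's parameters receive gradients; temperatures of roughly 2 to 5 are common choices in the literature.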

Papers

Showing 2501–2550 of 4240 papers

All 50 entries on this page have an empty Status field and a Hype score of 0.

Real-time Spatio-temporal Action Localization via Learning Motion Representation
ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation
Rebalancing Multi-Label Class-Incremental Learning
Recalling The Forgotten Class Memberships: Unlearned Models Can Be Noisy Labelers to Leak Privacy
Recent Advances in Direct Speech-to-text Translation
Recent Advances of Continual Learning in Computer Vision: An Overview
Membership Privacy for Machine Learning Models Through Knowledge Transfer
Reconstructing Perceived Images from Brain Activity by Visually-guided Cognitive Representation and Adversarial Learning
Rectified Decision Trees: Exploring the Landscape of Interpretable and Effective Machine Learning
Rectified Decision Trees: Towards Interpretability, Compression and Empirical Soundness
Rectifying the Data Bias in Knowledge Distillation
Recurrent knowledge distillation
Recurrent Stacking of Layers in Neural Networks: An Application to Neural Machine Translation
Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation
Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation
Reducing the Teacher-Student Gap via Adaptive Temperatures
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation
Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation
Region-aware Knowledge Distillation for Efficient Image-to-Image Translation
Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Reinforced Multi-Teacher Selection for Knowledge Distillation
Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition
Relation Modeling and Distillation for Learning with Noisy Labels
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Remembering Transformer for Continual Learning
Remining Hard Negatives for Generative Pseudo Labeled Domain Adaptation
Remote Sensing Image Classification with Decoupled Knowledge Distillation
Removing Rain Streaks via Task Transfer Learning
Representation Consolidation from Multiple Expert Teachers
Representation Disparity-aware Distillation for 3D Object Detection
Representation Transfer by Optimal Transport
Research on Multilingual News Clustering Based on Cross-Language Word Embeddings
Research on the Online Update Method for Retrieval-Augmented Generation (RAG) Model with Incremental Learning
Residual Knowledge Distillation
ResKD: Residual-Guided Knowledge Distillation
Resolution-Based Distillation for Efficient Histology Image Classification
Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework
REFT: Resource-Efficient Federated Training Framework for Heterogeneous and Resource-Constrained Environments
Respecting Transfer Gap in Knowledge Distillation
Response-based Distillation for Incremental Object Detection
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers
Rethinking Attention Mechanism in Time Series Classification
Rethinking Feature-Based Knowledge Distillation for Face Recognition
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off
Rethinking Knowledge Distillation via Cross-Entropy
Rethinking Knowledge in Distillation: An In-context Sample Retrieval Perspective
Rethinking Position Bias Modeling with Knowledge Distillation for CTR Prediction
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective

Benchmark Results

# | Model (T: teacher, S: student) | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model (T: teacher, S: student) | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model (T: teacher, S: student) | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model (T: teacher, S: student) | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified