SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models, such as very deep neural networks or ensembles of many models, have higher knowledge capacity than small models, that capacity may not be fully utilized; a well-trained student can often recover much of the teacher's accuracy at a fraction of the inference cost.
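
As a concrete illustration, here is a minimal PyTorch sketch of the classic response-based formulation (soft-target distillation in the style of Hinton et al., 2015): the student minimizes a blend of ordinary cross-entropy on the hard labels and a KL-divergence term that pulls its temperature-softened outputs toward the teacher's. The temperature T, the mixing weight alpha, and the function name are illustrative assumptions, not details taken from any specific paper listed below.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soften both output distributions with temperature T; a higher T
    # exposes more of the teacher's information about non-target classes.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between the softened distributions; the T*T factor
    # keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Ordinary cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

In teacher-student pairs like those in the benchmark tables below (e.g., T: Swin-L, S: Swin-T), teacher_logits would come from the frozen large model and student_logits from the small model being trained.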

Papers

Showing 4201–4240 of 4240 papers

Title | Status | Hype
Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation | – | 0
Recurrent knowledge distillation | – | 0
Knowledge Distillation with Adversarial Samples Supporting Decision Boundary | Code | 0
Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students | – | 0
Born Again Neural Networks | Code | 0
Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems | Code | 0
Neural Compatibility Modeling with Attentive Knowledge Distillation | – | 0
Few-shot learning of neural networks from scratch by pseudo example optimization | – | 0
Model compression for faster structural separation of macromolecules captured by Cellular Electron Cryo-Tomography | – | 0
Faster gaze prediction with dense networks and Fisher pruning | Code | 0
Deep Net Triage: Analyzing the Importance of Network Layers via Structural Compression | – | 0
Generation and Consolidation of Recollections for Efficient Deep Lifelong Learning | – | 0
Learning Deep and Compact Models for Gesture Recognition | Code | 0
NestedNet: Learning Nested Sparse Structures in Deep Neural Networks | – | 0
StrassenNets: Deep Learning with a Multiplication Budget | Code | 0
Learning Efficient Object Detection Models with Knowledge Distillation | – | 0
Knowledge Concentration: Learning 100K Object Classifiers in a Single CNN | – | 0
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images | Code | 0
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy | – | 0
Non-Autoregressive Neural Machine Translation | Code | 0
A Survey of Model Compression and Acceleration for Deep Neural Networks | – | 0
Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification | – | 0
Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks | – | 0
Knowledge Distillation for Bilingual Dictionary Induction | – | 0
A Joint Sequential and Relational Model for Frame-Semantic Parsing | – | 0
Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation | – | 0
WebChild 2.0: Fine-Grained Commonsense Knowledge Distillation | – | 0
A Gift From Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning | – | 0
TIP: Typifying the Interpretability of Procedures | – | 0
Knowledge distillation using unlabeled mismatched images | – | 0
Collaborative Deep Reinforcement Learning | Code | 0
Knowledge Adaptation: Teaching to Adapt | – | 0
Ensemble Distillation for Neural Machine Translation | – | 0
Neural Machine Translation from Simplified Translations | – | 0
In Teacher We Trust: Learning Compressed Models for Pedestrian Detection | – | 0
A scalable convolutional neural network for task-specified scenarios via knowledge distillation | – | 0
Knowledge Distillation for Small-footprint Highway Networks | – | 0
Adapting Models to Signal Degradation using Distillation | – | 0
Distilling Knowledge from Deep Networks with Applications to Healthcare Domain | – | 0
Distilling Model Knowledge | Code | 0

Benchmark Results

In the Model column, T denotes the teacher and S the student. An empty Verified cell means the claimed result has not yet been independently reproduced (Status: Unverified).

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified
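
For reference, the Top-1 accuracy metric reported in the first two tables is the percentage of evaluation samples whose highest-scoring predicted class matches the ground-truth label. A minimal sketch, assuming a generic PyTorch classifier model and an evaluation loader (both placeholders, not tied to any entry above):

import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cpu"):
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        logits = model(images.to(device))
        # A sample counts as correct when the argmax class equals its label.
        correct += (logits.argmax(dim=-1) == labels.to(device)).sum().item()
        total += labels.numel()
    return 100.0 * correct / total  # reported as a percentage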