SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, so much of what a large model has learned can often be transferred to a smaller, cheaper model with little loss in accuracy.
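
As a concrete reference point, the classic formulation (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output distribution alongside the ground-truth labels. Below is a minimal PyTorch sketch of that loss; the temperature T and mixing weight alpha are illustrative defaults, not values taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target knowledge distillation loss (Hinton et al., 2015).

    Mixes a KL-divergence term between temperature-softened teacher and
    student distributions with the usual cross-entropy on hard labels.
    T and alpha are illustrative hyperparameters.
    """
    # Soften both output distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)

    # The KL term is scaled by T^2 to keep gradient magnitudes
    # comparable across temperatures, as in the original paper.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)

    # Standard supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    return alpha * kd + (1.0 - alpha) * ce

if __name__ == "__main__":
    # Toy example: batch of 8, 100 classes. In practice the teacher's
    # logits come from a frozen pretrained model (detached from the graph).
    student_logits = torch.randn(8, 100, requires_grad=True)
    teacher_logits = torch.randn(8, 100)
    labels = torch.randint(0, 100, (8,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
```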

Papers

Showing 2201–2250 of 4240 papers

| Title | Status | Hype |
| --- | --- | --- |
| Knowledge Distillation ≈ Label Smoothing: Fact or Fallacy? | | 0 |
| FractalAD: A simple industrial anomaly detection method using fractal anomaly generation and backbone knowledge distillation | Code | 0 |
| Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation | Code | 1 |
| On student-teacher deviations in distillation: does it pay to disobey? | | 0 |
| Few-shot Face Image Translation via GAN Prior Distillation | | 0 |
| Supervision Complexity and its Role in Knowledge Distillation | | 0 |
| MVKT-ECG: Efficient Single-lead ECG Classification on Multi-Label Arrhythmia by Multi-View Knowledge Transferring | | 0 |
| Improved knowledge distillation by utilizing backward pass knowledge in neural networks | | 0 |
| EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval | | 0 |
| Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU? | | 0 |
| Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text | | 0 |
| OvarNet: Towards Open-vocabulary Object Attribute Recognition | Code | 1 |
| A Simple Recipe for Competitive Low-compute Self-supervised Vision Models | | 0 |
| Unifying Synergies between Self-supervised Learning and Dynamic Computation | Code | 0 |
| The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation | | 0 |
| ProKD: An Unsupervised Prototypical Knowledge Distillation Network for Zero-Resource Cross-Lingual Named Entity Recognition | | 0 |
| RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation | | 0 |
| Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning | | 0 |
| Knowledge Distillation in Federated Edge Learning: A Survey | | 0 |
| A Cohesive Distillation Architecture for Neural Language Models | | 0 |
| Effective Decision Boundary Learning for Class Incremental Learning | | 0 |
| TinyHD: Efficient Video Saliency Prediction with Heterogeneous Decoders using Hierarchical Maps Distillation | Code | 1 |
| Synthetic data generation method for data-free knowledge distillation in regression neural networks | Code | 0 |
| Online Hyperparameter Optimization for Class-Incremental Learning | Code | 1 |
| ERNIE 3.0 Tiny: Frustratingly Simple Method to Improve Task-Agnostic Distillation Generalization | Code | 0 |
| Designing an Improved Deep Learning-based Model for COVID-19 Recognition in Chest X-ray Images: A Knowledge Distillation Approach | | 0 |
| Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation | Code | 1 |
| RELIANT: Fair Knowledge Distillation for Graph Neural Networks | Code | 0 |
| Knowledge-guided Causal Intervention for Weakly-supervised Object Localization | Code | 0 |
| Label-Guided Knowledge Distillation for Continual Semantic Segmentation on 2D Images and 3D Point Clouds | Code | 1 |
| Multi-Task Learning with Knowledge Distillation for Dense Prediction | | 0 |
| Automated Knowledge Distillation via Monte Carlo Tree Search | Code | 0 |
| TripLe: Revisiting Pretrained Model Reuse and Progressive Learning for Efficient Vision Transformer Scaling and Searching | | 0 |
| Continual Segment: Towards a Single, Unified and Non-forgetting Continual Segmentation Model of 143 Whole-body Organs in CT Scans | | 0 |
| Knowledge-Spreader: Learning Semi-Supervised Facial Action Dynamics by Consistifying Knowledge Granularity | | 0 |
| UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors | | 0 |
| Alleviating Catastrophic Forgetting of Incremental Object Detection via Within-Class and Between-Class Knowledge Distillation | | 0 |
| Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly Detection | Code | 1 |
| MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices | Code | 2 |
| Tiny Updater: Towards Efficient Neural Network-Driven Software Updating | Code | 0 |
| Data-Free Class-Incremental Hand Gesture Recognition | Code | 1 |
| Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection | Code | 1 |
| Masked Autoencoders Are Stronger Knowledge Distillers | | 0 |
| Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval | Code | 1 |
| ICD-Face: Intra-class Compactness Distillation for Face Recognition | | 0 |
| Beyond the Limitation of Monocular 3D Detector via Knowledge Distillation | Code | 0 |
| Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1 |
| ScaleKD: Distilling Scale-Aware Knowledge in Small Object Detector | | 0 |
| Probabilistic Knowledge Distillation of Face Ensembles | | 0 |
| Multi-Level Logit Distillation | Code | 1 |
Page 45 of 85

Benchmark Results

In the tables below, T denotes the teacher model and S the student; the Verified column is empty because none of these claims has been verified yet.

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |