SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
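The description above does not name a training objective. As a minimal sketch, assuming the common soft-target formulation (a temperature-softened KL term between teacher and student outputs, mixed with ordinary cross-entropy on the true label), the student loss for a single example might look like the following. The function names, the `temperature` and `alpha` parameters, and their default values are illustrative assumptions, not taken from this page or from any listed paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical helper: weighted sum of a soft-target and a hard-label loss."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Scaling by T^2 keeps the soft-term gradients comparable across temperatures.
    soft_loss = (temperature ** 2) * kl
    # Standard cross-entropy against the ground-truth label at temperature 1.
    hard_loss = -math.log(softmax(student_logits)[label])
    return alpha * soft_loss + (1 - alpha) * hard_loss


# Example: a 3-class problem where the student roughly agrees with the teacher.
loss = distillation_loss([1.0, 2.0, 2.5], [0.5, 2.0, 4.0], label=2)
```

In this sketch `alpha` sets how much the student imitates the teacher versus fitting the labels; many of the papers listed below replace or extend the soft-target term rather than use it unchanged.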

Papers

Showing 2551-2600 of 4240 papers

| Title | Status | Hype |
| --- | --- | --- |
| Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation | | 0 |
| Web Content Filtering through knowledge distillation of Large Language Models | | 0 |
| NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge | | 0 |
| Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation | | 0 |
| Distilled Mid-Fusion Transformer Networks for Multi-Modal Human Activity Recognition | | 0 |
| Smaller3d: Smaller Models for 3D Semantic Segmentation Using Minkowski Engine and Knowledge Distillation Methods | Code | 0 |
| A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training | Code | 0 |
| Structure Aware Incremental Learning with Personalized Imitation Weights for Recommender Systems | | 0 |
| Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models | | 0 |
| Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data | Code | 0 |
| Scaffolding a Student to Instill Knowledge | Code | 0 |
| Refined Response Distillation for Class-Incremental Player Detection | Code | 0 |
| Ensemble Modeling with Contrastive Knowledge Distillation for Sequential Recommendation | Code | 0 |
| Multi-to-Single Knowledge Distillation for Point Cloud Semantic Segmentation | Code | 0 |
| CORSD: Class-Oriented Relational Self Distillation | | 0 |
| Learning Human-Human Interactions in Images from Weak Textual Supervision | | 0 |
| Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs | | 0 |
| A Forward and Backward Compatible Framework for Few-shot Class-incremental Pill Recognition | Code | 0 |
| Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving | | 0 |
| Improving Knowledge Distillation via Transferring Learning Ability | Code | 0 |
| Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation | | 0 |
| Word Sense Induction with Knowledge Distillation from BERT | | 0 |
| Biologically inspired structure learning with reverse knowledge distillation for spiking neural networks | | 0 |
| Knowledge Distillation Under Ideal Joint Classifier Assumption | | 0 |
| An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models | | 0 |
| Deep Collective Knowledge Distillation | | 0 |
| Learning to "Segment Anything" in Thermal Infrared Images through Knowledge Distillation with a Large Scale Dataset SATIR | Code | 0 |
| LaSNN: Layer-wise ANN-to-SNN Distillation for Effective and Efficient Training in Deep Spiking Neural Networks | | 0 |
| Always Strengthen Your Strengths: A Drift-Aware Incremental Learning Framework for CTR Prediction | | 0 |
| Teacher Network Calibration Improves Cross-Quality Knowledge Distillation | Code | 0 |
| Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games | Code | 0 |
| Class-Incremental Learning of Plant and Disease Detection: Growing Branches with Knowledge Distillation | | 0 |
| Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation | | 0 |
| SFT-KD-Recon: Learning a Student-friendly Teacher for Knowledge Distillation in Magnetic Resonance Image Reconstruction | Code | 0 |
| Grouped Knowledge Distillation for Deep Face Recognition | | 0 |
| A Survey on Recent Teacher-student Learning Studies | | 0 |
| HyperINR: A Fast and Predictive Hypernetwork for Implicit Neural Representations via Knowledge Distillation | | 0 |
| Homogenizing Non-IID datasets via In-Distribution Knowledge Distillation for Decentralized Learning | | 0 |
| A Comprehensive Survey on Knowledge Distillation of Diffusion Models | | 0 |
| Model-Agnostic Decentralized Collaborative Learning for On-Device POI Recommendation | | 0 |
| Masked Student Dataset of Expressions | Code | 0 |
| Continual Detection Transformer for Incremental Object Detection | | 0 |
| Self-Distillation for Gaussian Process Regression and Classification | Code | 0 |
| Towards Efficient Task-Driven Model Reprogramming with Foundation Models | | 0 |
| MadEye: Boosting Live Video Analytics Accuracy with Adaptive Camera Configurations | | 0 |
| Cross-Class Feature Augmentation for Class Incremental Learning | | 0 |
| Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation | Code | 0 |
| Knowledge-Distilled Graph Neural Networks for Personalized Epileptic Seizure Detection | | 0 |
| A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation | | 0 |
| Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders | | 0 |
Page 52 of 85

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | | Unverified |
| 2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | | Unverified |
| 3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | | Unverified |
| 4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | | Unverified |
| 5 | KD++(T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | | Unverified |
| 6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | | Unverified |
| 7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | | Unverified |
| 8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | | Unverified |
| 9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2(T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++(T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++(T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | | Unverified |