
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, that capacity is often not fully utilized, so a compact "student" model can frequently recover much of a large "teacher" model's accuracy. In the classic formulation (Hinton et al., 2015), the student is trained to reproduce the teacher's temperature-softened output probabilities in addition to fitting the ground-truth labels.
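To make the mechanism concrete, here is a minimal sketch of that classic logit-matching recipe in PyTorch. It is illustrative only, not the method of any specific paper listed below: teacher, student, loader, and optimizer are hypothetical placeholders, and the temperature and mixing weight are common but arbitrary defaults.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=4.0, alpha=0.5):
        # Soft-target term: KL divergence between the temperature-softened
        # teacher and student distributions. Scaling by T**2 keeps the
        # gradient magnitude comparable across temperatures.
        soft_student = F.log_softmax(student_logits / temperature, dim=-1)
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean")
        kd = kd * temperature ** 2
        # Hard-label term: ordinary cross-entropy on the ground truth.
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kd + (1.0 - alpha) * ce

    # Hypothetical training loop; teacher, student, loader, and optimizer
    # are assumed to be defined elsewhere.
    teacher.eval()
    for images, labels in loader:
        with torch.no_grad():  # the teacher is frozen
            teacher_logits = teacher(images)
        loss = distillation_loss(student(images), teacher_logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The papers on this page largely vary this template: what is matched (logits, intermediate features, relations), between which models (a single teacher, an ensemble, or the student itself), and on which task.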

Papers

Showing 801–850 of 4240 papers

Title | Status | Hype
CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination | — | 0
MedMAP: Promoting Incomplete Multi-modal Brain Tumor Segmentation with Alignment | — | 0
V2X-VLM: End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models | — | 0
Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition | Code | 0
MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU | Code | 0
Towards Real-time Video Compressive Sensing on Mobile Devices | Code | 0
One Step Diffusion-based Super-Resolution with Time-Aware Distillation | Code | 1
FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher | — | 0
Knowledge Distillation with Refined Logits | Code | 1
Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach | — | 0
Optimizing Vision Transformers with Data-Free Knowledge Transfer | — | 0
Low-Dimensional Federated Knowledge Graph Embedding via Knowledge Distillation | — | 0
LaDiMo: Layer-wise Distillation Inspired MoEfier | — | 0
ComKD-CLIP: Comprehensive Knowledge Distillation for Contrastive Language-Image Pre-traning Model | — | 0
Distillation Learning Guided by Image Reconstruction for One-Shot Medical Image Segmentation | Code | 0
Real-time Event Recognition of Long-distance Distributed Vibration Sensing with Knowledge Distillation and Hardware Acceleration | Code | 1
Dual-Modeling Decouple Distillation for Unsupervised Anomaly Detection | — | 0
EEGMobile: Enhancing Speed and Accuracy in EEG-Based Gaze Prediction with Advanced Mobile Architectures | — | 0
Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization | Code | 0
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations | — | 0
Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression | Code | 0
VizECGNet: Visual ECG Image Network for Cardiovascular Diseases Classification with Multi-Modal Training and Knowledge Distillation | — | 0
Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution | Code | 0
An approach to optimize inference of the DIART speaker diarization pipeline | — | 0
Unsupervised Domain Adaption Harnessing Vision-Language Pre-training | Code | 1
Do You Remember... the Future? Weak-to-Strong generalization in 3D Object Detection | Code | 0
Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning | Code | 0
DistillGrasp: Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects | — | 0
Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation | — | 0
StyleRF-VolVis: Style Transfer of Neural Radiance Fields for Expressive Volume Visualization | — | 0
Gemma 2: Improving Open Language Models at a Practical Size | — | 0
Lifelong Person Search | — | 0
Dynamic Object Queries for Transformer-based Incremental Object Detection | — | 0
VIPeR: Visual Incremental Place Recognition with Adaptive Mining and Continual Learning | — | 0
Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins | — | 0
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training | Code | 1
SalNAS: Efficient Saliency-prediction Neural Architecture Search with self-knowledge distillation | Code | 0
ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality | — | 0
Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation | Code | 0
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Code | 0
LLAVADI: What Matters For Multimodal Large Language Models Distillation | — | 0
Logic Distillation: Learning from Code Function by Function for Planning and Decision-making | — | 0
Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network | — | 0
Modality-Balanced Learning for Multimedia Recommendation | Code | 1
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers | Code | 0
Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation | Code | 2
FedUD: Exploiting Unaligned Data for Cross-Platform Federated Click-Through Rate Prediction | — | 0
Leveraging Foundation Models via Knowledge Distillation in Multi-Object Tracking: Distilling DINOv2 Features to FairMOT | Code | 0
How to Train the Teacher Model for Effective Knowledge Distillation | Code | 0
NC-NCD: Novel Class Discovery for Node Classification | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified