
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized, and evaluating a large model is costly regardless of how much of that capacity it actually uses. Distillation therefore trains a compact "student" model to reproduce the behavior of the large "teacher", typically by matching its softened output distribution, so that much of the teacher's accuracy is retained at a fraction of the inference cost.
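
The classic recipe (Hinton et al., 2015) optimizes a weighted sum of the usual cross-entropy on ground-truth labels and a KL-divergence term between temperature-softened teacher and student outputs. Below is a minimal PyTorch-style sketch of that loss; the temperature `T`, weight `alpha`, and the `teacher`/`student` modules are illustrative placeholders, not settings taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend the soft-target (teacher-matching) term with ordinary cross-entropy."""
    # Soften both output distributions with temperature T. The KL term is scaled
    # by T^2 so its gradients stay comparable in magnitude to the hard-label term.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    soft_loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# A training step would forward the same batch through both models, keep the
# teacher frozen, and backpropagate only through the student, e.g.:
#   with torch.no_grad():
#       teacher_logits = teacher(images)
#   loss = distillation_loss(student(images), teacher_logits, labels)
```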

Papers

Showing 4201–4240 of 4240 papers

Title | Status | Hype
Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning | Code | 0
Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification | Code | 0
Asymmetric Masked Distillation for Pre-Training Small Foundation Models | Code | 0
BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation | Code | 0
Enhance Language Identification using Dual-mode Model with Knowledge Distillation | Code | 0
Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation | Code | 0
Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data | Code | 0
Emulating Quantum Dynamics with Neural Networks via Knowledge Distillation | Code | 0
Co-Teaching for Unsupervised Domain Adaptation and Expansion | Code | 0
Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision | Code | 0
EGAD: Evolving Graph Representation Learning with Self-Attention and Knowledge Distillation for Live Video Streaming Events | Code | 0
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance | Code | 0
Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation | Code | 0
Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation | Code | 0
Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks | Code | 0
Efficient Sub-structured Knowledge Distillation | Code | 0
Semi-Online Knowledge Distillation | Code | 0
Temporal Action Proposal Generation With Action Frequency Adaptive Network | Code | 0
POS-Constrained Parallel Decoding for Non-autoregressive Generation | Code | 0
Active Object Detection with Knowledge Aggregation and Distillation from Large Models | Code | 0
Efficient Speech Translation through Model Compression and Knowledge Distillation | Code | 0
Training convolutional neural networks with cheap convolutions and online distillation | Code | 0
Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification | Code | 0
Correlation Congruence for Knowledge Distillation | Code | 0
Biomed-DPT: Dual Modality Prompt Tuning for Biomedical Vision-Language Models | Code | 0
PowerSkel: A Device-Free Framework Using CSI Signal for Human Skeleton Estimation in Power Station | Code | 0
A Diffusion Model and Knowledge Distillation Framework for Robust Coral Detection in Complex Underwater Environments | Code | 0
PP-ShiTu: A Practical Lightweight Image Recognition System | Code | 0
CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation | Code | 0
Cooperative Retriever and Ranker in Deep Recommenders | Code | 0
Cooperative Knowledge Distillation: A Learner Agnostic Approach | Code | 0
Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation | Code | 0
Asymmetrical Reciprocity-based Federated Learning for Resolving Disparities in Medical Diagnosis | Code | 0
Efficient Multitask Dense Predictor via Binarization | Code | 0
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition | Code | 0
Sentence Embedder Guided Utterance Encoder (SEGUE) for Spoken Language Understanding | Code | 0
Evolving Knowledge Mining for Class Incremental Segmentation | Code | 0
PreFallKD: Pre-Impact Fall Detection via CNN-ViT Knowledge Distillation | Code | 0
Efficient Lung Ultrasound Severity Scoring Using Dedicated Feature Extractor | Code | 0
Cooperative Classification and Rationalization for Graph Generalization | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | | Unverified