
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, that capacity is often not fully utilized, so a well-trained small model can frequently recover much of the large model's accuracy at a fraction of the compute and memory cost.
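
The classic recipe for this transfer (Hinton et al., 2015) trains the small "student" to match the large "teacher"'s temperature-softened output distribution alongside the usual hard-label loss. Below is a minimal PyTorch sketch of that soft-target loss; the toy `nn.Linear` teacher and student, the temperature `T`, and the mixing weight `alpha` are illustrative placeholders, not the setup of any paper listed below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target distillation loss in the style of Hinton et al. (2015)."""
    # Temperature T > 1 softens both distributions, exposing the teacher's
    # "dark knowledge" in the relative probabilities of non-target classes.
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    # kl_div takes log-probabilities as input and probabilities as target;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)  # hard-label term
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage: a frozen "teacher" supervises a smaller "student".
teacher = nn.Linear(32, 10).eval()   # stand-in for a large pretrained model
student = nn.Linear(32, 10)
optimizer = torch.optim.SGD(student.parameters(), lr=0.1)

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
with torch.no_grad():                # no gradients flow into the teacher
    t_logits = teacher(x)
loss = distillation_loss(student(x), t_logits, y)
loss.backward()
optimizer.step()
```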

Papers

Showing 301–350 of 4240 papers

Title | Status | Hype
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation | Code | 1
Some Like It Small: Czech Semantic Embedding Models for Industry Applications | Code | 1
HoVer-UNet: Accelerating HoVerNet with UNet-based multi-class nuclei segmentation via knowledge distillation | Code | 1
Point, Segment and Count: A Generalized Framework for Object Counting | Code | 1
FreeKD: Knowledge Distillation via Semantic Frequency Prompt | Code | 1
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention | Code | 1
Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments | Code | 1
Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification | Code | 1
Implicit Chain of Thought Reasoning via Knowledge Distillation | Code | 1
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts | Code | 1
Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models | Code | 1
One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation | Code | 1
Label Poisoning is All You Need | Code | 1
Understanding the Effects of Projectors in Knowledge Distillation | Code | 1
MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient | Code | 1
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents | Code | 1
Transport-Hub-Aware Spatial-Temporal Adaptive Graph Transformer for Traffic Flow Prediction | Code | 1
Online Speculative Decoding | Code | 1
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation | Code | 1
A Discrepancy Aware Framework for Robust Anomaly Detection | Code | 1
LumiNet: The Bright Side of Perceptual Knowledge Distillation | Code | 1
SEA: Sparse Linear Attention with Estimated Attention Mask | Code | 1
NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation | Code | 1
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation | Code | 1
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance | Code | 1
Weight Averaging Improves Knowledge Distillation under Domain Shift | Code | 1
DFIL: Deepfake Incremental Learning by Exploiting Domain-invariant Forgery Clues | Code | 1
FDCNet: Feature Drift Compensation Network for Class-Incremental Weakly Supervised Object Localization | Code | 1
Rethinking Momentum Knowledge Distillation in Online Continual Learning | Code | 1
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers | Code | 1
SpikeBERT: A Language Spikformer Learned from BERT with Knowledge Distillation | Code | 1
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection | Code | 1
DM-VTON: Distilled Mobile Real-time Virtual Try-On | Code | 1
Sentence Embedding Models for Ancient Greek Using Multilingual Knowledge Distillation | Code | 1
Ground-to-Aerial Person Search: Benchmark Dataset and Approach | Code | 1
FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning | Code | 1
SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation | Code | 1
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning | Code | 1
AltDiffusion: A Multilingual Text-to-Image Diffusion Model | Code | 1
Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning | Code | 1
Token-Scaled Logit Distillation for Ternary Weight Generative Language Models | Code | 1
Multi-Label Knowledge Distillation | Code | 1
Multi-View Fusion and Distillation for Subgrade Distresses Detection based on 3D-GPR | Code | 1
AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation | Code | 1
One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer | Code | 1
VQGraph: Rethinking Graph Representation Space for Bridging GNNs and MLPs | Code | 1
Transferable Graph Structure Learning for Graph-based Traffic Forecasting Across Cities | Code | 1
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty | Code | 1
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Code | 1
NormKD: Normalized Logits for Knowledge Distillation | Code | 1
Page 7 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 82.5 | - | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy % | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified