Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, this capacity might not be fully utilized; distillation therefore trains a compact student model to reproduce the behavior of a larger teacher model, retaining much of the teacher's accuracy at a fraction of the inference cost.
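
As a rough illustration of the idea (a minimal sketch, not the specific method of any paper listed below), the classic soft-target formulation of Hinton et al. trains the student to match the teacher's temperature-softened output distribution alongside the usual hard-label loss. The snippet below assumes PyTorch; the hyperparameter names T (softmax temperature) and alpha (blend weight) are illustrative.

```python
# Minimal soft-target knowledge distillation loss (sketch, not tied to any listed paper).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend a teacher-matching KL term with standard cross-entropy on the labels."""
    # Soften both distributions with temperature T; the T^2 factor keeps gradient
    # magnitudes comparable to the hard-label term (as in Hinton et al., 2015).
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    kd_term = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Example usage (illustrative): logits are [batch, num_classes], labels are [batch].
# loss = distillation_loss(student(x), teacher(x).detach(), y)
```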

Papers

Showing 51–100 of 4240 papers

Title | Status | Hype
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning | Code | 2
ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation | Code | 2
Progressive Pretext Task Learning for Human Trajectory Prediction | Code | 2
Accessing Vision Foundation Models at ImageNet-level Costs | Code | 2
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models | Code | 2
A Unified Framework for 3D Scene Understanding | Code | 2
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation | Code | 2
Dual-Space Knowledge Distillation for Large Language Models | Code | 2
Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study | Code | 2
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions | Code | 2
Improving the Training of Rectified Flows | Code | 2
Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising | Code | 2
Optimization Methods for Personalizing Large Language Models through Retrieval Augmentation | Code | 2
Diffusion Time-step Curriculum for One Image to 3D Generation | Code | 2
Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners | Code | 2
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Code | 2
Scale Decoupled Distillation | Code | 2
Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Code | 2
CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning | Code | 2
V_kD: Improving Knowledge Distillation using Orthogonal Projections | Code | 2
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving | Code | 2
A Cognitive-Based Trajectory Prediction Approach for Autonomous Driving | Code | 2
Sinkhorn Distance Minimization for Knowledge Distillation | Code | 2
PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning | Code | 2
Incremental Sequence Labeling: A Tale of Two Shifts | Code | 2
LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation | Code | 2
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Code | 2
OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts | Code | 2
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss | Code | 2
LiSA: LiDAR Localization with Semantic Awareness | Code | 2
VkD: Improving Knowledge Distillation using Orthogonal Projections | Code | 2
Scaled Decoupled Distillation | Code | 2
Point Segment and Count: A Generalized Framework for Object Counting | Code | 2
TinySAM: Pushing the Envelope for Efficient Segment Anything Model | Code | 2
Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference | Code | 2
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS | Code | 2
Low-latency Real-time Voice Conversion on CPU | Code | 2
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | Code | 2
Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline | Code | 2
EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization | Code | 2
LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis | Code | 2
ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data | Code | 2
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Code | 2
DOT: A Distillation-Oriented Trainer | Code | 2
MiniLLM: Knowledge Distillation of Large Language Models | Code | 2
Are Large Kernels Better Teachers than Transformers for ConvNets? | Code | 2
CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition | Code | 2
Lion: Adversarial Distillation of Proprietary Large Language Models | Code | 2
OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion | Code | 2
Social4Rec: Distilling User Preference from Social Graph for Video Recommendation in Tencent | Code | 2
Page 2 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified