SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized; a compact student trained to mimic the larger teacher's outputs can therefore often approach its accuracy at a fraction of the inference cost.
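
In the classic response-based formulation (Hinton et al., 2015), the student is trained to match the teacher's temperature-softened output distribution alongside the ground-truth labels. Below is a minimal PyTorch sketch of that loss; the function name and the hyperparameters T and alpha are illustrative choices, not values taken from any paper listed here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Response-based KD loss: soft-target KL term + hard-label cross-entropy.

    T and alpha are illustrative hyperparameters, not prescribed values.
    """
    # Soften both distributions with temperature T; kl_div expects
    # log-probabilities for the input and probabilities for the target.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # The T^2 factor keeps the soft term's gradient magnitude
    # roughly independent of the chosen temperature.
    soft_loss = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    # Standard cross-entropy against the ground-truth class indices.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

Raising T flattens the teacher's distribution, exposing the relative probabilities it assigns to incorrect classes, which is the extra signal the student learns from beyond the hard labels.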

Papers

Showing 1301–1350 of 4240 papers (page 27 of 85)

Title | Status | Hype
Fast DistilBERT on CPUs | - | 0
AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks | - | 0
Cooperative Learning for Cost-Adaptive Inference | - | 0
FAN-Trans: Online Knowledge Distillation for Facial Action Unit Detection | - | 0
A Knowledge Distillation Approach for Sepsis Outcome Prediction from Multivariate Clinical Time Series | - | 0
Cooperative Denoising for Distantly Supervised Relation Extraction | - | 0
On Importance of Pruning and Distillation for Efficient Low Resource NLP | - | 0
Fast and Efficient Once-For-All Networks for Diverse Hardware Deployment | - | 0
Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation | - | 0
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition | - | 0
Automated Channel Pruning with Learned Importance | - | 0
Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies | - | 0
Controlling the Quality of Distillation in Response-Based Network Compression | - | 0
Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation | - | 0
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation | - | 0
Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization | - | 0
Contrast-reconstruction Representation Learning for Self-supervised Skeleton-based Action Recognition | - | 0
Contrast R-CNN for Continual Learning in Object Detection | - | 0
AUTOKD: Automatic Knowledge Distillation Into A Student Architecture Family | - | 0
Contrastive Representation Distillation via Multi-Scale Feature Decoupling | - | 0
A Joint Sequential and Relational Model for Frame-Semantic Parsing | - | 0
AirNet: Neural Network Transmission over the Air | - | 0
Contrastive Learning-Based Spectral Knowledge Distillation for Multi-Modality and Missing Modality Scenarios in Semantic Segmentation | - | 0
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models | - | 0
AdapterDistillation: Non-Destructive Task Composition with Knowledge Distillation | - | 0
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments | - | 0
Contrastive Continual Multi-view Clustering with Filtered Structural Fusion | - | 0
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models | - | 0
Continuous sign language recognition based on cross-resolution knowledge distillation | - | 0
Dynamic Object Queries for Transformer-based Incremental Object Detection | - | 0
Fair Feature Importance Scores for Interpreting Tree-Based Methods and Surrogates | - | 0
Continuous Concepts Removal in Text-to-image Diffusion Models | - | 0
Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization | - | 0
AutoADR: Automatic Model Design for Ad Relevance | - | 0
Continual Self-Supervised Learning with Masked Autoencoders in Remote Sensing | - | 0
Continual Segment: Towards a Single, Unified and Non-forgetting Continual Segmentation Model of 143 Whole-body Organs in CT Scans | - | 0
Adapter-based Selective Knowledge Distillation for Federated Multi-domain Meeting Summarization | - | 0
Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling | - | 0
Fairly Predicting Graft Failure in Liver Transplant for Organ Assigning | - | 0
Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning | - | 0
Continual Segment: Towards a Single, Unified and Accessible Continual Segmentation Model of 143 Whole-body Organs in CT Scans | - | 0
A Unified Knowledge Distillation Framework for Deep Directed Graphical Models | - | 0
Accelerating Diffusion Models with One-to-Many Knowledge Distillation | - | 0
AI-KD: Adversarial learning and Implicit regularization for self-Knowledge Distillation | - | 0
A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems | - | 0
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains | - | 0
Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices | - | 0
Continual Learning with Dirichlet Generative-based Rehearsal | - | 0
Continual Learning with Diffusion-based Generative Replay for Industrial Streaming Data | - | 0
A Unified Framework for Continual Learning and Unlearning | - | 0

Benchmark Results

Claimed values are the metric numbers reported by the source papers; the Verified column is filled in only once a result has been independently reproduced, which is why every entry below is currently marked Unverified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | - | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified