
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized, so a compact "student" model can frequently be trained to approximate the behaviour of a large "teacher" model at a fraction of the inference cost.
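As a concrete illustration, here is a minimal PyTorch sketch of the classic logit-matching recipe (Hinton et al., 2015), in which the student is trained on a weighted mix of the usual cross-entropy and a KL-divergence term against the teacher's temperature-softened outputs. The function name and hyperparameter values are illustrative, not taken from any paper listed below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Hinton-style knowledge distillation loss (illustrative sketch)."""
    # Soften both distributions with the same temperature; the student
    # side must be log-probabilities for F.kl_div.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)

    # Scale the KL term by T^2 so its gradient magnitude stays
    # comparable across temperatures.
    kd = F.kl_div(soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2

    # Ordinary supervised loss on the hard labels.
    ce = F.cross_entropy(student_logits, labels)

    # alpha balances imitating the teacher against fitting the labels.
    return alpha * kd + (1.0 - alpha) * ce
```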

Papers

Showing 1551–1600 of 4240 papers

| Title | Status | Hype |
|---|---|---|
| Distilling Knowledge from CNN-Transformer Models for Enhanced Human Action Recognition | — | 0 |
| Low-latency Real-time Voice Conversion on CPU | Code | 2 |
| Boosting Summarization with Normalizing Flows and Aggressive Training | Code | 0 |
| Group Distributionally Robust Knowledge Distillation | — | 0 |
| NEO-KD: Knowledge-Distillation-Based Adversarial Training for Robust Multi-Exit Neural Networks | — | 0 |
| Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling | Code | 4 |
| Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision | — | 0 |
| One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation | Code | 1 |
| AMLNet: Adversarial Mutual Learning Neural Network for Non-AutoRegressive Multi-Horizon Time Series Forecasting | Code | 0 |
| RCKD: Response-Based Cross-Task Knowledge Distillation for Pathological Image Analysis | — | 0 |
| Label Poisoning is All You Need | Code | 1 |
| MUST: A Multilingual Student-Teacher Learning approach for low-resource speech recognition | — | 0 |
| Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation | — | 0 |
| Discourse Structures Guided Fine-grained Propaganda Identification | Code | 0 |
| ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection | Code | 0 |
| Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation | — | 0 |
| Multi-label Emotion Analysis in Conversation via Multimodal Knowledge Distillation | — | 0 |
| Towards a Unified Conversational Recommendation System: Multi-task Learning via Contextualized Knowledge Distillation | Code | 0 |
| Understanding the Effects of Projectors in Knowledge Distillation | Code | 1 |
| Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model | Code | 0 |
| torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP | — | 0 |
| SonoSAMTrack -- Segment and Track Anything on Ultrasound Images | — | 0 |
| TOP-Training: Target-Oriented Pretraining for Medical Extractive Question Answering | Code | 0 |
| Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos | — | 0 |
| Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data | Code | 0 |
| ABKD: Graph Neural Network Compression with Attention-Based Knowledge Distillation | — | 0 |
| Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | Code | 2 |
| MCC-KD: Multi-CoT Consistent Knowledge Distillation | Code | 0 |
| Leveraging Complementary Attention maps in vision transformers for OCT image analysis | — | 0 |
| Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images | — | 0 |
| Enhancing Abstractiveness of Summarization Models through Calibrated Distillation | — | 0 |
| GenDistiller: Distilling Pre-trained Language Models based on Generative Models | — | 0 |
| DistillCSE: Distilled Contrastive Learning for Sentence Embeddings | Code | 0 |
| MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient | Code | 1 |
| Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments | Code | 0 |
| Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents | Code | 1 |
| A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models | — | 0 |
| Revisiting Multi-modal 3D Semantic Segmentation in Real-world Autonomous Driving | — | 0 |
| Transport-Hub-Aware Spatial-Temporal Adaptive Graph Transformer for Traffic Flow Prediction | Code | 1 |
| DistillSpec: Improving Speculative Decoding via Knowledge Distillation | — | 0 |
| DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation | Code | 1 |
| Retrieve Anything To Augment Large Language Models | — | 0 |
| A Discrepancy Aware Framework for Robust Anomaly Detection | Code | 1 |
| Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation | — | 0 |
| Online Speculative Decoding | Code | 1 |
| Distillation Improves Visual Place Recognition for Low Quality Images | Code | 0 |
| Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data | Code | 0 |
| Knowledge Distillation for Anomaly Detection | — | 0 |
| What do larger image classifiers memorise? | — | 0 |
| Applying Knowledge Distillation to Improve Weed Mapping With Drones | Code | 0 |

Benchmark Results

In each entry, "T:" denotes the teacher model and "S:" the student; a blank "Verified" cell means no independently reproduced value has been recorded.

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified |
| 5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified |
| 6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified |
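For reference, the metrics reported above are conventionally computed as follows; this is a hedged sketch with illustrative function names (`top1_accuracy`, `rmse`), not code from any of the listed methods.

```python
import torch

def top1_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Share of samples whose highest-scoring class matches the label,
    reported as a percentage (the classification tables above)."""
    preds = logits.argmax(dim=-1)
    return (preds == labels).float().mean().item() * 100.0

def rmse(pred: torch.Tensor, target: torch.Tensor) -> float:
    """Root-mean-squared error, as in the depth-estimation row
    (lower is better)."""
    return torch.sqrt(torch.mean((pred - target) ** 2)).item()
```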