HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge Devices
Lotfi Abdelkrim Mecharbat, Hadjer Benmeziane, Hamza Ouarnoughi, Smail Niar
Status: Unverified
Abstract
Vision Transformers have enabled recent attention-based Deep Learning (DL) architectures to achieve remarkable results in Computer Vision (CV) tasks. However, because of the extensive computational resources they require, these architectures are rarely deployed on resource-constrained platforms. Current research investigates hybrid handcrafted models that combine convolution-based and attention-based blocks for CV tasks such as image classification and object detection. In this paper, we propose HyT-NAS, an efficient Hardware-aware Neural Architecture Search (HW-NAS) whose search space includes hybrid architectures targeting vision tasks on tiny devices. HyT-NAS improves state-of-the-art HW-NAS by enriching the search space and enhancing the search strategy as well as the performance predictors. Our experiments show that HyT-NAS achieves a comparable hypervolume with roughly 5x fewer training evaluations. Our resulting architecture outperforms MLPerf MobileNetV1 by 6.3% in accuracy with 3.5x fewer parameters on Visual Wake Words.
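Hypervolume here is the standard multi-objective search-quality metric: the volume of objective space (for example accuracy versus model size) dominated by the discovered Pareto front, measured against a reference point. As an illustration only, not taken from the paper's code, the sketch below shows how such a comparison can be computed for two objectives; the objective pair (classification error and parameter count), the reference point, and all function names are assumptions.

```python
from typing import List, Tuple

def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep non-dominated points; both objectives are minimized
    (e.g., error = 1 - accuracy, and parameter count)."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)]

def hypervolume_2d(points: List[Tuple[float, float]],
                   ref: Tuple[float, float]) -> float:
    """Area of objective space dominated by `points`, bounded by `ref`.
    `ref` must be worse than every point in both objectives."""
    front = sorted(pareto_front(points))                  # ascending in objective 1
    right_edges = [p[0] for p in front[1:]] + [ref[0]]    # each slab ends at the next point (or ref)
    return sum((ref[1] - f2) * (edge - f1)
               for (f1, f2), edge in zip(front, right_edges))

# Hypothetical candidate architectures: (classification error, parameters in millions)
candidates = [(0.08, 0.9), (0.13, 0.3), (0.16, 0.2), (0.20, 0.8)]
print(hypervolume_2d(candidates, ref=(1.0, 4.0)))   # larger is better
```

Under this reading, two search strategies are compared by the hypervolume of the fronts they find for the same evaluation budget; the paper's claim is that HyT-NAS reaches a similar hypervolume with far fewer trained candidates.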
Benchmark Results
| Dataset | Model | Metric | Claimed (%) | Verified (%) | Status |
|---|---|---|---|---|---|
| Visual Wake Words | HyT-NAS-BA | Accuracy | 92.25 | — | Unverified |
| Visual Wake Words | ProxylessNAS | Accuracy | 86.55 | — | Unverified |
| Visual Wake Words | MobileNetV2 (x0.35) | Accuracy | 86.34 | — | Unverified |
| Visual Wake Words | MobileNetV1 | Accuracy | 83.7 | — | Unverified |