Scaled-YOLOv4: Scaling Cross Stage Partial Network

2020-11-16 · CVPR 2021 · Code Available

Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao

Abstract

We show that the YOLOv4 object detection neural network, based on the CSP approach, scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy. We propose a network scaling approach that modifies not only the depth, width, and resolution, but also the structure of the network. The YOLOv4-large model achieves state-of-the-art results: 55.5% AP (73.4% AP50) on the MS COCO dataset at a speed of ~16 FPS on a Tesla V100, while with test-time augmentation, YOLOv4-large achieves 56.0% AP (73.3% AP50). To the best of our knowledge, this is currently the highest accuracy on the COCO dataset among any published work. The YOLOv4-tiny model achieves 22.0% AP (42.0% AP50) at a speed of 443 FPS on an RTX 2080Ti, while with TensorRT, batch size = 4, and FP16 precision, YOLOv4-tiny achieves 1774 FPS.
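To put the throughput figures in the abstract on a common footing, the short sketch below converts each reported FPS value into amortized milliseconds per image. The helper name `ms_per_image` and the dictionary labels are illustrative, not taken from the paper or its code. Note that the 1774 FPS figure was measured at batch size 4, so its result is time per image averaged over a batch, not single-image latency.

```python
def ms_per_image(fps: float) -> float:
    """Amortized milliseconds per image for a throughput given in frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures as quoted in the abstract (the V100 figure is approximate).
reported_fps = {
    "YOLOv4-large, Tesla V100": 16.0,
    "YOLOv4-tiny, RTX 2080Ti": 443.0,
    "YOLOv4-tiny, TensorRT, batch size 4, FP16": 1774.0,
}

for setup, fps in reported_fps.items():
    print(f"{setup}: {ms_per_image(fps):.2f} ms/image")
```

Under these numbers, the large model spends roughly 62.5 ms per image, while the tiny model needs a little over 2 ms on the RTX 2080Ti.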

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| COCO minival | YOLOv4-P7 CSP-P7 (single-scale, 16 fps) | box AP | 55.4 | - | Unverified |
| COCO test-dev | YOLOv4-P7 with TTA | box mAP | 55.8 | - | Unverified |
| COCO test-dev | YOLOv4-P6 with TTA | box mAP | 54.9 | - | Unverified |
| COCO test-dev | YOLOv4-P6 CSP-P6 (single-scale, 32 fps) | box mAP | 54.3 | - | Unverified |
| COCO test-dev | YOLOv4-P5 with TTA | box mAP | 52.5 | - | Unverified |
| COCO test-dev | YOLOv4 (CD53) | box mAP | 45.5 | - | Unverified |