
0/1 Deep Neural Networks via Block Coordinate Descent

2022-06-19

Hui Zhang, Shenglong Zhou, Geoffrey Ye Li, Naihua Xiu


Abstract

The step function is one of the simplest and most natural activation functions for deep neural networks (DNNs). Because it outputs 1 for positive inputs and 0 otherwise, its intrinsic characteristics (e.g., discontinuity and the absence of useful subgradient information) have impeded its development for several decades. Although there is an impressive body of work on designing DNNs with continuous activation functions that can be regarded as surrogates of the step function, the step function itself still possesses some advantageous properties, such as complete robustness to outliers and the ability to attain the best learning-theoretic guarantee of predictive accuracy. Hence, in this paper, we aim to train DNNs that use the step function as the activation function (dubbed 0/1 DNNs). We first reformulate 0/1 DNNs as an unconstrained optimization problem and then solve it with a block coordinate descent (BCD) method. Moreover, we derive closed-form solutions for the sub-problems of BCD and establish its convergence properties. Furthermore, we integrate $\ell_{2,0}$-regularization into 0/1 DNNs to accelerate the training process and compress the network scale. As a result, the proposed algorithm achieves desirable performance on classifying the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets.
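To make the two central objects of the abstract concrete, the following is a minimal pure-Python sketch of a forward pass through a network with step activations and of an $\ell_{2,0}$ count on a weight matrix. It is an illustration only, not the authors' implementation: the helper names (`step`, `matvec`, `forward_01`, `l20_norm`), the linear output layer, and the convention that $\ell_{2,0}$ counts rows with nonzero Euclidean norm are all assumptions made here, and the BCD training procedure itself is not shown.

```python
from math import sqrt


def step(z):
    """0/1 step activation: 1 for strictly positive entries, 0 otherwise."""
    return [1.0 if v > 0 else 0.0 for v in z]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward_01(weights, x):
    """Forward pass: step activation on hidden layers, linear output layer.

    The linear last layer is an assumption of this sketch.
    """
    h = x
    for W in weights[:-1]:
        h = step(matvec(W, h))
    return matvec(weights[-1], h)


def l20_norm(W, tol=0.0):
    """Count rows whose Euclidean norm exceeds tol (assumed row-wise convention)."""
    return sum(1 for row in W if sqrt(sum(w * w for w in row)) > tol)


# Hypothetical toy weights, chosen only for illustration.
W1 = [[1.0, -2.0], [0.0, 0.0], [0.5, 0.5]]
W2 = [[1.0, 1.0, 1.0]]

out = forward_01([W1, W2], [1.0, 1.0])
nonzero_rows = l20_norm(W1)
```

Under the row-wise convention assumed here, a zero row in `W1` corresponds to a hidden unit that never contributes to the output, which gives one intuition for how this kind of regularization could shrink the network.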
