SOTAVerified

Malware Classification

Malware Classification is the process of assigning a malware sample to a specific malware family. Malware within a family shares similar properties that can be used to create signatures for detection and classification. Signatures can be categorized as static or dynamic based on how they are extracted. A static signature can be based on a byte-code sequence, binary assembly instruction, or an imported Dynamic Link Library (DLL). Dynamic signatures can be based on file system activities, terminal commands, network communications, or function and system call sequences.

Source: Behavioral Malware Classification using Convolutional Recurrent Neural Networks

Papers

Showing 110 of 146 papers

TitleStatusHype
Malware Classification Leveraging NLP & Machine Learning for Enhanced AccuracyCode0
Semantic Preprocessing for LLM-based Malware Analysis0
EMBER2024 -- A Benchmark Dataset for Holistic Evaluation of Malware ClassifiersCode2
Dynamic Malware Classification of Windows PE Files using CNNs and Greyscale Images Derived from Runtime API Call Argument Conversion0
Structure-based Anomaly Detection and Clustering0
Malware families discovery via Open-Set Recognition on Android manifest permissions0
Semantic-Aware Contrastive Fine-Tuning: Boosting Multimodal Malware Classification with Discriminative Embeddings0
Scalable APT Malware Classification via Parallel Feature Extraction and GPU-Accelerated Learning0
OpCode-Based Malware Classification Using Machine Learning and Deep Learning Techniques0
Imbalanced malware classification: an approach based on dynamic classifier selectionCode0
Show:102550
← PrevPage 1 of 15Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1MalConvAccuracy (10-fold)9,641Unverified
2TPOT ClassifierAccuracy (5-fold)98.94Unverified
3CNN BiLSTM - Reb SamplAccuracy (5-fold)98.2Unverified
4Ahmadi et al. (2016): ENT, Bytes 1-G, STR, IMG1, IMG2, MD1, MISC, OPC, SEC, REG, DP, API, SYM, MD2 IMG and Opcode N-Grams + Ensemble Learning (XGBoost)Accuracy (10-fold)1Unverified
5HYDRAAccuracy (10-fold)1Unverified
6Zhang et al. (2016): Total lines of each Section, Operation Code Count, API Usage, Special Symbols Count, Asm File Pixel Intensity Feature, Bytes File Block Size Distribution, Bytes File N-Gram + Ensemble Learning (XGBoost)Accuracy (10-fold)1Unverified
7OrthrusAccuracy (10-fold)0.99Unverified
8Opcode-based Shallow CNNAccuracy (10-fold)0.99Unverified
9Hierarchical Convolutional NetworkAccuracy (10-fold)0.99Unverified
10SEAAccuracy (10-fold)0.99Unverified