SOTAVerified

Vulnerability Detection

Vulnerability detection is the task of identifying weaknesses and potential entry points in software that malicious actors could exploit. Using scanning techniques and penetration testing, vulnerability detection tools analyze web applications and websites for flaws such as SQL injection, cross-site scripting (XSS), and insecure authentication mechanisms.

By identifying and addressing vulnerabilities proactively, organizations can strengthen their security posture and reduce the risk of data breaches, financial loss, and reputational damage. Vulnerability detection also helps businesses stay compliant with industry regulations and standards, which demonstrates a commitment to protecting sensitive information and maintaining customer trust. As the threat landscape evolves and attack vectors grow more sophisticated, robust vulnerability detection helps keep web-based platforms and services resilient.

Papers

Showing 1-10 of 216 papers

| Title | Status | Hype |
| --- | --- | --- |
| NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security | Code | 11 |
| Vulnerability Detection with Code Language Models: How Far Are We? | Code | 3 |
| MoreFixes: A Large-Scale Dataset of CVE Fix Commits Mined through Enhanced Repository Discovery | Code | 2 |
| An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection | Code | 2 |
| Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning | Code | 2 |
| Finetuning Large Language Models for Vulnerability Detection | Code | 2 |
| CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution | Code | 1 |
| The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs | Code | 1 |
| R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation | Code | 1 |
| CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Reveal Model - Tested on Reveal (Training on Devign + VulScribeR 20K + Extra Cleans) | F1 Score | 26.18 |  | Unverified |
| 2 | Devign Model - Tested on Reveal (Training on Devign + VulScribeR 20K + Extra Cleans) | F1 Score | 24.99 |  | Unverified |
| 3 | Reveal Model - Tested on BigVul (Training on Devign + VulScribeR 20K + Extra Cleans) | F1 Score | 18.98 |  | Unverified |
| 4 | Devign Model - Tested on BigVul (Training on Devign + VulScribeR 20K + Extra Cleans) | F1 Score | 18.51 |  | Unverified |
| 5 | LineVul - Tested on Reveal (Training on Devign + VulScribeR 20K + Extra Cleans) | F1 Score | 17.38 |  | Unverified |
| 6 | LineVul - Tested on BigVul (Training on Devign + VulScribeR 20K + Extra Cleans) | F1 Score | 16.23 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | WizardCoder | AUC | 0.86 |  | Unverified |
| 2 | ContraBERT | AUC | 0.85 |  | Unverified |
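
The leaderboards above report F1 Score and AUC without saying how either was computed. As a rough reference, the sketch below computes both metrics for a binary vulnerable/not-vulnerable classifier in plain Python. The labels, scores, and 0.5 threshold are hypothetical and are not taken from any listed paper. It is also an assumption that the F1 values are percentages (0-100) and the AUC values fractions (0-1).

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = vulnerable), as a fraction in [0, 1]."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC AUC via pairwise ranking: the fraction of (positive, negative)
    pairs in which the positive sample gets the higher score (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and model scores for eight functions.
labels = [1, 0, 1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6, 0.2, 0.7]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

print(f"F1 (percent): {100 * f1_score(labels, preds):.2f}")
print(f"AUC: {roc_auc(labels, scores):.3f}")
```

F1 depends on the chosen threshold, while AUC is threshold-free, so the two tables are not directly comparable.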