SOTAVerified

test driven development

Papers

Showing 1–24 of 24 papers

Title | Status | Hype
CoreCodeBench: A Configurable Multi-Scenario Repository-Level Benchmark | Code | 1
Unit Test Case Generation with Transformers and Focal Context | Code | 1
Otter: Generating Tests from Issues to Validate SWE Patches | Code | 1
TDD-Bench Verified: Can LLMs Generate Tests for Issues Before They Get Resolved? | Code | 1
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner | Code | 1
Generating Automotive Code: Large Language Models for Software Development and Verification in Safety-Critical Systems |  | 0
LLM4TDD: Best Practices for Test Driven Development Using Large Language Models |  | 0
More Effective Ontology Authoring with Test-Driven Development |  | 0
Open Source Evolutionary Computation with Chips-n-Salsa |  | 0
A Comparative Study on the Impact of Test-Driven Development (TDD) and Behavior-Driven Development (BDD) on Enterprise Software Delivery Effectiveness |  | 0
Evaluation-Driven Development of LLM Agents: A Process Model and Reference Architecture |  | 0
Apertium-fin-eng--Rule-based Shallow Machine Translation for WMT 2019 Shared Task |  | 0
Applied Awareness: Test-Driven GUI Development using Computer Vision and Cryptography |  | 0
From Defects to Demands: A Unified, Iterative, and Heuristically Guided LLM-Based Framework for Automated Software Repair and Requirement Realization |  | 0
TimeGym: Debugging for Time Series Modeling in Python |  | 0
Unit Testing in ASP Revisited: Language and Test-Driven Development Environment |  | 0
Use Property-Based Testing to Bridge LLM Code Generation and Validation |  | 0
Test-Driven Development for Code Generation |  | 0
Test-Driven Development of ontologies (extended version) |  | 0
Testing LLMs on Code Generation with Varying Levels of Prompt Specificity |  | 0
Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation |  | 0
Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression against Heterogeneous Attacks Toward AI Software Deployment | Code | 0
Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment | Code | 0
Comprehensive Evaluation and Insights into the Use of Large Language Models in the Automation of Behavior-Driven Development Acceptance Test Formulation | Code | 0

No leaderboard results yet.