| Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees | Jun 17, 2025 | Code Translation, HumanEval | —Unverified | 0 |
| Navigating the growing field of research on AI for software testing -- the taxonomy for AI-augmented software testing and an ontology-driven literature survey | Jun 17, 2025 | software testing | Code Available | 0 |
| IntenTest: Stress Testing for Intent Integrity in API-Calling LLM Agents | Jun 9, 2025 | software testing | —Unverified | 0 |
| The Impact of Software Testing with Quantum Optimization Meets Machine Learning | Jun 2, 2025 | Defect Detection, software testing | —Unverified | 0 |
| EvoGPT: Enhancing Test Suite Robustness via LLM-Based Generation and Genetic Optimization | May 18, 2025 | Diversity, Fault Detection | —Unverified | 0 |
| On the Need for a Statistical Foundation in Scenario-Based Testing of Autonomous Vehicles | May 4, 2025 | Autonomous Vehicles, software testing | —Unverified | 0 |
| Automated Unit Test Case Generation: A Systematic Literature Review | Apr 29, 2025 | software testing, Systematic Literature Review | —Unverified | 0 |
| Test It Before You Trust It: Applying Software Testing for Trustworthy In-context Learning | Apr 26, 2025 | In-Context Learning, Philosophy | Code Available | 0 |
| Harden and Catch for Just-in-Time Assured LLM-Based Software Testing: Open Research Challenges | Apr 23, 2025 | software testing | —Unverified | 0 |
| Expectations vs Reality -- A Secondary Study on AI Adoption in Software Testing | Apr 7, 2025 | software testing | —Unverified | 0 |
| From Code Generation to Software Testing: AI Copilot with Context-Based RAG | Apr 2, 2025 | Chatbot, Code Generation | —Unverified | 0 |
| Towards Trustworthy GUI Agents: A Survey | Mar 30, 2025 | Decision Making, Sequential Decision Making | Code Available | 0 |
| Integrating Artificial Intelligence with Human Expertise: An In-depth Analysis of ChatGPT's Capabilities in Generating Metamorphic Relations | Mar 28, 2025 | software testing | —Unverified | 0 |
| Vulnerability Detection: From Formal Verification to Large Language Models and Hybrid Approaches: A Comprehensive Overview | Mar 13, 2025 | Automated Theorem Proving, software testing | —Unverified | 0 |
| Rule-Guided Reinforcement Learning Policy Evaluation and Improvement | Mar 12, 2025 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| ToolFuzz -- Automated Agent Tool Testing | Mar 6, 2025 | Large Language Model, Prompt Engineering | —Unverified | 0 |
| WIP: Assessing the Effectiveness of ChatGPT in Preparatory Testing Activities | Mar 5, 2025 | software testing | —Unverified | 0 |
| Towards Reliable LLM-Driven Fuzz Testing: Vision and Road Ahead | Mar 2, 2025 | software testing, valid | —Unverified | 0 |
| CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification | Feb 12, 2025 | 16k, 4k | —Unverified | 0 |
| Identifying Flaky Tests in Quantum Code: A Machine Learning Approach | Feb 6, 2025 | software testing | —Unverified | 0 |
| A Systematic Approach for Assessing Large Language Models' Test Case Generation Capability | Feb 5, 2025 | software testing, Test Case Creation | —Unverified | 0 |
| Assessing Data Augmentation-Induced Bias in Training and Testing of Machine Learning Models | Feb 3, 2025 | Data Augmentation, software testing | Code Available | 0 |
| Toward Neurosymbolic Program Comprehension | Feb 3, 2025 | Code Generation, software testing | —Unverified | 0 |
| Many-Objective Neuroevolution for Testing Games | Jan 14, 2025 | software testing | —Unverified | 0 |
| An efficient approach to represent enterprise web application structure using Large Language Model in the service of Intelligent Quality Engineering | Jan 12, 2025 | Few-Shot Learning, In-Context Learning | —Unverified | 0 |