| From Code Generation to Software Testing: AI Copilot with Context-Based RAG | Apr 2, 2025 | Chatbot, Code Generation | —Unverified | 0 |
| Towards Trustworthy GUI Agents: A Survey | Mar 30, 2025 | Decision Making, Sequential Decision Making | Code Available | 0 |
| Integrating Artificial Intelligence with Human Expertise: An In-depth Analysis of ChatGPT's Capabilities in Generating Metamorphic Relations | Mar 28, 2025 | software testing | —Unverified | 0 |
| Vulnerability Detection: From Formal Verification to Large Language Models and Hybrid Approaches: A Comprehensive Overview | Mar 13, 2025 | Automated Theorem Proving, software testing | —Unverified | 0 |
| Rule-Guided Reinforcement Learning Policy Evaluation and Improvement | Mar 12, 2025 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| ToolFuzz -- Automated Agent Tool Testing | Mar 6, 2025 | Large Language Model, Prompt Engineering | —Unverified | 0 |
| WIP: Assessing the Effectiveness of ChatGPT in Preparatory Testing Activities | Mar 5, 2025 | software testing | —Unverified | 0 |
| Towards Reliable LLM-Driven Fuzz Testing: Vision and Road Ahead | Mar 2, 2025 | software testing, valid | —Unverified | 0 |
| CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification | Feb 12, 2025 | 16k, 4k | —Unverified | 0 |
| Identifying Flaky Tests in Quantum Code: A Machine Learning Approach | Feb 6, 2025 | software testing | —Unverified | 0 |