PRO-V: An Efficient Program Generation Multi-Agent System for Automatic RTL Verification | Jun 13, 2025 | Code Generation, In-Context Learning | Code Available | 1
LLM-as-a-Judge for Reference-less Automatic Code Validation and Refinement for Natural Language to Bash in IT Automation | Jun 12, 2025 | Code Generation | Unverified | 0
Execution Guided Line-by-Line Code Generation | Jun 12, 2025 | Code Generation | Code Available | 2
Specification and Evaluation of Multi-Agent LLM Systems -- Prototype and Cybersecurity Applications | Jun 12, 2025 | Code Generation, Question Answering | Code Available | 0
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Jun 12, 2025 | Code Generation, Large Language Model | Code Available | 2
Reasoning as a Resource: Optimizing Fast and Slow Thinking in Code Generation Models | Jun 11, 2025 | Benchmarking, Code Generation | Unverified | 0
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs | Jun 11, 2025 | Code Generation, Diagnostic | Code Available | 1
Prompt Variability Effects On LLM Code Generation | Jun 11, 2025 | Code Generation | Unverified | 0
Bridging the Gap Between Open-Source and Proprietary LLMs in Table QA | Jun 11, 2025 | Code Generation, Language Modeling | Code Available | 0
Technical Report for Argoverse2 Scenario Mining Challenges on Iterative Error Correction and Spatially-Aware Prompting | Jun 10, 2025 | Autonomous Driving, Code Generation | Unverified | 0
Exploring the Capabilities of the Frontier Large Language Models for Nuclear Energy Research | Jun 10, 2025 | Code Generation, Prompt Engineering | Unverified | 0
UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench | Jun 10, 2025 | Code Generation | Code Available | 1
Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study | Jun 10, 2025 | Code Generation, Decision Making | Unverified | 0
Edit Flows: Flow Matching with Edit Operations | Jun 10, 2025 | Code Generation, Image Captioning | Unverified | 0
LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges | Jun 9, 2025 | Code Generation | Code Available | 0
SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design | Jun 9, 2025 | Code Generation, RAG | Code Available | 1
Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles | Jun 9, 2025 | Code Generation, RAG | Unverified | 0
Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models | Jun 9, 2025 | Code Generation | Unverified | 0
ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols | Jun 9, 2025 | Code Generation, Specificity | Unverified | 0
Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation | Jun 8, 2025 | Code Generation, Mathematical Problem-Solving | Code Available | 0
VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code | Jun 8, 2025 | Code Generation, Prediction | Unverified | 0
Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems | Jun 7, 2025 | Code Generation, valid | Unverified | 0
Can Theoretical Physics Research Benefit from Language Agents? | Jun 6, 2025 | Code Generation, Mathematical Reasoning | Unverified | 0
SafeGenBench: A Benchmark Framework for Security Vulnerability Detection in LLM-Generated Code | Jun 6, 2025 | Code Generation, Vulnerability Detection | Unverified | 0
DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation | Jun 6, 2025 | Code Generation | Code Available | 1
KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes | Jun 6, 2025 | Code Generation, Data Integration | Code Available | 1
CP-Bench: Evaluating Large Language Models for Constraint Modelling | Jun 6, 2025 | Code Generation | Unverified | 0
Deployability-Centric Infrastructure-as-Code Generation: An LLM-based Iterative Framework | Jun 5, 2025 | Code Generation | Code Available | 0
ScaleRTL: Scaling LLMs with Reasoning Data and Test-Time Compute for Accurate RTL Code Generation | Jun 5, 2025 | Code Generation | Unverified | 0
ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests | Jun 5, 2025 | Code Generation | Code Available | 0
Demonstrations of Integrity Attacks in Multi-Agent Systems | Jun 5, 2025 | Code Generation, Natural Language Understanding | Unverified | 0
hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation | Jun 5, 2025 | Code Generation, Code Translation | Unverified | 0
Seed-Coder: Let the Code Model Curate Data for Itself | Jun 4, 2025 | Code Completion, Code Generation | Code Available | 4
CETBench: A Novel Dataset constructed via Transformations over Programs for Benchmarking LLMs for Code-Equivalence Checking | Jun 4, 2025 | Benchmarking, Code Generation | Unverified | 0
Matter-of-Fact: A Benchmark for Verifying the Feasibility of Literature-Supported Claims in Materials Science | Jun 4, 2025 | Articles, Code Generation | Code Available | 0
Generating Automotive Code: Large Language Models for Software Development and Verification in Safety-Critical Systems | Jun 4, 2025 | Benchmarking, Code Generation | Unverified | 0
From Virtual Agents to Robot Teams: A Multi-Robot Framework Evaluation in High-Stakes Healthcare Context | Jun 4, 2025 | Code Generation | Unverified | 0
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation | Jun 4, 2025 | Code Generation | Unverified | 0
DiaBlo: Diagonal Blocks Are Sufficient For Finetuning | Jun 3, 2025 | Arithmetic Reasoning, Code Generation | Code Available | 0
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | Jun 3, 2025 | Code Generation, reinforcement-learning | Code Available | 4
Adaptive Graph Pruning for Multi-Agent Communication | Jun 3, 2025 | Code Generation, Large Language Model | Code Available | 0
Rethinking the effects of data contamination in Code Intelligence | Jun 3, 2025 | Code Generation, Code Summarization | Unverified | 0
How do Pre-Trained Models Support Software Engineering? An Empirical Study in Hugging Face | Jun 3, 2025 | Code Generation, Text Generation | Unverified | 0
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code | Jun 2, 2025 | Benchmarking, Code Generation | Unverified | 0
Flow2Code: Evaluating Large Language Models for Flowchart-based Code Generation Capability | Jun 2, 2025 | Code Generation | Code Available | 0
SALAD: Systematic Assessment of Machine Unlearing on LLM-Aided Hardware Design | Jun 2, 2025 | Code Generation, Machine Unlearning | Unverified | 0
Legal Compliance Evaluation of Smart Contracts Generated By Large Language Models | Jun 1, 2025 | Code Generation | Unverified | 0
CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval | May 31, 2025 | Code Generation, Information Retrieval | Unverified | 0
Writing-Zero: Bridge the Gap Between Non-verifiable Tasks and Verifiable Rewards | May 30, 2025 | Code Generation | Unverified | 0
Cascading Adversarial Bias from Injection to Distillation in Language Models | May 30, 2025 | Bias Detection, Code Generation | Unverified | 0