SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation May 30, 2025 Code Generation HumanEval
— Unverified 0A Reward-driven Automated Webshell Malicious-code Generator for Red-teaming May 30, 2025 Code Generation Diversity
— Unverified 0RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation May 30, 2025 Code Generation Diversity
Code Code Available 0Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX May 30, 2025 Code Generation
— Unverified 0HardTests: Synthesizing High-Quality Test Cases for LLM Coding May 30, 2025 Code Generation Language Modeling
— Unverified 0Using Reasoning Models to Generate Search Heuristics that Solve Open Instances of Combinatorial Design Problems May 29, 2025 Code Generation
Code Code Available 0SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving May 29, 2025 Code Generation
— Unverified 0Enhancing LLM-Based Code Generation with Complexity Metrics: A Feedback-Driven Approach May 29, 2025 Code Generation HumanEval
— Unverified 0Infinite-Instruct: Synthesizing Scaling Code instruction Data with Bidirectional Synthesis and Static Verification May 29, 2025 Code Generation
Code Code Available 0OSS-UAgent: An Agent-based Usability Evaluation Framework for Open Source Software May 29, 2025 Code Generation
Code Code Available 0Self-Correcting Code Generation Using Small Language Models May 29, 2025 Code Generation HumanEval
Code Code Available 0VERINA: Benchmarking Verifiable Code Generation May 29, 2025 Benchmarking Code Generation
Code Code Available 2LLM Performance for Code Generation on Noisy Tasks May 29, 2025 Benchmarking Code Generation
Code Code Available 0DeepRTL2: A Versatile Model for RTL-Related Tasks May 28, 2025 Code Generation Code Search
— Unverified 0Training Language Models to Generate Quality Code with Program Analysis Feedback May 28, 2025 Code Generation
Code Code Available 1HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding May 28, 2025 Code Completion Code Generation
— Unverified 0Rendering-Aware Reinforcement Learning for Vector Graphics Generation May 27, 2025 Code Generation reinforcement-learning
— Unverified 0R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning May 27, 2025 Code Generation Reinforcement Learning (RL)
Code Code Available 1RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving May 27, 2025 Code Generation
Code Code Available 0An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks May 27, 2025 Code Generation Code Summarization
— Unverified 0Large Language Models for IT Automation Tasks: Are We There Yet? May 26, 2025 Attribute Code Generation
— Unverified 0SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents May 26, 2025 Code Generation
Code Code Available 1An Empirical Study on Strong-Weak Model Collaboration for Repo-level Code Generation May 26, 2025 Code Generation GitHub issue resolution
Code Code Available 0Style2Code: A Style-Controllable Code Generation Framework with Dual-Modal Contrastive Representation Learning May 26, 2025 Code Generation Contrastive Learning
Code Code Available 0Learning to Reason without External Rewards May 26, 2025 Code Generation reinforcement-learning
Code Code Available 3Compliance-to-Code: Enhancing Financial Compliance Checking via Code Generation May 26, 2025 Code Generation
Code Code Available 1Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs May 26, 2025 Code Generation Recommendation Systems
Code Code Available 1Large Language Models in Code Co-generation for Safe Autonomous Vehicles May 26, 2025 Autonomous Vehicles Code Generation
— Unverified 0CODE-DITING: A Reasoning-Based Metric for Functional Alignment in Code Evaluation May 26, 2025 Code Generation
— Unverified 0ReChisel: Effective Automatic Chisel Code Generation by LLM with Reflection May 26, 2025 Code Generation
Code Code Available 1ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment May 25, 2025 Code Generation Mathematical Reasoning
— Unverified 0Architectures of Error: A Philosophical Inquiry into AI and Human Code Generation May 25, 2025 Code Generation
— Unverified 0Mind the Gap: A Practical Attack on GGUF Quantization May 24, 2025 Code Generation Quantization
Code Code Available 1PromptWise: Online Learning for Cost-Aware Prompt Assignment in Generative Models May 24, 2025 Code Generation Model Selection
— Unverified 0HD-PiSSA: High-Rank Distributed Orthogonal Adaptation May 24, 2025 Code Generation GPU
— Unverified 0Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment? May 24, 2025 Code Generation Math
— Unverified 0SEW: Self-Evolving Agentic Workflows for Automated Code Generation May 24, 2025 Code Generation
Code Code Available 7From Output to Evaluation: Does Raw Instruction-Tuned Code LLMs Output Suffice for Fill-in-the-Middle Code Generation? May 24, 2025 Code Generation HumanEval
— Unverified 0ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation May 24, 2025 Benchmarking Chart Understanding
Code Code Available 3Autocomp: LLM-Driven Code Optimization for Tensor Accelerators May 24, 2025 Code Generation
— Unverified 0PPT: A Process-based Preference Learning Framework for Self Improving Table Question Answering Models May 23, 2025 Code Generation Mathematical Reasoning
— Unverified 0Evaluating the Energy-Efficiency of the Code Generated by LLMs May 23, 2025 Code Generation
— Unverified 0FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow May 23, 2025 Benchmarking Code Generation
Code Code Available 1HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation May 22, 2025 Code Generation
Code Code Available 0Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks May 22, 2025 Code Generation Language Modeling
— Unverified 0SIMCOPILOT: Evaluating Large Language Models for Copilot-Style Code Generation May 21, 2025 Benchmarking Code Generation
— Unverified 0dKV-Cache: The Cache for Diffusion Language Models May 21, 2025 Code Generation Denoising
Code Code Available 2MAPS: A Multilingual Benchmark for Global Agent Performance and Security May 21, 2025 Code Generation Math
— Unverified 0Towards a Science of Causal Interpretability in Deep Learning for Software Engineering May 21, 2025 Causal Inference Code Generation
— Unverified 0R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization May 21, 2025 Code Generation Model Optimization
Code Code Available 13