| Cognitive maps are generative programs | Apr 29, 2025 | Computational Efficiency, Large Language Model | Unverified | 0 |
| CoCo-Bench: A Comprehensive Code Benchmark For Multi-task Large Language Model Evaluation | Apr 29, 2025 | Code Generation, Language Model Evaluation | Unverified | 0 |
| semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage | Apr 28, 2025 | GPU, Large Language Model | Unverified | 0 |
| Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models | Apr 28, 2025 | Evolutionary Algorithms, Graph Neural Network | Unverified | 0 |
| Leveraging LLM to Strengthen ML-Based Cross-Site Scripting Detection | Apr 28, 2025 | Large Language Model | Unverified | 0 |
| An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination | Apr 28, 2025 | Code Generation, Hallucination | Unverified | 0 |
| AutoJudge: Judge Decoding Without Manual Annotation | Apr 28, 2025 | GSM8K, Large Language Model | Unverified | 0 |
| Fitness Landscape of Large Language Model-Assisted Automated Algorithm Search | Apr 28, 2025 | Combinatorial Optimization, Language Modeling | Unverified | 0 |
| CodeBC: A More Secure Large Language Model for Smart Contract Code Generation in Blockchain | Apr 28, 2025 | Code Generation, Language Modeling | Code Available | 0 |
| GVPO: Group Variance Policy Optimization for Large Language Model Post-Training | Apr 28, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| SAGA: A Security Architecture for Governing AI Agentic Systems | Apr 27, 2025 | Large Language Model | Unverified | 0 |
| GenTorrent: Scaling Large Language Model Serving with An Overlay Network | Apr 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning | Apr 27, 2025 | Large Language Model, Mathematical Reasoning | Unverified | 0 |
| Exploring a Large Language Model for Transforming Taxonomic Data into OWL: Lessons Learned and Implications for Ontology Development | Apr 25, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| The Big Send-off: High Performance Collectives on GPU-based Supercomputers | Apr 25, 2025 | GPU, Language Modeling | Unverified | 0 |
| An Empirical Study of Evaluating Long-form Question Answering | Apr 25, 2025 | Form, Informativeness | Code Available | 0 |
| MultiMind: Enhancing Werewolf Agents with Multimodal Reasoning and Theory of Mind | Apr 25, 2025 | Large Language Model, Multimodal Reasoning | Unverified | 0 |
| Toward Personalizing Quantum Computing Education: An Evolutionary LLM-Powered Approach | Apr 24, 2025 | Hallucination, Large Language Model | Unverified | 0 |
| Unified Attacks to Large Language Model Watermarks: Spoofing and Scrubbing in Unauthorized Knowledge Distillation | Apr 24, 2025 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| Does Knowledge Distillation Matter for Large Language Model based Bundle Generation? | Apr 24, 2025 | In-Context Learning, Knowledge Distillation | Unverified | 0 |
| TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation | Apr 24, 2025 | Caption Generation, Dense Video Captioning | Unverified | 0 |
| Automatically Generating Rules of Malicious Software Packages via Large Language Model | Apr 24, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Towards Leveraging Large Language Model Summaries for Topic Modeling in Source Code | Apr 24, 2025 | Code Search, Language Modeling | Unverified | 0 |
| Exploring human-SAV interaction using large language models: The impact of psychological ownership and anthropomorphism on user experience | Apr 23, 2025 | Autonomous Vehicles, Large Language Model | Unverified | 0 |
| Monte Carlo Planning with Large Language Model for Text-Based Game Agents | Apr 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost | Apr 23, 2025 | Instruction Following, Language Modeling | Unverified | 0 |
| WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model | Apr 23, 2025 | Large Language Model | Unverified | 0 |
| Improving Significant Wave Height Prediction Using Chronos Models | Apr 23, 2025 | Computational Efficiency, Language Modeling | Unverified | 0 |
| Enhancing TCR-Peptide Interaction Prediction with Pretrained Language Models and Molecular Representations | Apr 22, 2025 | Benchmarking, Few-Shot Learning | Unverified | 0 |
| Research on Cloud Platform Network Traffic Monitoring and Anomaly Detection System based on Large Language Models | Apr 22, 2025 | Anomaly Detection, Computational Efficiency | Unverified | 0 |
| Do It For Me vs. Do It With Me: Investigating User Perceptions of Different Paradigms of Automation in Copilots for Feature-Rich Software | Apr 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| LLMs meet Federated Learning for Scalable and Secure IoT Management | Apr 22, 2025 | Computational Efficiency, Decision Making | Unverified | 0 |
| Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 | Apr 22, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| Large Language Model Empowered Privacy-Protected Framework for PHI Annotation in Clinical Notes | Apr 22, 2025 | De-identification, Language Modeling | Unverified | 0 |
| DATETIME: A new benchmark to measure LLM translation and reasoning capabilities | Apr 22, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning | Apr 22, 2025 | Large Language Model, reinforcement-learning | Unverified | 0 |
| Automated Bug Report Prioritization in Large Open-Source Projects | Apr 22, 2025 | Large Language Model, text-classification | Code Available | 0 |
| FaceInsight: A Multimodal Large Language Model for Face Perception | Apr 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Apr 21, 2025 | Code Generation, Instruction Following | Code Available | 0 |
| Speculative Sampling via Exponential Races | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark | Apr 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Automated Duplicate Bug Report Detection in Large Open Bug Repositories | Apr 21, 2025 | Large Language Model | Unverified | 0 |
| Kuwain 1.5B: An Arabic SLM via Language Injection | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Causal Disentanglement for Robust Long-tail Medical Image Generation | Apr 20, 2025 | counterfactual, Disentanglement | Unverified | 0 |
| Don't Retrieve, Generate: Prompting LLMs for Synthetic Training Data in Dense Retrieval | Apr 20, 2025 | Large Language Model, Retrieval | Unverified | 0 |
| ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task | Apr 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines | Apr 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models | Apr 19, 2025 | Deep Learning, Language Modeling | Unverified | 0 |