| Continually Self-Improving Language Models for Bariatric Surgery Question-Answering | May 22, 2025 | Large Language Model, Misinformation | Unverified | 0 |
| A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | May 22, 2025 | Combinatorial Optimization, Language Modeling | Code Available | 1 |
| PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models | May 22, 2025 | GSM8K, Large Language Model | Unverified | 0 |
| CTRAP: Embedding Collapse Trap to Safeguard Large Language Models from Harmful Fine-Tuning | May 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions | May 22, 2025 | Large Language Model, Thompson Sampling | Unverified | 0 |
| Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine | May 22, 2025 | Causal Inference, Drug Discovery | Unverified | 0 |
| Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks | May 22, 2025 | Code Generation, Language Modeling | Unverified | 0 |
| Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification | May 21, 2025 | Data Augmentation, Large Language Model | Unverified | 0 |
| Reward Is Enough: LLMs Are In-Context Reinforcement Learners | May 21, 2025 | Large Language Model, Reinforcement Learning (RL) | Unverified | 0 |
| Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | May 21, 2025 | Bias Detection, In-Context Learning | Unverified | 0 |
| Forging Time Series with Language: A Large Language Model Approach to Synthetic Data Generation | May 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction | May 21, 2025 | Benchmarking, Hallucination | Unverified | 0 |
| CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution | May 21, 2025 | Large Language Model, Task Planning | Code Available | 1 |
| AutoData: A Multi-Agent System for Open Web Data Collection | May 21, 2025 | Large Language Model | Code Available | 0 |
| How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior | May 21, 2025 | Large Language Model, Management | Code Available | 1 |
| Aligning Dialogue Agents with Global Feedback via Large Language Model Reward Decomposition | May 21, 2025 | Dialogue Generation, Language Modeling | Unverified | 0 |
| Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval | May 21, 2025 | Attribute, Image Retrieval | Unverified | 0 |
| MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | May 21, 2025 | Emotion Recognition, Face Detection | Unverified | 0 |
| LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling | May 21, 2025 | Diagnostic, Large Language Model | Unverified | 0 |
| Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective | May 21, 2025 | Instruction Following, Language Modeling | Unverified | 0 |
| Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | May 21, 2025 | Benchmarking, Language Modeling | Code Available | 0 |
| X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| CP-LLM: Context and Pixel Aware Large Language Model for Video Quality Assessment | May 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | May 21, 2025 | Large Language Model, Multimodal Large Language Model | Code Available | 2 |