| System of Agentic AI for the Discovery of Metal-Organic Frameworks | Apr 18, 2025 | Language Modeling | Unverified | 0 |
| PV-VLM: A Multimodal Vision-Language Approach Incorporating Sky Images for Intra-Hour Photovoltaic Power Forecasting | Apr 18, 2025 | energy management, Language Modeling | Unverified | 0 |
| Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization | Apr 18, 2025 | Action Localization, Anomaly Detection | Unverified | 0 |
| Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Apr 18, 2025 | Anomaly Detection, Autonomous Driving | Code Available | 0 |
| Zero-Shot Industrial Anomaly Segmentation with Image-Aware Prompt Generation | Apr 18, 2025 | Anomaly Segmentation, Language Modeling | Code Available | 0 |
| Scaling sparse feature circuit finding for in-context learning | Apr 18, 2025 | In-Context Learning, Large Language Model | Unverified | 0 |
| RAG Without the Lag: Interactive Debugging for Retrieval-Augmented Generation Pipelines | Apr 18, 2025 | Language Modeling | Unverified | 0 |
| Are Retrials All You Need? Enhancing Large Language Model Reasoning Without Verbalized Feedback | Apr 17, 2025 | All, Language Modeling | Unverified | 0 |
| Causal-Copilot: An Autonomous Causal Analysis Agent | Apr 17, 2025 | Causal Discovery, Causal Inference | Unverified | 0 |
| ChatEXAONEPath: An Expert-level Multimodal Large Language Model for Histopathology Using Whole Slide Images | Apr 17, 2025 | Language Modeling | Unverified | 0 |
| Pandora: A Code-Driven Large Language Model Agent for Unified Reasoning Across Diverse Structured Knowledge | Apr 17, 2025 | Knowledge Graphs, Language Modeling | Unverified | 0 |
| Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration | Apr 17, 2025 | Geometry Problem Solving, Large Language Model | Code Available | 1 |
| Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks | Apr 17, 2025 | Epistemic Reasoning, Large Language Model | Code Available | 0 |
| Retrieval-Augmented Generation with Conflicting Evidence | Apr 17, 2025 | Large Language Model, Misinformation | Code Available | 1 |
| DIDS: Domain Impact-aware Data Sampling for Large Language Model Training | Apr 17, 2025 | Dimensionality Reduction, Language Modeling | Unverified | 0 |
| EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery | Apr 17, 2025 | Large Language Model, Multi-Task Learning | Unverified | 0 |
| SkyReels-V2: Infinite-length Film Generative Model | Apr 17, 2025 | Large Language Model, model | Code Available | 9 |
| Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification | Apr 17, 2025 | Diversity, Gaussian Processes | Code Available | 0 |
| SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Apr 17, 2025 | Image Generation, Large Language Model | Code Available | 1 |
| Mixer Metaphors: audio interfaces for non-musical applications | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| BitNet b1.58 2B4T Technical Report | Apr 16, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM | Apr 16, 2025 | Large Language Model, Text-to-Video Generation | Unverified | 0 |
| Trusting CHATGPT: how minor tweaks in the prompts lead to major differences in sentiment classification | Apr 16, 2025 | Large Language Model, Sentiment Analysis | Unverified | 0 |
| AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection | Apr 16, 2025 | Anomaly Detection, Large Language Model | Code Available | 1 |
| Generative Recommendation with Continuous-Token Diffusion | Apr 16, 2025 | Denoising, Language Modeling | Unverified | 0 |
| Rethinking LLM-Based Recommendations: A Query Generation-Based, Training-Free Approach | Apr 16, 2025 | Diversity, Language Modeling | Unverified | 0 |
| HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks | Apr 16, 2025 | High-Level Synthesis, Large Language Model | Code Available | 1 |
| Position: The Most Expensive Part of an LLM should be its Training Data | Apr 16, 2025 | Large Language Model, Position | Unverified | 0 |
| Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | Apr 16, 2025 | CPU, GPU | Unverified | 0 |
| Towards Conversational AI for Human-Machine Collaborative MLOps | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| Recommending Clinical Trials for Online Patient Cases using Artificial Intelligence | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| GraphicBench: A Planning Benchmark for Graphic Design with Language Agents | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| A Large-Language Model Framework for Relative Timeline Extraction from PubMed Case Reports | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| Video Summarization with Large Language Models | Apr 15, 2025 | Large Language Model, Video Summarization | Unverified | 0 |
| When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers | Apr 15, 2025 | Binary Classification, Domain Generalization | Unverified | 0 |
| Large Language Model-Informed Feature Discovery Improves Prediction and Interpretation of Credibility Perceptions of Visual Content | Apr 15, 2025 | Diversity, Language Modeling | Unverified | 0 |
| ReZero: Enhancing LLM search ability by trying one-more-time | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning | Apr 15, 2025 | Automated Theorem Proving, Large Language Model | Code Available | 3 |
| Learning to Be A Doctor: Searching for Effective Medical Agent Architectures | Apr 15, 2025 | AutoML, Diagnostic | Unverified | 0 |
| The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections | Apr 15, 2025 | Large Language Model | Unverified | 0 |
| Evaluation Report on MCP Servers | Apr 15, 2025 | Large Language Model | Code Available | 3 |
| Transferable text data distillation by trajectory matching | Apr 14, 2025 | ARC, Large Language Model | Unverified | 0 |
| A Survey of Large Language Model-Powered Spatial Intelligence Across Scales: Advances in Embodied Agents, Smart Cities, and Earth Science | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| Investigating cybersecurity incidents using large language models in latest-generation wireless networks | Apr 14, 2025 | Binary Classification, Data Poisoning | Unverified | 0 |
| LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks | Apr 14, 2025 | Large Language Model, Machine Unlearning | Code Available | 0 |
| Mavors: Multi-granularity Video Representation for Multimodal Large Language Model | Apr 14, 2025 | Computational Efficiency, Language Modeling | Unverified | 0 |
| The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Apr 14, 2025 | Language Modeling | Code Available | 2 |
| InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| LangPert: Detecting and Handling Task-level Perturbations for Robust Object Rearrangement | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| Automated Testing of COBOL to Java Transformation | Apr 14, 2025 | Language Modeling | Unverified | 0 |