| Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training | Apr 17, 2025 | Caption Generation, Hallucination | Unverified | 0 |
| Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification | Apr 17, 2025 | Diversity, Gaussian Processes | Code Available | 0 |
| Mixer Metaphors: audio interfaces for non-musical applications | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| BitNet b1.58 2B4T Technical Report | Apr 16, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Generative Recommendation with Continuous-Token Diffusion | Apr 16, 2025 | Denoising, Language Modeling | Unverified | 0 |
| Rethinking LLM-Based Recommendations: A Query Generation-Based, Training-Free Approach | Apr 16, 2025 | Diversity, Language Modeling | Unverified | 0 |
| d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| DVLTA-VQA: Decoupled Vision-Language Modeling with Text-Guided Adaptation for Blind Video Quality Assessment | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| Interpreting the linear structure of vision-language model embedding spaces | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions | Apr 16, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| Towards Conversational AI for Human-Machine Collaborative MLOps | Apr 16, 2025 | Language Modeling | Unverified | 0 |
| Recommending Clinical Trials for Online Patient Cases using Artificial Intelligence | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| Co-STAR: Collaborative Curriculum Self-Training with Adaptive Regularization for Source-Free Video Domain Adaptation | Apr 15, 2025 | Domain Adaptation, Language Modeling | Code Available | 0 |
| GraphicBench: A Planning Benchmark for Graphic Design with Language Agents | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| A Large-Language Model Framework for Relative Timeline Extraction from PubMed Case Reports | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation | Apr 15, 2025 | Diagnostic, Image Segmentation | Code Available | 0 |
| Large Language Model-Informed Feature Discovery Improves Prediction and Interpretation of Credibility Perceptions of Visual Content | Apr 15, 2025 | Diversity, Language Modeling | Unverified | 0 |
| ReZero: Enhancing LLM search ability by trying one-more-time | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| ProtFlow: Fast Protein Sequence Design via Flow Matching on Compressed Protein Language Model Embeddings | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning | Apr 15, 2025 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| DeepMLF: Multimodal language model with learnable tokens for deep fusion in sentiment analysis | Apr 15, 2025 | Decoder, Language Modeling | Unverified | 0 |
| Looking beyond the next token | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| Efficient Distributed Retrieval-Augmented Generation for Enhancing Language Model Performance | Apr 15, 2025 | Language Modeling | Unverified | 0 |
| A Survey of Large Language Model-Powered Spatial Intelligence Across Scales: Advances in Embodied Agents, Smart Cities, and Earth Science | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| Mavors: Multi-granularity Video Representation for Multimodal Large Language Model | Apr 14, 2025 | Computational Efficiency, Language Modeling | Unverified | 0 |
| SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model | Apr 14, 2025 | Anomaly Detection, Domain Adaptation | Unverified | 0 |
| Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families | Apr 14, 2025 | Decoder, Language Modeling | Unverified | 0 |
| α-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging | Apr 14, 2025 | Anomaly Detection, Diagnostic | Code Available | 1 |
| MorphTok: Morphologically Grounded Tokenization for Indian Languages | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Apr 14, 2025 | Language Modeling | Code Available | 2 |
| LangPert: Detecting and Handling Task-level Perturbations for Robust Object Rearrangement | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| Automated Testing of COBOL to Java Transformation | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| GNN-ACLP: Graph Neural Networks based Analog Circuit Link Prediction | Apr 14, 2025 | Language Modeling | Unverified | 0 |
| RealHarm: A Collection of Real-World Language Model Application Failures | Apr 14, 2025 | Language Modeling | Code Available | 0 |
| Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design | Apr 14, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model | Apr 13, 2025 | Language Modeling | Code Available | 2 |
| Domain-Adaptive Continued Pre-Training of Small Language Models | Apr 13, 2025 | Domain Adaptation, HellaSwag | Unverified | 0 |
| Kongzi: A Historical Large Language Model with Fact Enhancement | Apr 13, 2025 | Language Modeling | Unverified | 0 |
| Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation | Apr 13, 2025 | Domain Adaptation, Language Modeling | Code Available | 2 |
| Structure-Accurate Medical Image Translation via Dynamic Frequency Balance and Knowledge Guidance | Apr 13, 2025 | Clinical Knowledge, Language Modeling | Unverified | 0 |
| ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model | Apr 13, 2025 | Diagnostic, Language Modeling | Code Available | 2 |
| UXAgent: A System for Simulating Usability Testing of Web Design with LLM Agents | Apr 13, 2025 | Language Modeling | Unverified | 0 |
| AgentDynEx: Nudging the Mechanics and Dynamics of Multi-Agent Simulations | Apr 13, 2025 | Language Modeling | Unverified | 0 |
| AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents | Apr 13, 2025 | Language Modeling | Unverified | 0 |
| Fine-tuning a Large Language Model for Automating Computational Fluid Dynamics Simulations | Apr 13, 2025 | Computational Efficiency, Language Modeling | Code Available | 1 |