| STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection | Apr 3, 2025 | Instruction Following, Language Modeling | Code Available | 1 |
| LLM Social Simulations Are a Promising Research Method | Apr 3, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Prompt Optimization with Logged Bandit Data | Apr 3, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Apr 2, 2025 | Continual Learning, Continual Pretraining | Code Available | 1 |
| A Survey of Scaling in Large Language Model Reasoning | Apr 2, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks | Apr 2, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| BioAtt: Anatomical Prior Driven Low-Dose CT Denoising | Apr 2, 2025 | Denoising, Language Modeling | Unverified | 0 |
| STPNet: Scale-aware Text Prompt Network for Medical Image Segmentation | Apr 2, 2025 | Image Segmentation, Language Modeling | Code Available | 1 |
| Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing | Apr 2, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| LLM-VPRF: Large Language Model Based Vector Pseudo Relevance Feedback | Apr 2, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Biomedical Question Answering via Multi-Level Summarization on a Local Knowledge Graph | Apr 2, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training | Apr 2, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| LLM-mediated Dynamic Plan Generation with a Multi-Agent Approach | Apr 2, 2025 | Autonomous Vehicles, Language Modeling | Unverified | 0 |
| TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication | Apr 2, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Representation Bending for Large Language Model Safety | Apr 2, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| When Persuasion Overrides Truth in Multi-Agent LLM Debates: Introducing a Confidence-Weighted Persuasion Override Rate (CW-POR) | Apr 1, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Unleashing the Power of Pre-trained Encoders for Universal Adversarial Attack Detection | Apr 1, 2025 | Adversarial Attack, Adversarial Attack Detection | Unverified | 0 |
| VerifiAgent: a Unified Verification Agent in Language Model Reasoning | Apr 1, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Multi-Token Attention | Apr 1, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Command A: An Enterprise-Ready Large Language Model | Apr 1, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Detecting PTSD in Clinical Interviews: A Comparative Analysis of NLP Methods and Large Language Models | Apr 1, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Automated detection of atomicity violations in large-scale systems | Apr 1, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| ShieldGemma 2: Robust and Tractable Image Content Moderation | Apr 1, 2025 | Image Generation, Language Modeling | Unverified | 0 |
| 4th PVUW MeViS 3rd Place Report: Sa2VA | Apr 1, 2025 | Language Modeling, Language Modelling | Code Available | 5 |
| CrowdVLM-R1: Expanding R1 Ability to Vision Language Model for Crowd Counting using Fuzzy Group Relative Policy Reward | Mar 31, 2025 | Crowd Counting, Language Modeling | Code Available | 1 |