| Better Process Supervision with Bi-directional Rewarding Signals | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model | Mar 6, 2025 | General Knowledge, Image Captioning | Code Available | 2 |
| Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach | Mar 6, 2025 | GPU, Language Modeling | Unverified | 0 |
| TPC: Cross-Temporal Prediction Connection for Vision-Language Model Hallucination Reduction | Mar 6, 2025 | Hallucination, Language Modeling | Unverified | 0 |
| LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression | Mar 6, 2025 | Benchmarking, Common Sense Reasoning | Code Available | 0 |
| AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM | Mar 6, 2025 | Anomaly Detection, Language Modeling | Code Available | 2 |
| KidneyTalk-open: No-code Deployment of a Private Large Language Model with Medical Documentation-Enhanced Knowledge Database for Kidney Disease | Mar 6, 2025 | Chunking, Language Modeling | Code Available | 0 |
| PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks | Mar 6, 2025 | document understanding, Language Modeling | Unverified | 0 |
| Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | Mar 6, 2025 | GPU, Hyperparameter Optimization | Unverified | 0 |
| Measuring temporal effects of agent knowledge by date-controlled tool use | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| AgentSafe: Safeguarding Large Language Model-based Multi-agent Systems via Hierarchical Data Management | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| AOLO: Analysis and Optimization For Low-Carbon Oriented Wireless Large Language Model Services | Mar 6, 2025 | Deep Reinforcement Learning, Language Modeling | Unverified | 0 |
| The Next Frontier of LLM Applications: Open Ecosystems and Hardware Synergy | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Scaling Rich Style-Prompted Text-to-Speech Datasets | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities | Mar 6, 2025 | Audio captioning, Language Modeling | Unverified | 0 |
| An Egocentric Vision-Language Model based Portable Real-time Smart Assistant | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| L^2M: Mutual Information Scaling Law for Long-Context Language Modeling | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| From Idea to CAD: A Language Model-Driven Multi-Agent System for Collaborative Design | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Know Thy Judge: On the Robustness Meta-Evaluation of LLM Safety Judges | Mar 6, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| Human Preferences for Constructive Interactions in Language Model Alignment | Mar 5, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions | Mar 5, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Personalized Federated Fine-tuning for Heterogeneous Data: An Automatic Rank Learning Approach via Two-Level LoRA | Mar 5, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm | Mar 5, 2025 | Collision Avoidance, Fairness | Unverified | 0 |
| Parallelized Planning-Acting for Efficient LLM-based Multi-Agent Systems | Mar 5, 2025 | Decision Making, Language Modeling | Code Available | 3 |
| Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues | Mar 5, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization | Mar 5, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability | Mar 5, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| PAIR: A Novel Large Language Model-Guided Selection Strategy for Evolutionary Algorithms | Mar 5, 2025 | Diversity, Evolutionary Algorithms | Code Available | 0 |
| Tabby: Tabular Data Synthesis with Language Models | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Hierarchical Re-ranker Retriever (HRR) | Mar 4, 2025 | Information Retrieval, Language Modeling | Unverified | 0 |
| Vision-Language Model IP Protection via Prompt-based Learning | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| DriveGen: Towards Infinite Diverse Traffic Scenarios with Large Models | Mar 4, 2025 | Autonomous Driving, Diversity | Unverified | 0 |
| A Phylogenetic Approach to Genomic Language Modeling | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| Trust, Experience, and Innovation: Key Factors Shaping American Attitudes About AI | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Adapting Decoder-Based Language Models for Diverse Encoder Downstream Tasks | Mar 4, 2025 | Decoder, Language Modeling | Unverified | 0 |
| Optimizing open-domain question answering with graph-based retrieval augmented generation | Mar 4, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| Iterative Value Function Optimization for Guided Decoding | Mar 4, 2025 | Decision Making, Instruction Following | Unverified | 0 |
| Language Models can Self-Improve at State-Value Estimation for Better Search | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model | Mar 4, 2025 | es-en, Language Modeling | Code Available | 1 |
| AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| ATLaS: Agent Tuning via Learning Critical Steps | Mar 4, 2025 | Decision Making, Language Modeling | Unverified | 0 |
| Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent | Mar 4, 2025 | Decision Making, Language Modeling | Code Available | 0 |
| Haste Makes Waste: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments | Mar 4, 2025 | 2D Panoptic Segmentation, Graph Generation | Code Available | 2 |
| RedChronos: A Large Language Model-Based Log Analysis System for Insider Threat Detection in Enterprises | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Words or Vision: Do Vision-Language Models Have Blind Faith in Text? | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation | Mar 3, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Forgetting Transformer: Softmax Attention with a Forget Gate | Mar 3, 2025 | Language Modeling, Language Modelling | Code Available | 2 |