| Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | May 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| lmgame-Bench: How Good are LLMs at Playing Games? | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| Privacy-Preserving Conformal Prediction Under Local Differential Privacy | May 21, 2025 | Conformal Prediction, Large Language Model | Code Available | 0 |
| PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | May 21, 2025 | Large Language Model, scientific discovery | Code Available | 1 |
| Listen to the Context: Towards Faithful Large Language Models for Retrieval Augmented Generation on Climate Questions | May 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Self-GIVE: Associative Thinking from Limited Structured Knowledge for Enhanced Large Language Model Reasoning | May 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| ClickSight: Interpreting Student Clickstreams to Reveal Insights on Learning Strategies via LLMs | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation | May 21, 2025 | In-Context Learning, Large Language Model | Unverified | 0 |
| Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory | May 21, 2025 | Benchmarking, Language Modeling | Code Available | 0 |
| Modality-Balancing Preference Optimization of Large Multimodal Models by Adversarial Negative Mining | May 20, 2025 | Large Language Model | Unverified | 0 |
| FlowBERT: Prompt-tuned BERT for variable flow field prediction | May 20, 2025 | Dimensionality Reduction, Few-Shot Learning | Unverified | 0 |
| Semi-Clairvoyant Scheduling of Speculative Decoding Requests to Minimize LLM Inference Latency | May 20, 2025 | Large Language Model, Scheduling | Unverified | 0 |
| Large Language Model-Driven Distributed Integrated Multimodal Sensing and Semantic Communications | May 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Automated Journalistic Questions: A New Method for Extracting 5W1H in French | May 20, 2025 | Articles, Language Modeling | Unverified | 0 |
| Reliable Decision Support with LLMs: A Framework for Evaluating Consistency in Binary Text Classification Applications | May 20, 2025 | Articles, Binary text classification | Unverified | 0 |
| DSMentor: Enhancing Data Science Agents with Curriculum Learning and Online Knowledge Accumulation | May 20, 2025 | In-Context Learning, Inference Optimization | Unverified | 0 |
| LLM-based Evaluation Policy Extraction for Ecological Modeling | May 20, 2025 | Benchmarking, Large Language Model | Unverified | 0 |
| Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity | May 20, 2025 | GPU, Large Language Model | Code Available | 0 |
| Guarded Query Routing for Large Language Models | May 20, 2025 | Large Language Model, text-classification | Code Available | 0 |
| TRATES: Trait-Specific Rubric-Assisted Cross-Prompt Essay Scoring | May 20, 2025 | Automated Essay Scoring, Language Modeling | Unverified | 0 |
| BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks | May 20, 2025 | Large Language Model, Minecraft | Code Available | 0 |
| UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation | May 20, 2025 | Image Generation, Language Modeling | Unverified | 0 |
| Memory-Centric Embodied Question Answer | May 20, 2025 | Embodied Question Answering, Large Language Model | Unverified | 0 |
| Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning | May 20, 2025 | Large Language Model, Mathematical Reasoning | Unverified | 0 |
| FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation | May 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 |