| Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay | Jun 5, 2025 | Reinforcement Learning (RL) | Code Available | 1 |
| FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation | Jun 5, 2025 | Denoising, Video Generation | Code Available | 1 |
| Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification | Jun 5, 2025 | Hate Speech Detection | Unverified | 0 |
| Through-the-Wall Radar Human Activity Recognition WITHOUT Using Neural Networks | Jun 5, 2025 | Activity Recognition, Human Activity Recognition | Code Available | 0 |
| StatsMerging: Statistics-Guided Model Merging via Task-Specific Teacher Distillation | Jun 5, 2025 | Knowledge Distillation | Code Available | 0 |
| Rethinking Contrastive Learning in Session-based Recommendation | Jun 5, 2025 | Contrastive Learning, Self-Supervised Learning | Code Available | 0 |
| Selecting Demonstrations for Many-Shot In-Context Learning via Gradient Matching | Jun 5, 2025 | In-Context Learning | Code Available | 0 |
| MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines | Jun 5, 2025 | | Code Available | 0 |
| Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models | Jun 5, 2025 | Reranking, Retrieval | Code Available | 5 |
| Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models | Jun 5, 2025 | counterfactual, Data Augmentation | Code Available | 0 |
| Dissecting Long Reasoning Models: An Empirical Study | Jun 5, 2025 | Reinforcement Learning (RL) | Code Available | 0 |
| Composing Agents to Minimize Worst-case Risk | Jun 5, 2025 | Fairness | Code Available | 0 |
| Tuning the Right Foundation Models is What you Need for Partial Label Learning | Jun 5, 2025 | Model Selection, Partial Label Learning | Code Available | 1 |
| ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation | Jun 5, 2025 | 3D Reconstruction, NeRF | Unverified | 0 |
| Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis | Jun 5, 2025 | GPU, Multi-Label Classification | Unverified | 0 |
| Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels | Jun 5, 2025 | Semantic correspondence | Unverified | 0 |
| Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning | Jun 5, 2025 | Mathematical Reasoning, Problem Decomposition | Unverified | 0 |
| A MISMATCHED Benchmark for Scientific Natural Language Inference | Jun 5, 2025 | Articles, Natural Language Inference | Code Available | 0 |
| Grounding Beyond Detection: Enhancing Contextual Understanding in Embodied 3D Grounding | Jun 5, 2025 | | Code Available | 0 |
| VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Jun 5, 2025 | Autonomous Driving, Autonomous Navigation | Code Available | 2 |
| Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents | Jun 5, 2025 | | Code Available | 0 |
| Identifying Reliable Evaluation Metrics for Scientific Text Revision | Jun 5, 2025 | Instruction Following | Code Available | 0 |
| Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models | Jun 5, 2025 | Diagnostic, Hallucination | Code Available | 1 |
| HALoS: Hierarchical Asynchronous Local SGD over Slow Networks for Geo-Distributed Large Language Model Training | Jun 5, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Controlling Summarization Length Through EOS Token Weighting | Jun 5, 2025 | Decoder, Text Generation | Unverified | 0 |
| TALL -- A Trainable Architecture for Enhancing LLM Performance in Low-Resource Languages | Jun 5, 2025 | Computational Efficiency, Translation | Unverified | 0 |
| Quantifying Cross-Modality Memorization in Vision-Language Models | Jun 5, 2025 | Machine Unlearning, Memorization | Unverified | 0 |
| DSG-World: Learning a 3D Gaussian World Model from Dual State Videos | Jun 5, 2025 | 3D Reconstruction | Unverified | 0 |
| Stable Vision Concept Transformers for Medical Diagnosis | Jun 5, 2025 | Medical Diagnosis | Unverified | 0 |
| MARBLE: Material Recomposition and Blending in CLIP-Space | Jun 5, 2025 | Attribute, Denoising | Unverified | 0 |
| ProRefine: Inference-time Prompt Refinement with Textual Feedback | Jun 5, 2025 | Mathematical Reasoning | Unverified | 0 |
| UNO: Unlearning via Orthogonalization in Generative models | Jun 5, 2025 | | Code Available | 0 |
| Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning | Jun 5, 2025 | Question Answering, RAG | Code Available | 0 |
| Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation | Jun 5, 2025 | Benchmarking | Code Available | 0 |
| ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition | Jun 5, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| Prompting LLMs: Length Control for Isometric Machine Translation | Jun 5, 2025 | de-en, Machine Translation | Unverified | 0 |
| OpenAg: Democratizing Agricultural Intelligence | Jun 5, 2025 | Knowledge Graphs, Transfer Learning | Unverified | 0 |
| Search Arena: Analyzing Search-Augmented LLMs | Jun 5, 2025 | Fact Checking | Code Available | 2 |
| BSBench: will your LLM find the largest prime number? | Jun 5, 2025 | Benchmarking | Code Available | 0 |
| Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers | Jun 5, 2025 | GPU, Text-to-Video Generation | Unverified | 0 |
| RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion | Jun 5, 2025 | Novel View Synthesis, Object | Unverified | 0 |
| SAM-aware Test-time Adaptation for Universal Medical Image Segmentation | Jun 5, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 0 |
| A Reasoning-Based Approach to Cryptic Crossword Clue Solving | Jun 5, 2025 | | Code Available | 0 |
| FedAPM: Federated Learning via ADMM with Partial Model Personalization | Jun 5, 2025 | Federated Learning | Code Available | 0 |
| Predicting ICU In-Hospital Mortality Using Adaptive Transformer Layer Fusion | Jun 5, 2025 | | Code Available | 0 |
| Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit | Jun 5, 2025 | Dictionary Learning | Unverified | 0 |
| Contrastive Flow Matching | Jun 5, 2025 | | Code Available | 2 |
| Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation | Jun 5, 2025 | Decision Making, Multimodal Reasoning | Unverified | 0 |
| LSM-2: Learning from Incomplete Wearable Sensor Data | Jun 5, 2025 | Diagnostic, Imputation | Unverified | 0 |
| Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback | Jun 5, 2025 | Math | Unverified | 0 |