| Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms | May 23, 2025 | Language Modeling | Code Available | 1 |
| Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | May 23, 2025 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning | May 23, 2025 | Image Generation, Language Modeling | Code Available | 1 |
| ChemMLLM: Chemical Multimodal Large Language Model | May 22, 2025 | Language Modeling | Code Available | 1 |
| A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | May 22, 2025 | Combinatorial Optimization, Language Modeling | Code Available | 1 |
| Speculative Decoding Reimagined for Multimodal Large Language Models | May 20, 2025 | Language Modeling | Code Available | 1 |
| U-SAM: An audio language Model for Unified Speech, Audio, and Music Understanding | May 20, 2025 | cross-modal alignment, Language Modeling | Code Available | 1 |
| R3: Robust Rubric-Agnostic Reward Models | May 19, 2025 | Language Modeling | Code Available | 1 |
| 3D Visual Illusion Depth Estimation | May 19, 2025 | Common Sense Reasoning, Depth Estimation | Code Available | 1 |
| Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation | May 16, 2025 | Decision Making, Language Modeling | Code Available | 1 |
| Unifying Segment Anything in Microscopy with Multimodal Large Language Model | May 16, 2025 | Language Modeling | Code Available | 1 |
| Multi-Token Prediction Needs Registers | May 15, 2025 | Language Modeling | Code Available | 1 |
| ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts | May 15, 2025 | Continual Learning, Language Modeling | Code Available | 1 |
| Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | May 13, 2025 | 3D visual grounding, Autonomous Driving | Code Available | 1 |
| Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning | May 12, 2025 | Language Modeling | Code Available | 1 |
| Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks | May 12, 2025 | Kolmogorov-Arnold Networks, Language Modeling | Code Available | 1 |
| MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks | May 9, 2025 | Diagnostic, Instruction Following | Code Available | 1 |
| CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization | May 5, 2025 | Diversity, Language Modeling | Code Available | 1 |
| WirelessAgent: Large Language Model Agents for Intelligent Wireless Networks | May 2, 2025 | Language Modeling | Code Available | 1 |
| Visual Test-time Scaling for GUI Agent Grounding | May 1, 2025 | Language Modeling | Code Available | 1 |
| MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework | Apr 30, 2025 | Decision Making, Language Modeling | Code Available | 1 |
| Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding | Apr 29, 2025 | Code Generation, Density Estimation | Code Available | 1 |
| PhenoAssistant: A Conversational Multi-Agent AI System for Automated Plant Phenotyping | Apr 28, 2025 | Language Modeling | Code Available | 1 |
| LEAM: A Prompt-only Large Language Model-enabled Antenna Modeling Method | Apr 25, 2025 | Language Modeling | Code Available | 1 |
| LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement | Apr 22, 2025 | Benchmarking, Language Modeling | Code Available | 1 |