| Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems | Feb 16, 2025 | Open-Domain Question Answering, Question Answering | Code Available | 2 |
| NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM | Feb 16, 2025 | Navigate, RAG | Code Available | 2 |
| How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training | Feb 16, 2025 | | Code Available | 2 |
| FinMTEB: Finance Massive Text Embedding Benchmark | Feb 16, 2025 | Articles, Semantic Textual Similarity | Code Available | 2 |
| RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation | Feb 16, 2025 | graph construction, Knowledge Graphs | Code Available | 2 |
| Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First Time | Feb 16, 2025 | Decision Making, Language Modeling | Code Available | 2 |
| MasRouter: Learning to Route LLMs for Multi-Agent Systems | Feb 16, 2025 | HumanEval, mbpp | Code Available | 2 |
| SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding | Feb 15, 2025 | Question Answering, Streaming video understanding | Code Available | 2 |
| D-CIPHER: Dynamic Collaborative Intelligent Multi-Agent System with Planner and Heterogeneous Executors for Offensive Security | Feb 15, 2025 | Task Planning | Code Available | 2 |
| MonoForce: Learnable Image-conditioned Physics Engine | Feb 14, 2025 | Model Predictive Control, Trajectory Prediction | Code Available | 2 |
| A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations | Feb 14, 2025 | Survey | Code Available | 2 |
| Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning | Feb 14, 2025 | Reinforcement Learning (RL), Skills Assessment | Code Available | 2 |
| Process Reward Models for LLM Agents: Practical Framework and Directions | Feb 14, 2025 | | Code Available | 2 |
| Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal | Feb 14, 2025 | Denoising, Image Restoration | Code Available | 2 |
| CoSER: Coordinating LLM-Based Persona Simulation of Established Roles | Feb 13, 2025 | | Code Available | 2 |
| KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG | Feb 13, 2025 | Knowledge Graphs, Large Language Model | Code Available | 2 |
| Unlocking the Potential of Classic GNNs for Graph-level Tasks: Simple Architectures Meet Excellence | Feb 13, 2025 | Graph Classification, Graph Property Prediction | Code Available | 2 |
| DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra | Feb 13, 2025 | Decoder, De novo molecule generation from MS/MS spectrum (bonus chemical formulae) | Code Available | 2 |
| Diffusion Models for Molecules: A Survey of Methods and Tasks | Feb 13, 2025 | Diversity, Drug Discovery | Code Available | 2 |
| TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument | Feb 13, 2025 | Audio Generation, Decoder | Code Available | 2 |
| A Judge-free LLM Open-ended Generation Benchmark Based on the Distributional Hypothesis | Feb 13, 2025 | Text Generation | Code Available | 2 |
| Digi-Q: Learning Q-Value Functions for Training Device-Control Agents | Feb 13, 2025 | Q-Learning, Reinforcement Learning (RL) | Code Available | 2 |
| Harnessing Vision Models for Time Series Analysis: A Survey | Feb 13, 2025 | Survey, Time Series | Code Available | 2 |
| DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References | Feb 13, 2025 | Human-Object Interaction Detection, Imitation Learning | Code Available | 2 |
| CoT-Valve: Length-Compressible Chain-of-Thought Tuning | Feb 13, 2025 | GSM8K | Code Available | 2 |