| Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs | Mar 8, 2025 | | Code Available | 3 |
| Learning and discovering multiple solutions using physics-informed neural networks with random initialization and deep ensemble | Mar 8, 2025 | Uncertainty Quantification | Code Available | 3 |
| GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images | Mar 8, 2025 | cross-modal alignment, Diagnostic | Code Available | 3 |
| GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving | Mar 7, 2025 | Autonomous Driving, Denoising | Code Available | 3 |
| MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio | Mar 7, 2025 | Video Generation | Code Available | 3 |
| Simulating the Real World: A Unified Survey of Multimodal Generative Models | Mar 6, 2025 | 3D Generation, Survey | Code Available | 3 |
| L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning | Mar 6, 2025 | | Code Available | 3 |
| SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing | Mar 6, 2025 | Articles, Survey | Code Available | 3 |
| EgoLife: Towards Egocentric Life Assistant | Mar 5, 2025 | Question Answering, Video Understanding | Code Available | 3 |
| Parallelized Planning-Acting for Efficient LLM-based Multi-Agent Systems | Mar 5, 2025 | Decision Making, Language Modeling | Code Available | 3 |
| All-atom Diffusion Transformers: Unified generative modelling of molecules and materials | Mar 5, 2025 | All, Unconditional Crystal Generation | Code Available | 3 |
| Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation | Mar 4, 2025 | Contact-rich Manipulation, Imitation Learning | Code Available | 3 |
| A Phylogenetic Approach to Genomic Language Modeling | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding | Mar 4, 2025 | HumanEval, mbpp | Code Available | 3 |
| OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale | Mar 4, 2025 | Text to SQL, Text-To-SQL | Code Available | 3 |
| Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection | Mar 4, 2025 | Anomaly Detection, Multi-class Anomaly Detection | Code Available | 3 |
| Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures | Mar 3, 2025 | Crack Segmentation, Mamba | Code Available | 3 |
| LiteGS: A High-Performance Modular Framework for Gaussian Splatting Training | Mar 3, 2025 | 3DGS, GPU | Code Available | 3 |
| Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs | Mar 3, 2025 | Reinforcement Learning (RL) | Code Available | 3 |
| MUSt3R: Multi-view Network for Stereo 3D Reconstruction | Mar 3, 2025 | 3D Reconstruction, Articles | Code Available | 3 |
| UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Mar 3, 2025 | Instance Segmentation, Reasoning Segmentation | Code Available | 3 |
| Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation | Mar 3, 2025 | 3D Generation, 3D Reconstruction | Code Available | 3 |
| PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization | Mar 3, 2025 | | Code Available | 3 |
| Proteina: Scaling Flow-based Protein Structure Generative Models | Mar 2, 2025 | Protein Design | Code Available | 3 |