| Faithful Logical Reasoning via Symbolic Chain-of-Thought | May 28, 2024 | Logical Reasoning | Code Available | 3 |
| Multimodal Table Understanding | Jun 12, 2024 | Language Modelling | Code Available | 3 |
| KV-Edit: Training-Free Image Editing for Precise Background Preservation | Feb 24, 2025 | Text-based Image Editing | Code Available | 3 |
| DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving | Aug 1, 2024 | | Code Available | 3 |
| VideoGen-Eval: Agent-based System for Video Generation Evaluation | Mar 30, 2025 | Diversity, Video Generation | Code Available | 3 |
| LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis | May 5, 2025 | Chatbot, Decoder | Code Available | 3 |
| JAFAR: Jack up Any Feature at Any Resolution | Jun 10, 2025 | Feature Upsampling | Code Available | 3 |
| VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation | Dec 30, 2024 | Video Generation, Video Quality Assessment | Code Available | 3 |
| GENERator: A Long-Context Generative Genomic Foundation Model | Feb 11, 2025 | model | Code Available | 3 |
| EVEv2: Improved Baselines for Encoder-Free Vision-Language Models | Feb 10, 2025 | Decoder | Code Available | 3 |
| SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model | Nov 21, 2024 | Language Modelling | Code Available | 3 |
| Half-Inverse Gradients for Physical Deep Learning | Mar 18, 2022 | Deep Learning | Code Available | 3 |
| pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction | Dec 19, 2023 | 3D Reconstruction, Generalizable Novel View Synthesis | Code Available | 3 |
| OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models | Feb 15, 2024 | Language Modelling | Code Available | 3 |
| DisCo: Disentangled Control for Realistic Human Dance Generation | Jun 30, 2023 | Attribute | Code Available | 3 |
| ∇²DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials | Jun 20, 2024 | Drug Discovery, Molecular Property Prediction | Code Available | 3 |
| DARWIN 1.5: Large Language Models as Materials Science Adapted Learners | Dec 16, 2024 | Large Language Model, Multi-Task Learning | Code Available | 3 |
| NeuralOM: Neural Ocean Model for Subseasonal-to-Seasonal Simulation | May 27, 2025 | Computational Efficiency, Graph Neural Network | Code Available | 3 |
| A Comprehensive Survey on Segment Anything Model for Vision and Beyond | May 14, 2023 | | Code Available | 3 |
| HLOB -- Information Persistence and Structure in Limit Order Books | May 29, 2024 | Deep Learning | Code Available | 3 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection, 3D Object Tracking | Code Available | 3 |
| Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap | Jan 3, 2025 | Recommendation Systems, World Knowledge | Code Available | 3 |
| Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians | Mar 21, 2024 | Binarization | Code Available | 3 |
| Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow | Oct 9, 2024 | | Code Available | 3 |
| Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams | Jun 12, 2024 | cross-modal alignment, Language Modelling | Code Available | 3 |
| Opportunities and Risks of LLMs for Scalable Deliberation with Polis | Jun 20, 2023 | | Code Available | 3 |
| RePlay: a Recommendation Framework for Experimentation and Production Use | Sep 11, 2024 | Recommendation Systems | Code Available | 3 |
| Deep Reinforcement Learning | Oct 15, 2018 | Deep Reinforcement Learning, Management | Code Available | 3 |
| SAM3D: Segment Anything in 3D Scenes | Jun 6, 2023 | Segmentation | Code Available | 3 |
| MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views | Nov 7, 2024 | 3DGS, 3D Reconstruction | Code Available | 3 |
| Generalizing Motion Planners with Mixture of Experts for Autonomous Driving | Oct 21, 2024 | Autonomous Driving, Data Augmentation | Code Available | 3 |
| Learning to Reason without External Rewards | May 26, 2025 | Code Generation, reinforcement-learning | Code Available | 3 |
| XRDSLAM: A Flexible and Modular Framework for Deep Learning based SLAM | Oct 31, 2024 | 3DGS, Benchmarking | Code Available | 3 |
| UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation | Mar 29, 2024 | Image Segmentation, Lesion Segmentation | Code Available | 3 |
| Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning | Mar 8, 2025 | Reranking | Code Available | 3 |
| Pipeline Gradient-based Model Training on Analog In-memory Accelerators | Oct 19, 2024 | | Code Available | 3 |
| General Geospatial Inference with a Population Dynamics Foundation Model | Nov 11, 2024 | Benchmarking, Graph Neural Network | Code Available | 3 |
| Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion | Dec 5, 2024 | Contrastive Learning, Hallucination | Code Available | 3 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Apr 8, 2024 | GPU, Multiple-choice | Code Available | 3 |
| PuzzleAvatar: Assembling 3D Avatars from Personal Albums | May 23, 2024 | Language Modelling, Text to 3D | Code Available | 3 |
| GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering | Feb 15, 2024 | 3D Reconstruction, Novel View Synthesis | Code Available | 3 |
| Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | Jan 1, 2024 | Domain Generalization, Semantic Segmentation | Code Available | 3 |
| Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields | May 4, 2025 | Mixture-of-Experts, NeRF | Code Available | 3 |
| Self-Refine: Iterative Refinement with Self-Feedback | Mar 30, 2023 | Mathematical Reasoning, Response Generation | Code Available | 3 |
| Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model | Mar 14, 2025 | Image to Video Generation, Video Generation | Code Available | 3 |
| MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems | May 22, 2025 | | Code Available | 3 |
| LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry | Jan 3, 2024 | Point Tracking, Visual Odometry | Code Available | 3 |
| Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey | Feb 8, 2024 | Articles, Entity Alignment | Code Available | 3 |
| A Short Review and Evaluation of SAM2's Performance in 3D CT Image Segmentation | Aug 20, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 3 |
| Score-Guided Diffusion for 3D Human Recovery | Mar 14, 2024 | Denoising, Human Mesh Recovery | Code Available | 3 |