| SAM-Med2D | Aug 30, 2023 | Decoder, Image Segmentation | Code Available | 3 |
| Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents | Feb 8, 2024 | Autonomous Driving, Language Modeling | Code Available | 3 |
| MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Apr 8, 2024 | Image Generation, Image-to-Image Translation | Code Available | 3 |
| DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | Mar 11, 2024 | Disentanglement | Code Available | 3 |
| GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation | Jun 10, 2024 | 3D Generation, NeRF | Code Available | 3 |
| Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details | Jun 19, 2025 | Texture Synthesis | Code Available | 3 |
| ResearchTown: Simulator of Human Research Community | Dec 23, 2024 | | Code Available | 3 |
| From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision | Dec 15, 2024 | Active Learning | Code Available | 3 |
| How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs | Jan 12, 2024 | | Code Available | 3 |
| LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion | Nov 4, 2023 | Benchmarking, Imitation Learning | Code Available | 3 |
| TorchDrug: A Powerful and Flexible Machine Learning Platform for Drug Discovery | Feb 16, 2022 | BIG-bench Machine Learning, Drug Discovery | Code Available | 3 |
| MathArena: Evaluating LLMs on Uncontaminated Math Competitions | May 29, 2025 | Math, Mathematical Reasoning | Code Available | 3 |
| Frequency-aware Feature Fusion for Dense Image Prediction | Aug 23, 2024 | Prediction | Code Available | 3 |
| VoiceBench: Benchmarking LLM-Based Voice Assistants | Oct 22, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation | Mar 18, 2024 | 3D Generation, 3D Reconstruction | Code Available | 3 |
| MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents | Jan 24, 2025 | Benchmarking | Code Available | 3 |
| GS-SDF: LiDAR-Augmented Gaussian Splatting and Neural SDF for Geometrically Consistent Rendering and Reconstruction | Mar 13, 2025 | Autonomous Driving, Surface Reconstruction | Code Available | 3 |
| Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals | Sep 29, 2022 | Text Generation | Code Available | 3 |
| Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription | Apr 15, 2024 | Music Transcription | Code Available | 3 |
| PointCNN: Convolution On X-Transformed Points | Jan 23, 2018 | 3D Instance Segmentation, 3D Part Segmentation | Code Available | 3 |
| OverleafCopilot: Empowering Academic Writing in Overleaf with Large Language Models | Mar 13, 2024 | | Code Available | 3 |
| Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey | Nov 14, 2024 | | Code Available | 3 |
| Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption | Jan 18, 2025 | Infrared And Visible Image Fusion | Code Available | 3 |
| Game-theoretic LLM: Agent Workflow for Negotiation Games | Nov 8, 2024 | Decision Making | Code Available | 3 |
| Tracking Anything with Decoupled Video Segmentation | Sep 7, 2023 | Open-Vocabulary Video Segmentation, Open-World Video Segmentation | Code Available | 3 |
| ROLO-SLAM: Rotation-Optimized LiDAR-Only SLAM in Uneven Terrain with Ground Vehicle | Jan 4, 2025 | Pose Estimation | Code Available | 3 |
| BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps | May 12, 2024 | Computational Efficiency | Code Available | 3 |
| PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation | Dec 4, 2023 | Depth Estimation | Code Available | 3 |
| Investigating Efficiently Extending Transformers for Long Input Summarization | Aug 8, 2022 | 16k, Long-range modeling | Code Available | 3 |
| Multiple Object Tracking as ID Prediction | Mar 25, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| MoAI: Mixture of All Intelligence for Large Language and Vision Models | Mar 12, 2024 | All, Mixture-of-Experts | Code Available | 3 |
| MMAUD: A Comprehensive Multi-Modal Anti-UAV Dataset for Modern Miniature Drone Threats | Feb 6, 2024 | | Code Available | 3 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell Segmentation, GPU | Code Available | 3 |
| Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing | Nov 22, 2024 | Computational Efficiency, CPU | Code Available | 3 |
| A Joint Representation Using Continuous and Discrete Features for Cardiovascular Diseases Risk Prediction on Chest CT Scans | Oct 24, 2024 | | Code Available | 3 |
| ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Mar 27, 2024 | Depth Estimation, Depth Prediction | Code Available | 3 |
| VBench: Comprehensive Benchmark Suite for Video Generative Models | Nov 29, 2023 | Image Generation, Video Generation | Code Available | 3 |
| Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text | Jan 22, 2024 | | Code Available | 3 |
| Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | Feb 29, 2024 | Fairness, object-detection | Code Available | 3 |
| Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning | May 22, 2025 | | Code Available | 3 |
| Do generative video models understand physical principles? | Jan 14, 2025 | Video Generation | Code Available | 3 |
| Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search | May 21, 2025 | Information Retrieval | Code Available | 3 |
| Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos | Jun 23, 2022 | Imitation Learning, Minecraft | Code Available | 3 |
| VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography | Jun 7, 2024 | Computed Tomography (CT), Image Segmentation | Code Available | 3 |
| Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey | Dec 3, 2024 | Change Detection, Descriptive | Code Available | 3 |
| STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model | Mar 19, 2024 | Computational Efficiency, Graph Learning | Code Available | 3 |
| Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion | Nov 6, 2023 | Segmentation | Code Available | 3 |
| Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping | Feb 29, 2024 | Image Generation | Code Available | 3 |
| ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models | Feb 18, 2024 | Language Modelling, Question Answering | Code Available | 3 |