| Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code Generation, Common Sense Reasoning | Code Available | 4 |
| TALENT: A Tabular Analytics and Learning Toolbox | Jul 4, 2024 | | Code Available | 4 |
| Osprey: Pixel Understanding with Visual Instruction Tuning | Dec 15, 2023 | Language Modelling | Code Available | 4 |
| Let's Verify Step by Step | May 31, 2023 | Active Learning, Math | Code Available | 4 |
| Agent-as-a-Judge: Evaluate Agents with Agents | Oct 14, 2024 | Code Generation | Code Available | 4 |
| TUMTraf V2X Cooperative Perception Dataset | Mar 2, 2024 | 3D Object Detection, Autonomous Vehicles | Code Available | 4 |
| Attention on the Sphere | May 16, 2025 | Depth Estimation, Image Segmentation | Code Available | 4 |
| Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise | Dec 5, 2024 | Denoising, Image Restoration | Code Available | 4 |
| GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction | Dec 5, 2024 | 3D Semantic Occupancy Prediction, Autonomous Driving | Code Available | 4 |
| Vision-Language Models for Vision Tasks: A Survey | Apr 3, 2023 | Benchmarking, Knowledge Distillation | Code Available | 4 |
| A Survey on Visual Mamba | Apr 24, 2024 | Image Registration, Image Restoration | Code Available | 4 |
| End-to-end Autonomous Driving: Challenges and Frontiers | Jun 29, 2023 | Autonomous Driving, motion prediction | Code Available | 4 |
| TensoRF: Tensorial Radiance Fields | Mar 17, 2022 | Low-Dose X-Ray Ct Reconstruction, NeRF | Code Available | 4 |
| A Convergent Single-Loop Algorithm for Relaxation of Gromov-Wasserstein in Graph Data | Mar 12, 2023 | Computational Efficiency | Code Available | 4 |
| Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks | Nov 17, 2022 | Decoder, Language Modelling | Code Available | 4 |
| Generating Structured Outputs from Language Models: Benchmark and Studies | Jan 18, 2025 | | Code Available | 4 |
| Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation | Feb 11, 2024 | Cardiac Segmentation, Contrastive Learning | Code Available | 4 |
| Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis | Mar 7, 2024 | CT Reconstruction, NeRF | Code Available | 4 |
| Timer-XL: Long-Context Transformers for Unified Time Series Forecasting | Oct 7, 2024 | Time Series, Time Series Forecasting | Code Available | 4 |
| TRUE: Re-evaluating Factual Consistency Evaluation | Apr 11, 2022 | Question Generation, Question-Generation | Code Available | 4 |
| Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection | Oct 17, 2023 | Fact Verification, Question Answering | Code Available | 4 |
| MedSAM2: Segment Anything in 3D Medical Images and Videos | Apr 4, 2025 | Segmentation, Video Segmentation | Code Available | 4 |
| DepthFM: Fast Monocular Depth Estimation with Flow Matching | Mar 20, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available | 4 |
| Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Jan 7, 2025 | Object, object-detection | Code Available | 4 |
| Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | Oct 17, 2024 | Experimental Design | Code Available | 4 |
| T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge | Jun 25, 2024 | Computational Efficiency, CPU | Code Available | 4 |
| JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows | Feb 7, 2024 | GPU | Code Available | 4 |
| AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Nov 12, 2022 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 4 |
| Link and code: Fast indexing with graphs and compact regression codes | Apr 26, 2018 | Image Similarity Search, Quantization | Code Available | 4 |
| GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Mar 21, 2024 | 3D Reconstruction, Image to 3D | Code Available | 4 |
| Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content? | Feb 14, 2022 | Language Modeling, Language Modelling | Code Available | 4 |
| Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs | Feb 19, 2024 | Knowledge Distillation | Code Available | 4 |
| Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Oct 16, 2022 | Coreference Resolution, Multiple-choice | Code Available | 4 |
| AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising | Jun 11, 2024 | Denoising | Code Available | 4 |
| LLaMA Pro: Progressive LLaMA with Block Expansion | Jan 4, 2024 | Instruction Following, Math | Code Available | 4 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Apr 15, 2025 | GPU, Inference Optimization | Code Available | 4 |
| Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence | Sep 7, 2022 | | Code Available | 4 |
| OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving | May 27, 2022 | Autonomous Driving, Autonomous Vehicles | Code Available | 4 |
| Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | May 22, 2025 | Domain Generalization, Image Generation | Code Available | 4 |
| SAMPart3D: Segment Any Part in 3D Objects | Nov 11, 2024 | 3D Generation, 3D Part Segmentation | Code Available | 4 |
| Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | May 28, 2024 | Computational Efficiency, Computed Tomography (CT) | Code Available | 4 |
| Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey | Mar 16, 2025 | Autonomous Driving, multimodal generation | Code Available | 4 |
| RGBD GS-ICP SLAM | Mar 19, 2024 | 3DGS, Simultaneous Localization and Mapping | Code Available | 4 |
| Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models | Jun 4, 2024 | Common Sense Reasoning | Code Available | 4 |
| Exploring the Capabilities of Large Multimodal Models on Dense Text | May 9, 2024 | Prompt Engineering, Visual Question Answering (VQA) | Code Available | 4 |
| CameraCtrl: Enabling Camera Control for Text-to-Video Generation | Apr 2, 2024 | Text-to-Video Generation, Video Generation | Code Available | 4 |
| SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models | Mar 14, 2024 | Blocking, GPU | Code Available | 4 |
| Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Aug 12, 2024 | GSM8K, Math | Code Available | 4 |
| Data quality dimensions for fair AI | May 11, 2023 | Classification, Fairness | Code Available | 4 |
| AnyText: Multilingual Visual Text Generation And Editing | Nov 6, 2023 | Image Generation, Optical Character Recognition (OCR) | Code Available | 4 |