| GREC: Generalized Referring Expression Comprehension | Aug 30, 2023 | Generalized Referring Expression Comprehension, Referring Expression | Code Available | 2 | 5 |
| MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection | Jan 29, 2024 | 3D Object Detection, object-detection | Code Available | 2 | 5 |
| PointSea: Point Cloud Completion via Self-structure Augmentation | Feb 24, 2025 | Point Cloud Completion | Code Available | 2 | 5 |
| Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning | Dec 16, 2024 | Hallucination, Robot Manipulation | Code Available | 2 | 5 |
| Semantic Human Mesh Reconstruction with Textures | Mar 5, 2024 | | Code Available | 2 | 5 |
| Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning | Jun 11, 2025 | Image Captioning, Math | Code Available | 2 | 5 |
| CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition | May 24, 2023 | Denoising, Knowledge Distillation | Code Available | 2 | 5 |
| ALBench: A Framework for Evaluating Active Learning in Object Detection | Jul 27, 2022 | Active Learning, image-classification | Code Available | 2 | 5 |
| NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments | Jun 30, 2025 | Decision Making, Vision and Language Navigation | Code Available | 2 | 5 |
| Bilateral Propagation Network for Depth Completion | Mar 17, 2024 | Depth Completion | Code Available | 2 | 5 |
| The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages | Aug 31, 2023 | Data Augmentation, Text Generation | Code Available | 2 | 5 |
| Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance | Feb 12, 2025 | Benchmarking, Long-Context Understanding | Code Available | 2 | 5 |
| HUGS: Human Gaussian Splats | Nov 29, 2023 | 3DGS, Neural Rendering | Code Available | 2 | 5 |
| NusaCrowd: Open Source Initiative for Indonesian NLP Resources | Dec 19, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs | May 12, 2025 | AI Agent, Knowledge Distillation | Code Available | 2 | 5 |
| Do Llamas Work in English? On the Latent Language of Multilingual Transformers | Feb 16, 2024 | | Code Available | 2 | 5 |
| Leveraging Rust types for modular specification and verification | Oct 10, 2019 | Formal Logic | Code Available | 2 | 5 |
| DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment | Mar 27, 2024 | | Code Available | 2 | 5 |
| Graph Language Models | Jan 13, 2024 | Knowledge Graphs, Language Modeling | Code Available | 2 | 5 |
| A Short Survey of Viewing Large Language Models in Legal Aspect | Mar 16, 2023 | | Code Available | 2 | 5 |
| Stylized Face Sketch Extraction via Generative Prior with Limited Data | Mar 17, 2024 | Face Sketch Synthesis | Code Available | 2 | 5 |
| Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing | Jul 20, 2022 | 4k, Image Enhancement | Code Available | 2 | 5 |
| RGBDS-SLAM: A RGB-D Semantic Dense SLAM Based on 3D Multi Level Pyramid Gaussian Splatting | Dec 2, 2024 | | Code Available | 2 | 5 |
| An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains | Oct 5, 2024 | Diagnostic, Event Detection | Code Available | 2 | 5 |
| SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction | Nov 15, 2024 | 3D Reconstruction, Depth Estimation | Code Available | 2 | 5 |
| AthletePose3D: A Benchmark Dataset for 3D Human Pose Estimation and Kinematic Validation in Athletic Movements | Mar 10, 2025 | 3D Human Pose Estimation, 3D Pose Estimation | Code Available | 2 | 5 |
| Deep Visual Geo-localization Benchmark | Apr 7, 2022 | Benchmarking, Data Augmentation | Code Available | 2 | 5 |
| AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning | Nov 18, 2024 | Mathematical Reasoning | Code Available | 2 | 5 |
| VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark | Mar 12, 2024 | knowledge editing, Language Modeling | Code Available | 2 | 5 |
| Probabilistic Language-Image Pre-Training | Oct 24, 2024 | | Code Available | 2 | 5 |
| R-AIF: Solving Sparse-Reward Robotic Tasks from Pixels with Active Inference and World Models | Sep 21, 2024 | | Code Available | 2 | 5 |
| Zero-Shot Scene Change Detection | Jun 17, 2024 | Change Detection, Scene Change Detection | Code Available | 2 | 5 |
| Simultaneously Recovering Multi-Person Meshes and Multi-View Cameras with Human Semantics | Dec 25, 2024 | Camera Calibration | Code Available | 2 | 5 |
| Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry | Jul 4, 2024 | | Code Available | 2 | 5 |
| Discrete Prior-based Temporal-coherent Content Prediction for Blind Face Video Restoration | Jan 17, 2025 | Video Restoration | Code Available | 2 | 5 |
| KBNet: Kernel Basis Network for Image Restoration | Mar 6, 2023 | Color Image Denoising, Deblurring | Code Available | 2 | 5 |
| Physical Plausibility-aware Trajectory Prediction via Locomotion Embodiment | Mar 21, 2025 | Prediction, Trajectory Prediction | Code Available | 2 | 5 |
| Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID | Mar 21, 2025 | | Code Available | 2 | 5 |
| L-PR: Exploiting LiDAR Fiducial Marker for Unordered Low Overlap Multiview Point Cloud Registration | Jun 5, 2024 | 3D geometry, Point Cloud Registration | Code Available | 2 | 5 |
| PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders | Aug 16, 2024 | 3D Object Classification, 3D Point Cloud Classification | Code Available | 2 | 5 |
| Protein Conformation Generation via Force-Guided SE(3) Diffusion Models | Mar 21, 2024 | Diversity | Code Available | 2 | 5 |
| LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing | Apr 27, 2024 | Log Parsing | Code Available | 2 | 5 |
| CARLA2Real: a tool for reducing the sim2real gap in CARLA simulator | Oct 23, 2024 | Autonomous Driving, Self-Driving Cars | Code Available | 2 | 5 |
| RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering | Mar 8, 2022 | Neural Rendering | Code Available | 2 | 5 |
| ADELIE: Aligning Large Language Models on Information Extraction | May 8, 2024 | | Code Available | 2 | 5 |
| Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | May 21, 2025 | Large Language Model, Multimodal Large Language Model | Code Available | 2 | 5 |
| DeepPrivacy2: Towards Realistic Full-Body Anonymization | Nov 17, 2022 | Diversity, Face Anonymization | Code Available | 2 | 5 |
| Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting | Jun 18, 2022 | Graph Neural Network, Multivariate Time Series Forecasting | Code Available | 2 | 5 |
| EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering | May 30, 2025 | Denoising | Code Available | 2 | 5 |
| Graph Condensation: A Survey | Jan 22, 2024 | Fairness, Graph Generation | Code Available | 2 | 5 |