| Towards Relation-centered Pooling and Convolution for Heterogeneous Graph Learning Networks | Oct 31, 2022 | Graph Learning, Graph Neural Network | Code Available | 2 |
| Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | Aug 15, 2023 | Arithmetic Reasoning, Math | Code Available | 2 |
| SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process | Dec 19, 2023 | Denoising, Dichotomous Image Segmentation | Code Available | 2 |
| Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction | Jun 6, 2024 | NeRF | Code Available | 2 |
| GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI | Aug 6, 2024 | Question Answering, Visual Question Answering | Code Available | 2 |
| NNetscape Navigator: Complex Demonstrations for Web Agents Without a Demonstrator | Oct 3, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Elevating Flow-Guided Video Inpainting with Reference Generation | Dec 12, 2024 | 2k, Video Inpainting | Code Available | 2 |
| From an Image to a Scene: Learning to Imagine the World from a Million 360 Videos | Dec 10, 2024 | 3D Reconstruction, Novel View Synthesis | Code Available | 2 |
| DMPlug: A Plug-in Method for Solving Inverse Problems with Diffusion Models | May 27, 2024 | | Code Available | 2 |
| Speedy Deformable 3D Gaussian Splatting: Fast Rendering and Compression of Dynamic Scenes | Jun 9, 2025 | 3DGS, NeRF | Code Available | 2 |
| Fieldscale: Locality-Aware Field-based Adaptive Rescaling for Thermal Infrared Image | May 24, 2024 | Image Quality Assessment | Code Available | 2 |
| GOReloc: Graph-based Object-Level Relocalization for Visual SLAM | Aug 15, 2024 | Object, object-detection | Code Available | 2 |
| Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields | Jul 27, 2022 | Denoising | Code Available | 2 |
| Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation | May 25, 2022 | Cross-Lingual Transfer, Machine Translation | Code Available | 2 |
| DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection | Jul 21, 2022 | 3D Object Detection, 3D Object Detection From Monocular Images | Code Available | 2 |
| Unlocking Generalization Power in LiDAR Point Cloud Registration | Mar 13, 2025 | Autonomous Driving, Point Cloud Registration | Code Available | 2 |
| Low-latency Real-time Voice Conversion on CPU | Nov 1, 2023 | CPU, Knowledge Distillation | Code Available | 2 |
| Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | Jan 1, 2024 | Descriptive, Object | Code Available | 2 |
| FAMNet: Frequency-aware Matching Network for Cross-domain Few-shot Medical Image Segmentation | Dec 12, 2024 | Cross-Domain Few-Shot, Domain Generalization | Code Available | 2 |
| All-in-one simulation-based inference | Apr 15, 2024 | All, Bayesian Inference | Code Available | 2 |
| Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation | Feb 5, 2025 | Benchmarking, Large Language Model | Code Available | 2 |
| Curvature Diversity-Driven Deformation and Domain Alignment for Point Cloud | Oct 3, 2024 | Diversity, Domain Adaptation | Code Available | 2 |
| CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation | Jun 15, 2024 | In-Context Learning, Text Generation | Code Available | 2 |
| T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation | Jul 19, 2024 | Attribute, Language Modeling | Code Available | 2 |
| Rethinking Patch Dependence for Masked Autoencoders | Jan 25, 2024 | Decoder, Instance Segmentation | Code Available | 2 |
| Pretraining is All You Need for Image-to-Image Translation | May 25, 2022 | All, Image-to-Image Translation | Code Available | 2 |
| EasyRec: Simple yet Effective Language Models for Recommendation | Aug 16, 2024 | Collaborative Filtering, Contrastive Learning | Code Available | 2 |
| Descriptor-based Foundation Models for Molecular Property Prediction | Jun 18, 2025 | Molecular Property Prediction, Prediction | Code Available | 2 |
| PDX: A Data Layout for Vector Similarity Search | Mar 6, 2025 | Avg | Code Available | 2 |
| A Simulation Tool for V2G Enabled Demand Response Based on Model Predictive Control | May 20, 2024 | energy management, Management | Code Available | 2 |
| SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model | Mar 27, 2024 | Denoising, Domain Adaptation | Code Available | 2 |
| I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference | Jul 4, 2022 | Quantization | Code Available | 2 |
| The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles | Feb 3, 2025 | ARC, Multimodal Reasoning | Code Available | 2 |
| DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection | Mar 15, 2023 | Anomaly Detection, Denoising | Code Available | 2 |
| MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories | Jun 5, 2025 | Benchmarking, Optical Character Recognition | Code Available | 2 |
| MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing | Apr 17, 2023 | Image Generation, Text-based Image Editing | Code Available | 2 |
| Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases | Jun 19, 2024 | 8k, Hallucination | Code Available | 2 |
| Efficient Diffusion Models: A Survey | Feb 3, 2025 | Survey | Code Available | 2 |
| THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation | Feb 13, 2024 | Robot Manipulation Generalization | Code Available | 2 |
| Improved Representation Steering for Language Models | May 27, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| A Foundation Model for Music Informatics | Nov 6, 2023 | Information Retrieval, model | Code Available | 2 |
| GOOD: A Graph Out-of-Distribution Benchmark | Jun 16, 2022 | | Code Available | 2 |
| Motion In-Betweening with Phase Manifolds | Aug 24, 2023 | Mixture-of-Experts, motion in-betweening | Code Available | 2 |
| AnglE-optimized Text Embeddings | Sep 22, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Dec 15, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 2 |
| AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation | Nov 23, 2024 | Data Augmentation, Diversity | Code Available | 2 |
| FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Mar 19, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| Policy-Guided Diffusion | Apr 9, 2024 | | Code Available | 2 |
| HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion | Mar 29, 2023 | | Code Available | 2 |
| Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation | Nov 24, 2024 | Semantic Segmentation | Code Available | 2 |