| Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Apr 6, 2025 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available | 2 |
| Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings | Oct 4, 2022 | Gesture Generation, Rhythm | Code Available | 2 |
| Contrastive learning of cell state dynamics in response to perturbations | Oct 15, 2024 | Cell Tracking, Contrastive Learning | Code Available | 2 |
| KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems | Feb 22, 2022 | Conversational Recommendation, Recommendation Systems | Code Available | 2 |
| The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Jan 15, 2025 | Reasoning Segmentation, Referring Expression Segmentation | Code Available | 2 |
| ReplayCAD: Generative Diffusion Replay for Continual Anomaly Detection | May 10, 2025 | Anomaly Detection, continual anomaly detection | Code Available | 2 |
| A Simple Episodic Linear Probe Improves Visual Recognition in the Wild | Jan 1, 2022 | Fine-Grained Image Classification, Image Classification | Code Available | 2 |
| SweetDreamer: Aligning Geometric Priors in 2D Diffusion for Consistent Text-to-3D | Oct 4, 2023 | 3D Generation, Text to 3D | Code Available | 2 |
| SALT: Introducing a Framework for Hierarchical Segmentations in Medical Imaging using Softmax for Arbitrary Label Trees | Jul 11, 2024 | Diagnostic | Code Available | 2 |
| Skinned Motion Retargeting with Dense Geometric Interaction Perception | Oct 28, 2024 | motion retargeting | Code Available | 2 |
| DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space | Dec 19, 2024 | | Code Available | 2 |
| SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis | Nov 25, 2024 | 3D Generation, 3DGS | Code Available | 2 |
| DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes | Dec 13, 2023 | Autonomous Driving | Code Available | 2 |
| Grounding-IQA: Multimodal Language Grounding Model for Image Quality Assessment | Nov 26, 2024 | Image Quality Assessment, Question Answering | Code Available | 2 |
| PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness | Dec 4, 2023 | Autonomous Driving | Code Available | 2 |
| SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations | Oct 5, 2022 | 3D Reconstruction, Continual Learning | Code Available | 2 |
| Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion | Jul 2, 2024 | 3D Semantic Scene Completion, valid | Code Available | 2 |
| COALA: A Practical and Vision-Centric Federated Learning Platform | Jul 23, 2024 | Benchmarking, Continual Learning | Code Available | 2 |
| SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos | Aug 18, 2023 | 3D Object Detection, Object | Code Available | 2 |
| Follow Anything: Open-set detection, tracking, and following in real-time | Aug 10, 2023 | | Code Available | 2 |
| WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting | May 1, 2024 | Scheduling | Code Available | 2 |
| OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Jul 15, 2024 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation | Dec 16, 2024 | RAG, Retrieval | Code Available | 2 |
| ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis | Aug 16, 2024 | Contrastive Learning, Diagnostic | Code Available | 2 |
| HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation | Apr 13, 2025 | Multimodal Reasoning, RAG | Code Available | 2 |
| Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning | Jul 23, 2024 | | Code Available | 2 |
| Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection | Jan 19, 2024 | Multispectral Object Detection, Object | Code Available | 2 |
| Heating Up Quasi-Monte Carlo Graph Random Features: A Diffusion Kernel Perspective | Oct 10, 2024 | | Code Available | 2 |
| CMGAN: Conformer-based Metric GAN for Speech Enhancement | Mar 28, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| ZClip: Adaptive Spike Mitigation for LLM Pre-Training | Apr 3, 2025 | Anomaly Detection, Large Language Model | Code Available | 2 |
| Domain-Independent Dynamic Programming | Jan 25, 2024 | Combinatorial Optimization, Heuristic Search | Code Available | 2 |
| Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology | Sep 14, 2024 | Inductive Bias, Prognosis | Code Available | 2 |
| GRPose: Learning Graph Relations for Human Image Generation with Pose Priors | Aug 29, 2024 | Image Generation, Pose Estimation | Code Available | 2 |
| BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation | Jun 9, 2025 | Quantization, Vision-Language-Action | Code Available | 2 |
| Text-space Graph Foundation Models: Comprehensive Benchmarks and New Insights | Jun 15, 2024 | | Code Available | 2 |
| Rawsamble: Overlapping and Assembling Raw Nanopore Signals using a Hash-based Seeding Mechanism | Oct 23, 2024 | CPU | Code Available | 2 |
| Explanation-Preserving Augmentation for Semi-Supervised Graph Representation Learning | Oct 16, 2024 | Graph Classification, Graph Representation Learning | Code Available | 2 |
| ST-LLM: Large Language Models Are Effective Temporal Learners | Mar 30, 2024 | MVBench, Reading Comprehension | Code Available | 2 |
| VICRegL: Self-Supervised Learning of Local Visual Features | Oct 4, 2022 | Segmentation, Self-Supervised Learning | Code Available | 2 |
| Adaptive Rectangular Convolution for Remote Sensing Pansharpening | Mar 1, 2025 | Pansharpening | Code Available | 2 |
| DCT-Net: Domain-Calibrated Translation for Portrait Stylization | Jul 6, 2022 | Few-Shot Learning, Style Transfer | Code Available | 2 |
| GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting | Jan 26, 2025 | Quantization | Code Available | 2 |
| GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Jun 18, 2024 | Benchmarking, Depth Estimation | Code Available | 2 |
| Few-shot Novel View Synthesis using Depth Aware 3D Gaussian Splatting | Oct 14, 2024 | 3DGS, Depth Estimation | Code Available | 2 |
| Scaling New Frontiers: Insights into Large Recommendation Models | Dec 1, 2024 | Recommendation Systems | Code Available | 2 |
| Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models | Apr 7, 2025 | Math, Quantization | Code Available | 2 |
| DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | Mar 29, 2024 | Object, Video Instance Segmentation | Code Available | 2 |
| Coding Speech through Vocal Tract Kinematics | Jun 18, 2024 | Voice Conversion | Code Available | 2 |
| DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Jul 15, 2025 | Benchmarking, Instruction Following | Code Available | 2 |
| EV2Gym: A Flexible V2G Simulator for EV Smart Charging Research and Benchmarking | Apr 2, 2024 | Benchmarking, Reinforcement Learning (RL) | Code Available | 2 |