| Demand Estimation with Text and Image Data | Mar 26, 2025 | Attribute, counterfactual | Code Available | 1 |
| Siformer: Feature-isolated Transformer for Efficient Skeleton-based Sign Language Recognition | Mar 26, 2025 | Action Recognition, Computational Efficiency | Code Available | 1 |
| Procedural Knowledge Ontology (PKO) | Mar 26, 2025 | | Code Available | 1 |
| EGVD: Event-Guided Video Diffusion Model for Physically Realistic Large-Motion Frame Interpolation | Mar 26, 2025 | Video Frame Interpolation | Code Available | 1 |
| VPO: Aligning Text-to-Video Generation Models with Prompt Optimization | Mar 26, 2025 | In-Context Learning, Safety Alignment | Code Available | 1 |
| Exploiting Temporal State Space Sharing for Video Semantic Segmentation | Mar 26, 2025 | Mamba, Semantic Segmentation | Code Available | 1 |
| EditCLIP: Representation Learning for Image Editing | Mar 26, 2025 | Representation Learning | Code Available | 1 |
| Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark | Mar 26, 2025 | MMLU, Multiple-choice | Code Available | 1 |
| PlatMetaX: An Integrated MATLAB platform for Meta-Black-Box Optimization | Mar 26, 2025 | Meta-Learning | Code Available | 1 |
| 3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark | Mar 26, 2025 | Diagnostic, Multimodal Reasoning | Code Available | 1 |
| Pluggable Style Representation Learning for Multi-Style Transfer | Mar 26, 2025 | Representation Learning, Style Transfer | Code Available | 1 |
| Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration | Mar 26, 2025 | Denoising, Image Restoration | Code Available | 1 |
| SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective | Mar 26, 2025 | Change Detection | Code Available | 1 |
| Fast, Modular, and Differentiable Framework for Machine Learning-Enhanced Molecular Simulations | Mar 26, 2025 | | Code Available | 1 |
| InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction | Mar 26, 2025 | Instruction Following, Video Editing | Code Available | 1 |
| A multi-agentic framework for real-time, autonomous freeform metasurface design | Mar 26, 2025 | | Code Available | 1 |
| CamSAM2: Segment Anything Accurately in Camouflaged Videos | Mar 25, 2025 | Camouflaged Object Segmentation, Object | Code Available | 1 |
| LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Mar 25, 2025 | Cross-Modal Retrieval, Hallucination | Code Available | 1 |
| HoarePrompt: Structural Reasoning About Program Correctness in Natural Language | Mar 25, 2025 | | Code Available | 1 |
| NeoRL-2: Near Real-World Benchmarks for Offline Reinforcement Learning with Extended Realistic Scenarios | Mar 25, 2025 | Benchmarking, Offline RL | Code Available | 1 |
| Interpretable Generative Models through Post-hoc Concept Bottlenecks | Mar 25, 2025 | | Code Available | 1 |
| The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs | Mar 25, 2025 | Benchmarking, Scene Segmentation | Code Available | 1 |
| Adaptive Orchestration for Large-Scale Inference on Heterogeneous Accelerator Systems Balancing Cost, Performance, and Resilience | Mar 25, 2025 | | Code Available | 1 |
| Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation | Mar 25, 2025 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| OpenSDI: Spotting Diffusion-Generated Images in the Open World | Mar 25, 2025 | | Code Available | 1 |
| Exploring Semantic Feature Discrimination for Perceptual Image Super-Resolution and Opinion-Unaware No-Reference Image Quality Assessment | Mar 25, 2025 | Image Quality Assessment, Image Super-Resolution | Code Available | 1 |
| TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception | Mar 25, 2025 | | Code Available | 1 |
| LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Mar 25, 2025 | Code Completion, Language Modeling | Code Available | 1 |
| Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Mar 25, 2025 | Benchmarking, Image Captioning | Code Available | 1 |
| ISPDiffuser: Learning RAW-to-sRGB Mappings with Texture-Aware Diffusion Models and Histogram-Guided Color Consistency | Mar 25, 2025 | | Code Available | 1 |
| Analyzing the Synthetic-to-Real Domain Gap in 3D Hand Pose Estimation | Mar 25, 2025 | 3D Hand Pose Estimation, Face Recognition | Code Available | 1 |
| Curvature-Constrained Vector Field for Motion Planning of Nonholonomic Robots | Mar 25, 2025 | Motion Planning | Code Available | 1 |
| Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval | Mar 25, 2025 | Attribute, Image Retrieval | Code Available | 1 |
| A scalable gene network model of regulatory dynamics in single cells | Mar 25, 2025 | | Code Available | 1 |
| EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models | Mar 25, 2025 | Video Generation | Code Available | 1 |
| SACB-Net: Spatial-awareness Convolutions for Medical Image Registration | Mar 25, 2025 | Image Registration, Medical Image Registration | Code Available | 1 |
| Simulating Tracking Data to Advance Sports Analytics Research | Mar 25, 2025 | Sports Analytics | Code Available | 1 |
| PAVE: Patching and Adapting Video Large Language Models | Mar 25, 2025 | Audio-visual Question Answering, Multi-Task Learning | Code Available | 1 |
| A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction | Mar 25, 2025 | GPU | Code Available | 1 |
| CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning | Mar 25, 2025 | Hallucination, Language Modeling | Code Available | 1 |
| Lean Formalization of Generalization Error Bound by Rademacher Complexity | Mar 25, 2025 | LEMMA, PAC learning | Code Available | 1 |
| DynOPETs: A Versatile Benchmark for Dynamic Object Pose Estimation and Tracking in Moving Camera Scenarios | Mar 25, 2025 | 3D Object Detection, Object | Code Available | 1 |
| CoLLM: A Large Language Model for Composed Image Retrieval | Mar 25, 2025 | Image Retrieval, Language Modeling | Code Available | 1 |
| IgCraft: A versatile sequence generation framework for antibody discovery and engineering | Mar 25, 2025 | | Code Available | 1 |
| LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation | Mar 25, 2025 | cross-modal alignment, Open Vocabulary Semantic Segmentation | Code Available | 1 |
| Attention IoU: Examining Biases in CelebA using Attention Maps | Mar 25, 2025 | Attribute | Code Available | 1 |
| ASP-VMUNet: Atrous Shifted Parallel Vision Mamba U-Net for Skin Lesion Segmentation | Mar 25, 2025 | Image Segmentation, Lesion Segmentation | Code Available | 1 |
| PatchWorkPlot: simultaneous visualization of local alignments across multiple sequences | Mar 25, 2025 | | Code Available | 1 |
| Inertial-Based LQG Control: A New Look at Inverted Pendulum Stabilization | Mar 24, 2025 | Sensor Fusion, State Estimation | Code Available | 1 |
| Good Keypoints for the Two-View Geometry Estimation Problem | Mar 24, 2025 | Homography Estimation | Code Available | 1 |