| Event-Based Video Frame Interpolation With Cross-Modal Asymmetric Bidirectional Motion Fields | Feb 19, 2025 | Video Frame Interpolation | Code Available | 2 |
| OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts | Jan 16, 2024 | Amodal Instance Segmentation, Instance Segmentation | Code Available | 2 |
| In-Context Editing: Learning Knowledge from Self-Induced Distributions | Jun 17, 2024 | Image Editing, In-Context Learning | Code Available | 2 |
| Differentiable and accelerated spherical harmonic and Wigner transforms | Nov 24, 2023 | | Code Available | 2 |
| A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search | Jun 5, 2025 | Imitation Learning | Code Available | 2 |
| shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python | Apr 2, 2025 | | Code Available | 2 |
| Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing | Jul 1, 2024 | Denoising, Image Restoration | Code Available | 2 |
| Safety Alignment Should Be Made More Than Just a Few Tokens Deep | Jun 10, 2024 | Safety Alignment | Code Available | 2 |
| Transcoders Find Interpretable LLM Feature Circuits | Jun 17, 2024 | | Code Available | 2 |
| OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework | Feb 21, 2025 | Autonomous Driving | Code Available | 2 |
| Humanity's Last Code Exam: Can Advanced LLMs Conquer Human's Hardest Code Competition? | Jun 15, 2025 | Code Generation | Code Available | 2 |
| Early Detection and Localization of Pancreatic Cancer by Label-Free Tumor Synthesis | Aug 6, 2023 | Specificity | Code Available | 2 |
| STAIR: Improving Safety Alignment with Introspective Reasoning | Feb 4, 2025 | Safety Alignment | Code Available | 2 |
| EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education | Aug 5, 2023 | Chatbot, Language Modeling | Code Available | 2 |
| Multi-modal Queried Object Detection in the Wild | May 30, 2023 | Few-Shot Object Detection, Object | Code Available | 2 |
| 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data | Jul 6, 2018 | General Classification | Code Available | 2 |
| Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators | Feb 19, 2024 | | Code Available | 2 |
| Conditional Image-to-Video Generation with Latent Flow Diffusion Models | Mar 24, 2023 | Image to Video Generation, Motion Generation | Code Available | 2 |
| SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound | Jun 6, 2024 | Audio Generation | Code Available | 2 |
| Window Function-less DFT with Reduced Noise and Latency for Real-Time Music Analysis | Oct 10, 2024 | | Code Available | 2 |
| An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks | Jan 23, 2025 | GPU | Code Available | 2 |
| OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems | Jun 19, 2023 | Benchmarking, Decoder | Code Available | 2 |
| Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method | Dec 22, 2022 | 4k, 8k | Code Available | 2 |
| Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity | Mar 14, 2024 | In-Context Learning | Code Available | 2 |
| MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion | Jul 3, 2023 | Image Generation | Code Available | 2 |
| CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey | Apr 19, 2025 | Computational Efficiency, Domain Adaptation | Code Available | 2 |
| Hierarchical Integration Diffusion Model for Realistic Image Deblurring | May 22, 2023 | Deblurring, Image Deblurring | Code Available | 2 |
| Instant Gaussian Stream: Fast and Generalizable Streaming of Dynamic Scene Reconstruction via Gaussian Splatting | Mar 21, 2025 | | Code Available | 2 |
| MeshLoc: Mesh-Based Visual Localization | Jul 21, 2022 | Camera Pose Estimation, Neural Rendering | Code Available | 2 |
| Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models | Sep 19, 2024 | Semantic Similarity, Semantic Textual Similarity | Code Available | 2 |
| Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement | Apr 14, 2023 | Image Enhancement, Low-Light Image Enhancement | Code Available | 2 |
| Agent AI: Surveying the Horizons of Multimodal Interaction | Jan 7, 2024 | multimodal interaction | Code Available | 2 |
| β-DPO: Direct Preference Optimization with Dynamic β | Jul 11, 2024 | Informativeness | Code Available | 2 |
| RedCode: Risky Code Execution and Generation Benchmark for Code Agents | Nov 12, 2024 | | Code Available | 2 |
| Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | Oct 29, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model | Jun 3, 2024 | geo-localization, Language Modeling | Code Available | 2 |
| A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech | Feb 8, 2023 | Code Generation, Diversity | Code Available | 2 |
| Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning | May 28, 2024 | | Code Available | 2 |
| FreeInit: Bridging Initialization Gap in Video Diffusion Models | Dec 12, 2023 | Denoising, Text-to-Video Generation | Code Available | 2 |
| GUICourse: From General Vision Language Models to Versatile GUI Agents | Jun 17, 2024 | Natural Language Visual Grounding, Optical Character Recognition (OCR) | Code Available | 2 |
| The CLRS Algorithmic Reasoning Benchmark | May 31, 2022 | Learning to Execute | Code Available | 2 |
| Language Models are Realistic Tabular Data Generators | Oct 12, 2022 | Tabular Data Generation | Code Available | 2 |
| Video Quality Assessment: A Comprehensive Survey | Dec 4, 2024 | Benchmarking, Survey | Code Available | 2 |
| BEBLID: Boosted efficient binary local image descriptor | Feb 7, 2024 | Computational Efficiency, Retrieval | Code Available | 2 |
| FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | May 17, 2023 | Denoising, Diffusion Personalization | Code Available | 2 |
| AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios | Oct 18, 2024 | Anomaly Classification, Anomaly Detection | Code Available | 2 |
| End-to-End Ontology Learning with Large Language Models | Oct 31, 2024 | | Code Available | 2 |
| TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection | Mar 31, 2025 | Fraud Detection, Large Language Model | Code Available | 2 |
| Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time | Oct 26, 2023 | In-Context Learning | Code Available | 2 |
| FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding | Apr 14, 2025 | | Code Available | 2 |