| CoIR: A Comprehensive Benchmark for Code Information Retrieval Models | Jul 3, 2024 | Benchmarking, Code Search | Code Available | 2 |
| A Unified Framework for 3D Scene Understanding | Jul 3, 2024 | Contrastive Learning, Knowledge Distillation | Code Available | 2 |
| Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages | Jul 3, 2024 | Language Modelling, valid | Code Available | 2 |
| VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors | Jul 3, 2024 | Neural Rendering | Code Available | 2 |
| Solving Motion Planning Tasks with a Scalable Generative Model | Jul 3, 2024 | Autonomous Driving, Motion Planning | Code Available | 2 |
| Context-Aware Video Instance Segmentation | Jul 3, 2024 | Instance Segmentation, Panoptic Segmentation | Code Available | 2 |
| HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation | Jul 3, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Free-SurGS: SfM-Free 3D Gaussian Splatting for Surgical Scene Reconstruction | Jul 3, 2024 | 3DGS, 3D Reconstruction | Code Available | 2 |
| DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents | Jul 3, 2024 | Image Generation, Molecular Docking | Code Available | 2 |
| SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Jul 3, 2024 | object-detection, Object Detection | Code Available | 2 |
| CATT: Character-based Arabic Tashkeel Transformer | Jul 3, 2024 | Arabic Text Diacritization, Decoder | Code Available | 2 |
| Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion | Jul 3, 2024 | Point Cloud Completion | Code Available | 2 |
| Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation | Jul 3, 2024 | Domain Generalization, Knowledge Distillation | Code Available | 2 |
| MHNet: Multi-view High-order Network for Diagnosing Neurodevelopmental Disorders Using Resting-state fMRI | Jul 3, 2024 | Functional Connectivity, Graph Neural Network | Code Available | 2 |
| A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Jul 2, 2024 | document understanding, Key Information Extraction | Code Available | 2 |
| ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation | Jul 2, 2024 | Prediction, Text to 3D | Code Available | 2 |
| MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation | Jul 2, 2024 | In-Context Learning | Code Available | 2 |
| WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation | Jul 2, 2024 | | Code Available | 2 |
| Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion | Jul 2, 2024 | 3D Semantic Scene Completion, valid | Code Available | 2 |
| Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather | Jul 2, 2024 | Data Augmentation, LIDAR Semantic Segmentation | Code Available | 2 |
| BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream | Jul 2, 2024 | NeRF | Code Available | 2 |
| Safety-Driven Deep Reinforcement Learning Framework for Cobots: A Sim2Real Approach | Jul 2, 2024 | Deep Reinforcement Learning | Code Available | 2 |
| Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts | Jul 2, 2024 | Few-Shot Semantic Segmentation, Semantic Segmentation | Code Available | 2 |
| MeMemo: On-device Retrieval Augmentation for Private and Personalized Text Generation | Jul 2, 2024 | Hallucination, RAG | Code Available | 2 |
| VFIMamba: Video Frame Interpolation with State Space Models | Jul 2, 2024 | 2k, 4k | Code Available | 2 |
| AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans | Jul 2, 2024 | 3D Classification, Alzheimer's Disease Detection | Code Available | 2 |
| GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models | Jul 2, 2024 | Marketing | Code Available | 2 |
| Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models | Jul 2, 2024 | Story Visualization | Code Available | 2 |
| DiscoveryBench: Towards Data-Driven Discovery with Large Language Models | Jul 1, 2024 | Code Generation, Sociology | Code Available | 2 |
| Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction | Jul 1, 2024 | CT Reconstruction | Code Available | 2 |
| DCoM: Active Learning for All Learners | Jul 1, 2024 | Active Learning, All | Code Available | 2 |
| SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection | Jul 1, 2024 | Object, object-detection | Code Available | 2 |
| MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations | Jul 1, 2024 | Benchmarking, document understanding | Code Available | 2 |
| Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis | Jul 1, 2024 | Contrastive Learning, Data Augmentation | Code Available | 2 |
| Centerline Boundary Dice Loss for Vascular Segmentation | Jul 1, 2024 | Segmentation | Code Available | 2 |
| Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems | Jul 1, 2024 | RAG | Code Available | 2 |
| GalLoP: Learning Global and Local Prompts for Vision-Language Models | Jul 1, 2024 | Diversity, Domain Generalization | Code Available | 2 |
| IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation | Jul 1, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing | Jul 1, 2024 | Denoising, Image Restoration | Code Available | 2 |
| SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving | Jul 1, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| FORA: Fast-Forward Caching in Diffusion Transformer Acceleration | Jul 1, 2024 | Denoising | Code Available | 2 |
| E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness | Jul 1, 2024 | 3D Generation | Code Available | 2 |
| Equivariant Diffusion Policy | Jul 1, 2024 | Imitation Learning, Robot Manipulation | Code Available | 2 |
| FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models | Jul 1, 2024 | Benchmarking, Fairness | Code Available | 2 |
| AutoFlow: Automated Workflow Generation for Large Language Model Agents | Jul 1, 2024 | AI Agent, Language Modeling | Code Available | 2 |
| DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models | Jul 1, 2024 | Denoising, Image Restoration | Code Available | 2 |
| RegMix: Data Mixture as Regression for Language Model Pre-training | Jul 1, 2024 | Common Sense Reasoning, Language Modeling | Code Available | 2 |
| We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Jul 1, 2024 | Math, Mathematical Reasoning | Code Available | 2 |
| Benchmarking Predictive Coding Networks -- Made Simple | Jul 1, 2024 | Benchmarking | Code Available | 2 |
| KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches | Jul 1, 2024 | Book summarization, Quantization | Code Available | 2 |