| E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection | Mar 14, 2024 | Autonomous Driving, Object | Code Available | 2 |
| Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Mar 14, 2024 | Knowledge Distillation, Novel Object Detection | Code Available | 2 |
| An Image Is Worth 1000 Lies: Adversarial Transferability across Prompts on Vision-Language Models | Mar 14, 2024 | | Code Available | 2 |
| GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping | Mar 14, 2024 | Contrastive Learning, NeRF | Code Available | 2 |
| Caltech Aerial RGB-Thermal Dataset in the Wild | Mar 13, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 |
| MonoOcc: Digging into Monocular Semantic Occupancy Prediction | Mar 13, 2024 | 3D geometry, Autonomous Vehicles | Code Available | 2 |
| Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era | Mar 13, 2024 | | Code Available | 2 |
| Envision3D: One Image to 3D with Anchor Views Interpolation | Mar 13, 2024 | Image to 3D | Code Available | 2 |
| LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments | Mar 13, 2024 | Decision Making, Language Modeling | Code Available | 2 |
| AcademiaOS: Automating Grounded Theory Development in Qualitative Research with Large Language Models | Mar 13, 2024 | | Code Available | 2 |
| SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents | Mar 13, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| A Decade's Battle on Dataset Bias: Are We There Yet? | Mar 13, 2024 | Memorization | Code Available | 2 |
| Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models | Mar 13, 2024 | | Code Available | 2 |
| Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale | Mar 13, 2024 | Constituency Grammar Induction, Language Modeling | Code Available | 2 |
| Knowledge Conflicts for LLMs: A Survey | Mar 13, 2024 | Misinformation, Survey | Code Available | 2 |
| Towards Dense and Accurate Radar Perception Via Efficient Cross-Modal Diffusion Model | Mar 13, 2024 | Autonomous Navigation | Code Available | 2 |
| Scattered Mixture-of-Experts Implementation | Mar 13, 2024 | Mixture-of-Experts | Code Available | 2 |
| Language models scale reliably with over-training and on downstream tasks | Mar 13, 2024 | Language Modelling | Code Available | 2 |
| JAXbind: Bind any function to JAX | Mar 13, 2024 | | Code Available | 2 |
| Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study | Mar 13, 2024 | Code Generation | Code Available | 2 |
| Pairwise Comparisons Are All You Need | Mar 13, 2024 | All, Face Image Quality Assessment | Code Available | 2 |
| PET-SQL: A Prompt-Enhanced Two-Round Refinement of Text-to-SQL with Cross-consistency | Mar 13, 2024 | In-Context Learning, Text to SQL | Code Available | 2 |
| CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model | Mar 13, 2024 | General Knowledge, Instruction Following | Code Available | 2 |
| GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing | Mar 13, 2024 | 3DGS | Code Available | 2 |
| MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Mar 13, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| CleanAgent: Automating Data Standardization with LLM-based Agents | Mar 13, 2024 | Code Generation, Natural Language Understanding | Code Available | 2 |
| FastMAC: Stochastic Spectral Sampling of Correspondence Graph | Mar 13, 2024 | Point Cloud Registration | Code Available | 2 |
| Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation | Mar 12, 2024 | Cross-Modal Retrieval, GPU | Code Available | 2 |
| Motion Mamba: Efficient and Long Sequence Motion Generation | Mar 12, 2024 | Mamba, Motion Generation | Code Available | 2 |
| Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture | Mar 12, 2024 | Motion Magnification, Representation Learning | Code Available | 2 |
| CMax-SLAM: Event-based Rotational-Motion Bundle Adjustment and SLAM System using Contrast Maximization | Mar 12, 2024 | Motion Estimation | Code Available | 2 |
| NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | Mar 12, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark | Mar 12, 2024 | knowledge editing, Language Modeling | Code Available | 2 |
| SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM | Mar 12, 2024 | Semantic Segmentation, Semantic SLAM | Code Available | 2 |
| Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework | Mar 12, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving | Mar 12, 2024 | Autonomous Driving, Depth Estimation | Code Available | 2 |
| Harder Tasks Need More Experts: Dynamic Routing in MoE Models | Mar 12, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 2 |
| Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis | Mar 12, 2024 | Graph Representation Learning, Representation Learning | Code Available | 2 |
| Open-World Semantic Segmentation Including Class Similarity | Mar 12, 2024 | Anomaly Segmentation, Autonomous Vehicles | Code Available | 2 |
| CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning | Mar 12, 2024 | Knowledge Distillation, Multivariate Time Series Forecasting | Code Available | 2 |
| LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation | Mar 12, 2024 | Image Segmentation, Long-range modeling | Code Available | 2 |
| Beyond Text: Frozen Large Language Models in Visual Signal Comprehension | Mar 12, 2024 | Deblurring, Decoder | Code Available | 2 |
| Characterization of Large Language Model Development in the Datacenter | Mar 12, 2024 | GPU, Language Modeling | Code Available | 2 |
| Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding | Mar 12, 2024 | Multi-Agent Path Finding, Multi-agent Reinforcement Learning | Code Available | 2 |
| KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction | Mar 12, 2024 | Code Generation, Language Modelling | Code Available | 2 |
| CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion | Mar 12, 2024 | Code Completion, Safety Alignment | Code Available | 2 |
| RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model | Mar 12, 2024 | Change Detection, Zero-shot Generalization | Code Available | 2 |
| Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning | Mar 12, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| Robust Synthetic-to-Real Transfer for Stereo Matching | Mar 12, 2024 | Domain Generalization, Pseudo Label | Code Available | 2 |
| Scalable Spatiotemporal Prediction with Bayesian Neural Fields | Mar 12, 2024 | Bayesian Inference, Demand Forecasting | Code Available | 2 |