| TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning | Jun 21, 2024 | Fairness, Geographic Question Answering | Code Available | 2 |
| FIRST: Faster Improved Listwise Reranking with Single Token Decoding | Jun 21, 2024 | Information Retrieval, Language Modeling | Code Available | 2 |
| RouteFinder: Towards Foundation Models for Vehicle Routing Problems | Jun 21, 2024 | Attribute, Multi-Task Learning | Code Available | 2 |
| SelfReg-UNet: Self-Regularized UNet for Medical Image Segmentation | Jun 21, 2024 | Decoder, Image Segmentation | Code Available | 2 |
| Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models | Jun 21, 2024 | Spatial Reasoning | Code Available | 2 |
| GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation | Jun 21, 2024 | 3D Generation, GPU | Code Available | 2 |
| MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling | Code Available | 2 |
| Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark | Jun 21, 2024 | Anomaly Detection, Out-of-Distribution Detection | Code Available | 2 |
| LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection | Jun 20, 2024 | Computational Efficiency, Object | Code Available | 2 |
| Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework | Jun 20, 2024 | Hallucination, Question Answering | Code Available | 2 |
| LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning | Jun 20, 2024 | Autonomous Navigation, Heuristic Search | Code Available | 2 |
| CodeRAG-Bench: Can Retrieval Augment Code Generation? | Jun 20, 2024 | Code Generation, RAG | Code Available | 2 |
| Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition | Jun 20, 2024 | Diagnostic, EEG | Code Available | 2 |
| CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information | Jun 20, 2024 | Vision and Language Navigation | Code Available | 2 |
| EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms | Jun 20, 2024 | Evolutionary Algorithms | Code Available | 2 |
| HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting? | Jun 20, 2024 | Benchmarking, Point Processes | Code Available | 2 |
| How far are today's time-series models from real-world weather forecasting applications? | Jun 20, 2024 | Benchmarking, Time Series | Code Available | 2 |
| MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading | Jun 20, 2024 | Algorithmic Trading, Decision Making | Code Available | 2 |
| TAGLAS: An atlas of text-attributed graph datasets in the era of large graph and language models | Jun 20, 2024 | Graph Question Answering, Node Classification | Code Available | 2 |
| Asynchronous Large Language Model Enhanced Planner for Autonomous Driving | Jun 20, 2024 | Autonomous Driving, Language Modeling | Code Available | 2 |
| Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study | Jun 20, 2024 | In-Context Learning, Knowledge Distillation | Code Available | 2 |
| Adaptable Logical Control for Large Language Models | Jun 19, 2024 | Math, Text Generation | Code Available | 2 |
| SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words | Jun 19, 2024 | Dialogue Understanding | Code Available | 2 |
| ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World | Jun 19, 2024 | Diagnostic, Multiple-choice | Code Available | 2 |
| Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases | Jun 19, 2024 | 8k, Hallucination | Code Available | 2 |
| GraphKAN: Enhancing Feature Extraction with Graph Kolmogorov Arnold Networks | Jun 19, 2024 | Kolmogorov-Arnold Networks | Code Available | 2 |
| A large-scale multicenter breast cancer DCE-MRI benchmark dataset with expert segmentations | Jun 19, 2024 | Benchmarking | Code Available | 2 |
| InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales | Jun 19, 2024 | Denoising, In-Context Learning | Code Available | 2 |
| Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks | Jun 19, 2024 | Decoder, Language Modeling | Code Available | 2 |
| WATT: Weight Average Test-Time Adaptation of CLIP | Jun 19, 2024 | image-classification, Image Classification | Code Available | 2 |
| StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images | Jun 19, 2024 | Object Recognition, Scene Understanding | Code Available | 2 |
| RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design | Jun 19, 2024 | Diversity | Code Available | 2 |
| Dissecting Adversarial Robustness of Multimodal LM Agents | Jun 18, 2024 | Adversarial Robustness, Adversarial Text | Code Available | 2 |
| Can Go AIs be adversarially robust? | Jun 18, 2024 | Diversity | Code Available | 2 |
| DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Jun 18, 2024 | Arithmetic Reasoning, Math | Code Available | 2 |
| Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment | Jun 18, 2024 | Denoising | Code Available | 2 |
| Universal Score-based Speech Enhancement with High Content Preservation | Jun 18, 2024 | Speech Enhancement | Code Available | 2 |
| Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling | Jun 18, 2024 | Arithmetic Reasoning, Language Modeling | Code Available | 2 |
| SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization | Jun 18, 2024 | Landmark-based Lipreading, Lipreading | Code Available | 2 |
| Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM | Jun 18, 2024 | Anomaly Detection, Anomaly Localization | Code Available | 2 |
| OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI | Jun 18, 2024 | Benchmarking, scientific discovery | Code Available | 2 |
| AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention | Jun 18, 2024 | Object, Response Generation | Code Available | 2 |
| Automated MRI Quality Assessment of Brain T1-weighted MRI in Clinical Data Warehouses: A Transfer Learning Approach Relying on Artefact Simulation | Jun 18, 2024 | Transfer Learning | Code Available | 2 |
| GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Jun 18, 2024 | Benchmarking, Depth Estimation | Code Available | 2 |
| Coding Speech through Vocal Tract Kinematics | Jun 18, 2024 | Voice Conversion | Code Available | 2 |
| AgentReview: Exploring Peer Review Dynamics with LLM Agents | Jun 18, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction | Jun 18, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| AEM: Attention Entropy Maximization for Multiple Instance Learning based Whole Slide Image Classification | Jun 18, 2024 | Diversity, image-classification | Code Available | 2 |
| ChangeViT: Unleashing Plain Vision Transformers for Change Detection | Jun 18, 2024 | Change Detection | Code Available | 2 |
| TroL: Traversal of Layers for Large Language and Vision Models | Jun 18, 2024 | Visual Question Answering | Code Available | 2 |