| OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies | May 8, 2024 | Domain Adaptation, Scene Understanding | Code Available | 2 |
| The Entropy Enigma: Success and Failure of Entropy Minimization | May 8, 2024 | Self-Supervised Learning | Code Available | 2 |
| HILCodec: High-Fidelity and Lightweight Neural Audio Codec | May 8, 2024 | | Code Available | 2 |
| Preble: Efficient Distributed Prompt Scheduling for LLM Serving | May 8, 2024 | GPU, Scheduling | Code Available | 2 |
| HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution | May 8, 2024 | Image Super-Resolution | Code Available | 2 |
| Dynamic GNNs for Precise Seizure Detection and Classification from EEG Data | May 8, 2024 | EEG, Graph Classification | Code Available | 2 |
| SemiCD-VL: Visual-Language Model Guidance Makes Better Semi-supervised Change Detector | May 8, 2024 | Change Detection, Language Modeling | Code Available | 2 |
| DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature | May 8, 2024 | Question Answering | Code Available | 2 |
| Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models | May 8, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| ADELIE: Aligning Large Language Models on Information Extraction | May 8, 2024 | | Code Available | 2 |
| Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID | May 8, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio | May 8, 2024 | Audio Deepfake Detection, Audio Generation | Code Available | 2 |
| Vision Mamba: A Comprehensive Survey and Taxonomy | May 7, 2024 | Mamba, Medical Image Analysis | Code Available | 2 |
| Acceleration Algorithms in GNNs: A Survey | May 7, 2024 | Graph Learning, Survey | Code Available | 2 |
| ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers | May 7, 2024 | 3D Object Detection, object-detection | Code Available | 2 |
| Tactile-Augmented Radiance Fields | May 7, 2024 | | Code Available | 2 |
| NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts | May 7, 2024 | HumanEval, mbpp | Code Available | 2 |
| IMU-Aided Event-based Stereo Visual Odometry | May 7, 2024 | Pose Tracking, Visual Odometry | Code Available | 2 |
| Detecting music deepfakes is easy but actually hard | May 7, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| BUDDy: Single-Channel Blind Unsupervised Dereverberation with Diffusion Models | May 7, 2024 | | Code Available | 2 |
| Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement | May 6, 2024 | Computational Efficiency, Deep Learning | Code Available | 2 |
| AntiFold: Improved antibody structure-based design using inverse folding | May 6, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Explainable Fake News Detection With Large Language Model via Defense Among Competing Wisdom | May 6, 2024 | Fake News Detection, Language Modeling | Code Available | 2 |
| Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge | May 6, 2024 | | Code Available | 2 |
| TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning | May 6, 2024 | Multiple Instance Learning, Time Series | Code Available | 2 |
| Video Diffusion Models: A Survey | May 6, 2024 | Survey, Text-to-Video Generation | Code Available | 2 |
| 3D LiDAR Mapping in Dynamic Environments Using a 4D Implicit Neural Representation | May 6, 2024 | Autonomous Vehicles, Decoder | Code Available | 2 |
| CRA5: Extreme Compression of ERA5 for Portable Global Climate and Weather Research via an Efficient Variational Transformer | May 6, 2024 | Weather Forecasting | Code Available | 2 |
| Word2World: Generating Stories and Worlds through Large Language Models | May 6, 2024 | Game Design | Code Available | 2 |
| Neural Graph Map: Dense Mapping with Efficient Loop Closure Integration | May 6, 2024 | | Code Available | 2 |
| Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning | May 6, 2024 | Reinforcement Learning (RL) | Code Available | 2 |
| LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model | May 6, 2024 | Motion Generation | Code Available | 2 |
| CityLLaVA: Efficient Fine-Tuning for VLMs in City Scenario | May 6, 2024 | Position, Prediction | Code Available | 2 |
| Foundation Models for Video Understanding: A Survey | May 6, 2024 | Survey, Video Understanding | Code Available | 2 |
| PTQ4SAM: Post-Training Quantization for Segment Anything | May 6, 2024 | Instance Segmentation, object-detection | Code Available | 2 |
| Parameter-Efficient Fine-Tuning with Discrete Fourier Transform | May 5, 2024 | image-classification, Image Classification | Code Available | 2 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | May 5, 2024 | Benchmarking, Composed Image Retrieval (CoIR) | Code Available | 2 |
| DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | May 5, 2024 | Image Super-Resolution, Long-range modeling | Code Available | 2 |
| Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning | May 5, 2024 | GSM8K, Math | Code Available | 2 |
| Self-Reflection in LLM Agents: Effects on Problem-Solving Performance | May 5, 2024 | Multiple-choice | Code Available | 2 |
| Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration | May 5, 2024 | Color Image Denoising, Image Restoration | Code Available | 2 |
| Overview of the EHRSQL 2024 Shared Task on Reliable Text-to-SQL Modeling on Electronic Health Records | May 4, 2024 | Information Retrieval, Question Answering | Code Available | 2 |
| UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model | May 4, 2024 | Object, Optical Flow Estimation | Code Available | 2 |
| PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation | May 4, 2024 | In-Context Learning, Retrieval | Code Available | 2 |
| MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning | May 4, 2024 | Earth Observation, image-classification | Code Available | 2 |
| FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space | May 3, 2024 | Facial Expression Recognition, Facial Expression Recognition (FER) | Code Available | 2 |
| On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? | May 3, 2024 | Computational Efficiency, Prompt Learning | Code Available | 2 |
| Automating the Enterprise with Foundation Models | May 3, 2024 | Management | Code Available | 2 |
| SCIMAP: A Python Toolkit for Integrated Spatial Analysis of Multiplexed Imaging Data | May 3, 2024 | | Code Available | 2 |
| Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations | May 3, 2024 | Optical Flow Estimation, Reference-based Super-Resolution | Code Available | 2 |