| M^2PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning | Sep 24, 2024 | Zero-shot Generalization | Code Available | 1 |
| CAD: Memory Efficient Convolutional Adapter for Segment Anything | Sep 24, 2024 | Decoder, GPU | Code Available | 1 |
| CloudTrack: Scalable UAV Tracking with Cloud Semantics | Sep 24, 2024 | Object Tracking | Code Available | 1 |
| Neuromorphic Drone Detection: an Event-RGB Multimodal Approach | Sep 24, 2024 | object-detection, Object Detection | Code Available | 1 |
| Looped Transformers for Length Generalization | Sep 24, 2024 | | Code Available | 1 |
| Exploring Hint Generation Approaches in Open-Domain Question Answering | Sep 24, 2024 | Hint Generation, Open-Domain Question Answering | Code Available | 1 |
| AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Sep 24, 2024 | 4k, Computational Efficiency | Code Available | 1 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 |
| LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation | Sep 24, 2024 | Object, Pose Estimation | Code Available | 1 |
| Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs | Sep 24, 2024 | Knowledge Tracing, Misconceptions | Code Available | 1 |
| CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data | Sep 24, 2024 | | Code Available | 1 |
| Mind the Prompt: A Novel Benchmark for Prompt-based Class-Agnostic Counting | Sep 24, 2024 | Object, Object Counting | Code Available | 1 |
| CDChat: A Large Multimodal Model for Remote Sensing Change Description | Sep 24, 2024 | | Code Available | 1 |
| VideoPatchCore: An Effective Method to Memorize Normality for Video Anomaly Detection | Sep 24, 2024 | Anomaly Detection, Decoder | Code Available | 1 |
| VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images | Sep 24, 2024 | Artery/Veins Retinal Vessel Segmentation, Retinal Vessel Segmentation | Code Available | 1 |
| In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models | Sep 23, 2024 | In-Context Learning | Code Available | 1 |
| From Commands to Prompts: LLM-based Semantic File System for AIOS | Sep 23, 2024 | Management, Navigate | Code Available | 1 |
| FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera | Sep 23, 2024 | Autonomous Vehicles, Depth Estimation | Code Available | 1 |
| DecoupleNet: A Lightweight Backbone Network With Efficient Feature Decoupling for Remote Sensing Visual Tasks | Sep 23, 2024 | ARC, Computational Efficiency | Code Available | 1 |
| LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation | Sep 23, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models | Sep 23, 2024 | Medical Visual Question Answering, Question Answering | Code Available | 1 |
| GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth | Sep 23, 2024 | Depth Estimation, Depth Prediction | Code Available | 1 |
| DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis | Sep 23, 2024 | Human Animation | Code Available | 1 |
| Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method | Sep 23, 2024 | | Code Available | 1 |
| RAMBO: Enhancing RAG-based Repository-Level Method Body Completion | Sep 23, 2024 | Code Completion, Code Generation | Code Available | 1 |
| Boosting Healthcare LLMs Through Retrieved Context | Sep 23, 2024 | Benchmarking, Multiple-choice | Code Available | 1 |
| A new baseline for edge detection: Make Encoder-Decoder great again | Sep 23, 2024 | Decoder, Edge Detection | Code Available | 1 |
| Neural Differential Appearance Equations | Sep 23, 2024 | Denoising | Code Available | 1 |
| MICSim: A Modular Simulator for Mixed-signal Compute-in-Memory based AI Accelerator | Sep 23, 2024 | Quantization | Code Available | 1 |
| For Overall Nighttime Visibility: Integrate Irregular Glow Removal With Glow-Aware Enhancement | Sep 23, 2024 | Flare Removal, Image Enhancement | Code Available | 1 |
| FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension | Sep 23, 2024 | Image Comprehension, Referring Expression | Code Available | 1 |
| M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images | Sep 23, 2024 | regression | Code Available | 1 |
| SpaGBOL: Spatial-Graph-Based Orientated Localisation | Sep 23, 2024 | Camera Localization, Cross-View Geo-Localisation | Code Available | 1 |
| Matérn Kernels for Tunable Implicit Surface Reconstruction | Sep 23, 2024 | 3D Reconstruction, ARC | Code Available | 1 |
| FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale | Sep 23, 2024 | GPU | Code Available | 1 |
| The BRAVO Semantic Segmentation Challenge Results in UNCV2024 | Sep 23, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models | Sep 23, 2024 | Image Generation | Code Available | 1 |
| ControlEdit: A MultiModal Local Clothing Image Editing Method | Sep 23, 2024 | Self-Supervised Learning | Code Available | 1 |
| MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification | Sep 23, 2024 | Classification, Hateful Meme Classification | Code Available | 1 |
| RMCBench: Benchmarking Large Language Models' Resistance to Malicious Code | Sep 23, 2024 | Benchmarking, Code Generation | Code Available | 1 |
| Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections | Sep 23, 2024 | Image Inpainting | Code Available | 1 |
| PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs | Sep 23, 2024 | | Code Available | 1 |
| ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback | Sep 23, 2024 | Instruction Following | Code Available | 1 |
| Steward: Natural Language Web Automation | Sep 23, 2024 | | Code Available | 1 |
| CUTE: Measuring LLMs' Understanding of Their Tokens | Sep 23, 2024 | | Code Available | 1 |
| UDA-Bench: Revisiting Common Assumptions in Unsupervised Domain Adaptation Using a Standardized Framework | Sep 23, 2024 | Domain Adaptation, Unsupervised Domain Adaptation | Code Available | 1 |
| AIM 2024 Challenge on Video Saliency Prediction: Methods and Results | Sep 23, 2024 | Saliency Detection, Saliency Prediction | Code Available | 1 |
| TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features | Sep 22, 2024 | | Code Available | 1 |
| Can AI writing be salvaged? Mitigating Idiosyncrasies and Improving Human-AI Alignment in the Writing Process through Edits | Sep 22, 2024 | | Code Available | 1 |
| MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators | Sep 22, 2024 | Automatic Post-Editing, Machine Translation | Code Available | 1 |