| Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs | Apr 22, 2024 | Misinformation | Code Available | 2 |
| Graphic Design with Large Multimodal Model | Apr 22, 2024 | Layout Generation, model | Code Available | 2 |
| An empirical study of LLaMA3 quantization: from LLMs to MLLMs | Apr 22, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| SwinFuSR: an image fusion-inspired model for RGB-guided thermal image super-resolution | Apr 22, 2024 | Image Super-Resolution, SSIM | Code Available | 2 |
| FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization | Apr 21, 2024 | Anomaly Detection, Position | Code Available | 2 |
| Bracketing Image Restoration and Enhancement with High-Low Frequency Decomposition | Apr 21, 2024 | Image Restoration | Code Available | 2 |
| How to Encode Domain Information in Relation Classification | Apr 21, 2024 | Classification, Relation | Code Available | 2 |
| AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs | Apr 21, 2024 | MMLU, Red Teaming | Code Available | 2 |
| Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation | Apr 21, 2024 | Semantic Segmentation, Video Object Segmentation | Code Available | 2 |
| Mixture of LoRA Experts | Apr 21, 2024 | | Code Available | 2 |
| Retrieval-Augmented Generation-based Relation Extraction | Apr 20, 2024 | Relation, Relation Extraction | Code Available | 2 |
| Vim4Path: Self-Supervised Vision Mamba for Histopathology Images | Apr 20, 2024 | Diagnostic, Mamba | Code Available | 2 |
| HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding | Apr 20, 2024 | cross-modal alignment, Visual Grounding | Code Available | 2 |
| Movie101v2: Improved Movie Narration Benchmark | Apr 20, 2024 | Video Captioning | Code Available | 2 |
| FakeBench: Probing Explainable Fake Image Detection via Large Multimodal Models | Apr 20, 2024 | Binary Classification, Fake Image Detection | Code Available | 2 |
| Augmented Object Intelligence with XR-Objects | Apr 20, 2024 | Object, Semantic Segmentation | Code Available | 2 |
| Large Language Models for Next Point-of-Interest Recommendation | Apr 19, 2024 | | Code Available | 2 |
| decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points | Apr 19, 2024 | Quantization | Code Available | 2 |
| MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model | Apr 19, 2024 | Object, Semantic Segmentation | Code Available | 2 |
| DeeperHistReg: Robust Whole Slide Images Registration Framework | Apr 19, 2024 | whole slide images | Code Available | 2 |
| Linearly-evolved Transformer for Pan-sharpening | Apr 19, 2024 | | Code Available | 2 |
| Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration | Apr 19, 2024 | Ensemble Learning | Code Available | 2 |
| FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation | Apr 19, 2024 | Decoder, Network Embedding | Code Available | 2 |
| LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency | Apr 19, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References | Apr 19, 2024 | Image Harmonization | Code Available | 2 |
| MAexp: A Generic Platform for RL-based Multi-Agent Exploration | Apr 19, 2024 | Diversity, Multi-agent Reinforcement Learning | Code Available | 2 |
| MoVA: Adapting Mixture of Vision Experts to Multimodal Context | Apr 19, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework | Apr 19, 2024 | Earth Observation, Segmentation | Code Available | 2 |
| Token-level Direct Preference Optimization | Apr 18, 2024 | Diversity | Code Available | 2 |
| Partial-to-Partial Shape Matching with Geometric Consistency | Apr 18, 2024 | | Code Available | 2 |
| Point-In-Context: Understanding Point Cloud via In-Context Learning | Apr 18, 2024 | In-Context Learning | Code Available | 2 |
| GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction | Apr 18, 2024 | Graph structure learning, Joint Entity and Relation Extraction | Code Available | 2 |
| SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation | Apr 18, 2024 | Autonomous Driving, Depth Estimation | Code Available | 2 |
| Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM | Apr 18, 2024 | Topic Models | Code Available | 2 |
| Model-free quantification of completeness, uncertainties, and outliers in atomistic machine learning using information theory | Apr 18, 2024 | Active Learning, Uncertainty Quantification | Code Available | 2 |
| MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space | Apr 18, 2024 | Drug Design | Code Available | 2 |
| LongEmbed: Extending Embedding Models for Long Context Retrieval | Apr 18, 2024 | 4k, 8k | Code Available | 2 |
| 6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction | Apr 18, 2024 | 3D Reconstruction, Image to 3D | Code Available | 2 |
| Partial Large Kernel CNNs for Efficient Super-Resolution | Apr 18, 2024 | Computational Efficiency, GPU | Code Available | 2 |
| An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Apr 18, 2024 | Contrastive Learning, CPU | Code Available | 2 |
| Transformer tricks: Removing weights for skipless transformers | Apr 18, 2024 | | Code Available | 2 |
| Introducing v0.5 of the AI Safety Benchmark from MLCommons | Apr 18, 2024 | | Code Available | 2 |
| ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer | Apr 18, 2024 | Image Shadow Removal, object-detection | Code Available | 2 |
| Aligning language models with human preferences | Apr 18, 2024 | Bayesian Inference | Code Available | 2 |
| Physics-informed active learning for accelerating quantum chemical simulations | Apr 18, 2024 | Active Learning, Uncertainty Quantification | Code Available | 2 |
| RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models | Apr 17, 2024 | Graph Neural Network | Code Available | 2 |
| VBR: A Vision Benchmark in Rome | Apr 17, 2024 | Autonomous Vehicles, Benchmarking | Code Available | 2 |
| Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution | Apr 17, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| Behavior Alignment: A New Perspective of Evaluating LLM-based Conversational Recommender Systems | Apr 17, 2024 | Conversational Recommendation, Recommendation Systems | Code Available | 2 |
| Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System | Apr 17, 2024 | All, Collaborative Filtering | Code Available | 2 |