| LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models | Aug 28, 2024 | Benchmarking, Logical Reasoning | Code Available | 1 |
| Toward Automated Simulation Research Workflow through LLM Prompt Engineering Design | Aug 28, 2024 | Experimental Design, Prompt Engineering | Code Available | 1 |
| Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models | Aug 28, 2024 | 2k, 4k | Code Available | 1 |
| μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context | Aug 28, 2024 | | Code Available | 1 |
| On the Benefits of Visual Stabilization for Frame- and Event-based Perception | Aug 28, 2024 | Event-based vision, Motion Estimation | Code Available | 1 |
| VFLIP: A Backdoor Defense for Vertical Federated Learning via Identification and Purification | Aug 28, 2024 | Anomaly Detection, backdoor defense | Code Available | 1 |
| TrafficGamer: Reliable and Flexible Traffic Simulation for Safety-Critical Scenarios with Game-Theoretic Oracles | Aug 28, 2024 | | Code Available | 1 |
| Legilimens: Practical and Unified Content Moderation for Large Language Model Services | Aug 28, 2024 | Data Augmentation, Language Modeling | Code Available | 1 |
| EPO: Hierarchical LLM Agents with Environment Preference Optimization | Aug 28, 2024 | Action Generation, Decision Making | Code Available | 1 |
| More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding | Aug 28, 2024 | | Code Available | 1 |
| MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion | Aug 28, 2024 | Pedestrian Detection | Code Available | 1 |
| NAS-BNN: Neural Architecture Search for Binary Neural Networks | Aug 28, 2024 | Neural Architecture Search, object-detection | Code Available | 1 |
| Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models | Aug 28, 2024 | In-Context Learning, named-entity-recognition | Code Available | 1 |
| Distribution Backtracking Builds A Faster Convergence Trajectory for Diffusion Distillation | Aug 28, 2024 | | Code Available | 1 |
| Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need | Aug 28, 2024 | All, Mamba | Code Available | 1 |
| Trading with Time Series Causal Discovery: An Empirical Study | Aug 28, 2024 | Causal Discovery, Time Series | Code Available | 1 |
| A Survey on Facial Expression Recognition of Static and Dynamic Emotions | Aug 28, 2024 | cross-modal alignment, Facial Expression Recognition | Code Available | 1 |
| SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge | Aug 28, 2024 | DeepFake Detection, Face Swapping | Code Available | 1 |
| CBF-LLM: Safe Control for LLM Alignment | Aug 28, 2024 | Text Generation | Code Available | 1 |
| Segmentation-guided Layer-wise Image Vectorization with Gradient Fills | Aug 28, 2024 | Segmentation, Vector Graphics | Code Available | 1 |
| Can Unconfident LLM Annotations Be Used for Confident Conclusions? | Aug 27, 2024 | valid | Code Available | 1 |
| What makes math problems hard for reinforcement learning: a case study | Aug 27, 2024 | Math, Reinforcement Learning (RL) | Code Available | 1 |
| DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding | Aug 27, 2024 | document understanding, Optical Character Recognition (OCR) | Code Available | 1 |
| LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming | Aug 27, 2024 | 3DGS, SSIM | Code Available | 1 |
| PAT: Pruning-Aware Tuning for Large Language Models | Aug 27, 2024 | | Code Available | 1 |
| Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection | Aug 27, 2024 | Decoder, object-detection | Code Available | 1 |
| CMTA: Cross-Modal Temporal Alignment for Event-guided Video Deblurring | Aug 27, 2024 | Deblurring, Video Deblurring | Code Available | 1 |
| T-FAKE: Synthesizing Thermal Images for Facial Landmarking | Aug 27, 2024 | Autonomous Driving, Style Transfer | Code Available | 1 |
| MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders | Aug 27, 2024 | Decoder, Mamba | Code Available | 1 |
| Mamba2MIL: State Space Duality Based Multiple Instance Learning for Computational Pathology | Aug 27, 2024 | feature selection, Multiple Instance Learning | Code Available | 1 |
| GPU-Accelerated Counterfactual Regret Minimization | Aug 27, 2024 | counterfactual, GPU | Code Available | 1 |
| ERX: A Fast Real-Time Anomaly Detection Algorithm for Hyperspectral Line Scanning | Aug 27, 2024 | Anomaly Detection, CPU | Code Available | 1 |
| CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task | Aug 27, 2024 | parameter-efficient fine-tuning, Visual Prompt Tuning | Code Available | 1 |
| LyCon: Lyrics Reconstruction from the Bag-of-Words Using Large Language Models | Aug 27, 2024 | | Code Available | 1 |
| MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder | Aug 27, 2024 | Optical Flow Estimation, Privacy Preserving | Code Available | 1 |
| No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery | Aug 27, 2024 | | Code Available | 1 |
| Measuring Human Contribution in AI-Assisted Content Generation | Aug 27, 2024 | | Code Available | 1 |
| TourSynbio: A Multi-Modal Large Model and Agent Framework to Bridge Text and Protein Sequences for Protein Engineering | Aug 27, 2024 | Multiple-choice, Protein Folding | Code Available | 1 |
| YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection | Aug 27, 2024 | | Code Available | 1 |
| AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems | Aug 27, 2024 | | Code Available | 1 |
| RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models | Aug 27, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| Enhancing License Plate Super-Resolution: A Layout-Aware and Character-Driven Approach | Aug 27, 2024 | License Plate Recognition, Optical Character Recognition | Code Available | 1 |
| Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Aug 27, 2024 | Decoder, object-detection | Code Available | 1 |
| XG-NID: Dual-Modality Network Intrusion Detection using a Heterogeneous Graph Neural Network and Large Language Model | Aug 27, 2024 | Graph Neural Network, Intrusion Detection | Code Available | 1 |
| DRL-Based Federated Self-Supervised Learning for Task Offloading and Resource Allocation in ISAC-Enabled Vehicle Edge Computing | Aug 27, 2024 | CPU, Edge-computing | Code Available | 1 |
| DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays | Aug 27, 2024 | Anatomy, Computed Tomography (CT) | Code Available | 1 |
| SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models | Aug 27, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| DCT-CryptoNets: Scaling Private Inference in the Frequency Domain | Aug 27, 2024 | image-classification, Image Classification | Code Available | 1 |
| Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos | Aug 26, 2024 | Form, Language Modelling | Code Available | 1 |
| Center Direction Network for Grasping Point Localization on Cloths | Aug 26, 2024 | Keypoint Detection | Code Available | 1 |