| ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations | Feb 16, 2025 | Text Segmentation | Code Available | 1 |
| Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding | Feb 16, 2025 | Attribute, Object | Code Available | 1 |
| MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models | Feb 16, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models | Feb 16, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| DEEPER Insight into Your User: Directed Persona Refinement for Dynamic Persona Modeling | Feb 16, 2025 | Prediction, Recommendation Systems | Code Available | 1 |
| Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning | Feb 15, 2025 | Causal Discovery, Structured Prediction | Code Available | 1 |
| Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy | Feb 15, 2025 | Point Cloud Registration, Scene Understanding | Code Available | 1 |
| LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization | Feb 15, 2025 | feature selection, RAG | Code Available | 1 |
| A Comprehensive Survey of Deep Learning for Multivariate Time Series Forecasting: A Channel Strategy Perspective | Feb 15, 2025 | Multivariate Time Series Forecasting, Time Series | Code Available | 1 |
| Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model | Feb 15, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| BASE-SQL: A powerful open source Text-To-SQL baseline approach | Feb 15, 2025 | In-Context Learning, Large Language Model | Code Available | 1 |
| CoPEFT: Fast Adaptation Framework for Multi-Agent Collaborative Perception with Parameter-Efficient Fine-Tuning | Feb 15, 2025 | Domain Adaptation, parameter-efficient fine-tuning | Code Available | 1 |
| Reduced Order Modeling with Shallow Recurrent Decoder Networks | Feb 15, 2025 | Computational Efficiency, Decoder | Code Available | 1 |
| CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs | Feb 15, 2025 | Computational Efficiency, GPU | Code Available | 1 |
| BalanceBenchmark: A Survey for Imbalanced Learning | Feb 15, 2025 | Survey | Code Available | 1 |
| Forget the Data and Fine-Tuning! Just Fold the Network to Compress | Feb 14, 2025 | Model Compression | Code Available | 1 |
| Can Large Language Model Agents Balance Energy Systems? | Feb 14, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| SegX: Improving Interpretability of Clinical Image Diagnosis with Segmentation-based Enhancement | Feb 14, 2025 | Decision Making, Medical Image Analysis | Code Available | 1 |
| Manual2Skill: Learning to Read Manuals and Acquire Robotic Skills for Furniture Assembly Using Vision-Language Models | Feb 14, 2025 | Motion Planning, Pose Estimation | Code Available | 1 |
| QMaxViT-Unet+: A Query-Based MaxViT-Unet with Edge Enhancement for Scribble-Supervised Segmentation of Medical Images | Feb 14, 2025 | Decoder, Image Segmentation | Code Available | 1 |
| Classifier-free Guidance with Adaptive Scaling | Feb 14, 2025 | Denoising | Code Available | 1 |
| Evaluating and Improving Graph-based Explanation Methods for Multi-Agent Coordination | Feb 14, 2025 | Graph Learning | Code Available | 1 |
| AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting | Feb 14, 2025 | Multivariate Time Series Forecasting, Representation Learning | Code Available | 1 |
| A Lightweight and Effective Image Tampering Localization Network with Vision Mamba | Feb 14, 2025 | Decoder, Mamba | Code Available | 1 |
| X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Compromising Usability | Feb 14, 2025 | Safety Alignment | Code Available | 1 |
| MITO: Enabling Non-Line-of-Sight Perception using Millimeter-waves through Real-World Datasets and Simulation Tools | Feb 14, 2025 | Semantic Segmentation | Code Available | 1 |
| A synergistic CNN-transformer network with pooling attention fusion for hyperspectral image classification | Feb 14, 2025 | Hyperspectral Image Classification, image-classification | Code Available | 1 |
| CISSIR: Beam Codebooks with Self-Interference Reduction Guarantees for Integrated Sensing and Communication Beyond 5G | Feb 14, 2025 | Integrated sensing and communication, ISAC | Code Available | 1 |
| Automated Muscle and Fat Segmentation in Computed Tomography for Comprehensive Body Composition Analysis | Feb 13, 2025 | Segmentation | Code Available | 1 |
| Large Images are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting | Feb 13, 2025 | 3D Reconstruction, Novel View Synthesis | Code Available | 1 |
| Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection | Feb 13, 2025 | object-detection, Object Detection | Code Available | 1 |
| A Contextual-Aware Position Encoding for Sequential Recommendation | Feb 13, 2025 | Position, Recommendation Systems | Code Available | 1 |
| Rethinking Evaluation Metrics for Grammatical Error Correction: Why Use a Different Evaluation Process than Human? | Feb 13, 2025 | Grammatical Error Correction, Sentence | Code Available | 1 |
| Medicine on the Edge: Comparative Performance Analysis of On-Device LLMs for Clinical Reasoning | Feb 13, 2025 | Computational Efficiency | Code Available | 1 |
| QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language | Feb 13, 2025 | Safety Alignment | Code Available | 1 |
| Inverse Design with Dynamic Mode Decomposition | Feb 13, 2025 | Uncertainty Quantification | Code Available | 1 |
| SQ-GAN: Semantic Image Communications Using Masked Vector Quantization | Feb 13, 2025 | Image Compression, Quantization | Code Available | 1 |
| You Do Not Fully Utilize Transformer's Representation Capacity | Feb 13, 2025 | | Code Available | 1 |
| AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection | Feb 13, 2025 | Anomaly Detection, Graph Anomaly Detection | Code Available | 1 |
| Biologically Plausible Brain Graph Transformer | Feb 13, 2025 | | Code Available | 1 |
| Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs | Feb 13, 2025 | Benchmarking, Retrieval | Code Available | 1 |
| Reevaluating Policy Gradient Methods for Imperfect-Information Games | Feb 13, 2025 | counterfactual, Deep Reinforcement Learning | Code Available | 1 |
| LOB-Bench: Benchmarking Generative AI for Finance -- an Application to Limit Order Book Data | Feb 13, 2025 | Benchmarking, State Space Models | Code Available | 1 |
| GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis | Feb 13, 2025 | Cross-Modal Retrieval, Image Captioning | Code Available | 1 |
| PTZ-Calib: Robust Pan-Tilt-Zoom Camera Calibration | Feb 13, 2025 | Camera Calibration, Computational Efficiency | Code Available | 1 |
| Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative | Feb 13, 2025 | Imputation, Time Series | Code Available | 1 |
| Spiking Neural Networks for Temporal Processing: Status Quo and Future Prospects | Feb 13, 2025 | | Code Available | 1 |
| A Deep Inverse-Mapping Model for a Flapping Robotic Wing | Feb 13, 2025 | | Code Available | 1 |
| Enhancing the Utility of Higher-Order Information in Relational Learning | Feb 13, 2025 | Relational Reasoning | Code Available | 1 |
| MC2SleepNet: Multi-modal Cross-masking with Contrastive Learning for Sleep Stage Classification | Feb 13, 2025 | Contrastive Learning, EEG | Code Available | 1 |