| Window Token Concatenation for Efficient Visual Large Language Models | Apr 5, 2025 | Token Reduction | Code Available | 1 |
| Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection | Apr 5, 2025 | parameter estimation | Code Available | 1 |
| A Survey of Pathology Foundation Model: Progress and Future Directions | Apr 5, 2025 | Benchmarking, Multiple Instance Learning | Code Available | 1 |
| Learning-Based Conformal Tube MPC for Safe Control in Interactive Multi-Agent Systems | Apr 4, 2025 | Conformal Prediction, Model Predictive Control | Code Available | 1 |
| OLAF: An Open Life Science Analysis Framework for Conversational Bioinformatics Powered by Large Language Models | Apr 4, 2025 | Data Visualization | Code Available | 1 |
| Discovering Partially Known Ordinary Differential Equations: a Case Study on the Chemical Kinetics of Cellulose Degradation | Apr 4, 2025 | regression, Symbolic Regression | Code Available | 1 |
| Meta-DAN: towards an efficient prediction strategy for page-level handwritten text recognition | Apr 4, 2025 | GPU, Handwritten Text Recognition | Code Available | 1 |
| SARLANG-1M: A Benchmark for Vision-Language Modeling in SAR Image Understanding | Apr 4, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking | Apr 4, 2025 | Document Ranking, Information Retrieval | Code Available | 1 |
| Single-Pass Document Scanning for Question Answering | Apr 4, 2025 | Question Answering | Code Available | 1 |
| Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models | Apr 4, 2025 | Denoising, Video Generation | Code Available | 1 |
| Sparsity-Promoting Reachability Analysis and Optimization of Constrained Zonotopes | Apr 4, 2025 | State Estimation | Code Available | 1 |
| The AI Cosmologist I: An Agentic System for Automated Data Analysis | Apr 4, 2025 | scientific discovery | Code Available | 1 |
| Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation | Apr 4, 2025 | Clustering, Hallucination | Code Available | 1 |
| Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images | Apr 4, 2025 | Image Animation, Motion Estimation | Code Available | 1 |
| Monte Carlo Graph Coloring | Apr 4, 2025 | | Code Available | 1 |
| Language Models Are Implicitly Continuous | Apr 4, 2025 | Language Modelling | Code Available | 1 |
| Multi-Flow: Multi-View-Enriched Normalizing Flows for Industrial Anomaly Detection | Apr 4, 2025 | Anomaly Detection | Code Available | 1 |
| Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction | Apr 4, 2025 | Attribute, Language Modeling | Code Available | 1 |
| IPA-CHILDES & G2P+: Feature-Rich Resources for Cross-Lingual Phonology and Phonemic Language Modeling | Apr 3, 2025 | Grapheme-to-Phoneme Conversion, Language Modeling | Code Available | 1 |
| Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation | Apr 3, 2025 | State Space Models | Code Available | 1 |
| A Physics-Informed Meta-Learning Framework for the Continuous Solution of Parametric PDEs on Arbitrary Geometries | Apr 3, 2025 | Decoder, Meta-Learning | Code Available | 1 |
| ESC: Erasing Space Concept for Knowledge Deletion | Apr 3, 2025 | | Code Available | 1 |
| APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers | Apr 3, 2025 | Quantization | Code Available | 1 |
| Narrative Studio: Visual narrative exploration using LLMs and Monte Carlo Tree Search | Apr 3, 2025 | | Code Available | 1 |
| MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Apr 3, 2025 | Mixture-of-Experts, Quantization | Code Available | 1 |
| Adaptive Frequency Enhancement Network for Remote Sensing Image Semantic Segmentation | Apr 3, 2025 | Semantic Segmentation | Code Available | 1 |
| Charm: The Missing Piece in ViT fine-tuning for Image Aesthetic Assessment | Apr 3, 2025 | | Code Available | 1 |
| MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception | Apr 3, 2025 | Multi-Task Learning, Transfer Learning | Code Available | 1 |
| Large (Vision) Language Models are Unsupervised In-Context Learners | Apr 3, 2025 | GSM8K, In-Context Learning | Code Available | 1 |
| PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation | Apr 3, 2025 | Object, Pose Estimation | Code Available | 1 |
| Generative Evaluation of Complex Reasoning in Large Language Models | Apr 3, 2025 | Benchmarking, Memorization | Code Available | 1 |
| AnesBench: Multi-Dimensional Evaluation of LLM Reasoning in Anesthesiology | Apr 3, 2025 | | Code Available | 1 |
| JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model | Apr 3, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Noise Calibration and Spatial-Frequency Interactive Network for STEM Image Enhancement | Apr 3, 2025 | Image Enhancement | Code Available | 1 |
| F-ViTA: Foundation Model Guided Visible to Thermal Translation | Apr 3, 2025 | Scene Understanding, Style Transfer | Code Available | 1 |
| MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs | Apr 3, 2025 | | Code Available | 1 |
| Robustly identifying concepts introduced during chat fine-tuning using crosscoders | Apr 3, 2025 | | Code Available | 1 |
| Do Two AI Scientists Agree? | Apr 3, 2025 | | Code Available | 1 |
| Multi-Head Adaptive Graph Convolution Network for Sparse Point Cloud-Based Human Activity Recognition | Apr 3, 2025 | Activity Recognition, Human Activity Recognition | Code Available | 1 |
| Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline | Apr 3, 2025 | object-detection, Object Detection | Code Available | 1 |
| Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results | Apr 3, 2025 | Instance Segmentation, object-detection | Code Available | 1 |
| Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Apr 3, 2025 | 3D Object Detection, cross-modal alignment | Code Available | 1 |
| GMR-Conv: An Efficient Rotation and Reflection Equivariant Convolution Kernel Using Gaussian Mixture Rings | Apr 3, 2025 | | Code Available | 1 |
| MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities | Apr 3, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation | Apr 3, 2025 | Denoising | Code Available | 1 |
| TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection | Apr 3, 2025 | Anomaly Detection, Unsupervised Anomaly Detection | Code Available | 1 |
| STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection | Apr 3, 2025 | Instruction Following, Language Modeling | Code Available | 1 |
| Detecting Lip-Syncing Deepfakes: Vision Temporal Transformer for Analyzing Mouth Inconsistencies | Apr 2, 2025 | Face Swapping | Code Available | 1 |
| Representation Bending for Large Language Model Safety | Apr 2, 2025 | Language Modeling, Language Modelling | Code Available | 1 |