| Vision and Language Reference Prompt into SAM for Few-shot Segmentation | Feb 2, 2025 | Segmentation | Code Available | 1 |
| CycleGuardian: A Framework for Automatic RespiratorySound classification Based on Improved Deep clustering and Contrastive Learning | Feb 2, 2025 | Audio Classification, Clustering | Code Available | 1 |
| SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters | Feb 2, 2025 | | Code Available | 1 |
| LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation | Feb 2, 2025 | Inductive Bias, Visual Prompting | Code Available | 1 |
| BrainOOD: Out-of-distribution Generalizable Brain Network Analysis | Feb 2, 2025 | | Code Available | 1 |
| MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models | Feb 2, 2025 | Benchmarking | Code Available | 1 |
| Error-quantified Conformal Inference for Time Series | Feb 2, 2025 | Prediction, Time Series | Code Available | 1 |
| AgentBreeder: Mitigating the AI Safety Impact of Multi-Agent Scaffolds via Self-Improvement | Feb 2, 2025 | | Code Available | 1 |
| UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs | Feb 2, 2025 | Graph Neural Network, Mixture-of-Experts | Code Available | 1 |
| DeepGate4: Efficient and Effective Representation Learning for Circuit Design at Scale | Feb 2, 2025 | Representation Learning | Code Available | 1 |
| Milmer: a Framework for Multiple Instance Learning based Multimodal Emotion Recognition | Feb 1, 2025 | EEG, Electroencephalogram (EEG) | Code Available | 1 |
| Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation | Feb 1, 2025 | Semantic Segmentation | Code Available | 1 |
| Work-Efficient Parallel Non-Maximum Suppression Kernels | Feb 1, 2025 | GPU, object-detection | Code Available | 1 |
| Prostate-Specific Foundation Models for Enhanced Detection of Clinically Significant Cancer | Feb 1, 2025 | Contrastive Learning, Diagnostic | Code Available | 1 |
| PM-MOE: Mixture of Experts on Private Model Parameters for Personalized Federated Learning | Feb 1, 2025 | Denoising, Federated Learning | Code Available | 1 |
| Parameter Efficient Fine-Tuning of Segment Anything Model | Feb 1, 2025 | model, parameter-efficient fine-tuning | Code Available | 1 |
| RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes | Feb 1, 2025 | Referring Expression, Referring Expression Comprehension | Code Available | 1 |
| Sub-Sequential Physics-Informed Learning with State Space Model | Feb 1, 2025 | | Code Available | 1 |
| Sagalee: an Open Source Automatic Speech Recognition Dataset for Oromo Language | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| K Nearest Neighbor-Guided Trajectory Similarity Learning | Feb 1, 2025 | | Code Available | 1 |
| NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning | Feb 1, 2025 | Referring Expression, Visual Grounding | Code Available | 1 |
| SigWavNet: Learning Multiresolution Signal Wavelet Network for Speech Emotion Recognition | Feb 1, 2025 | Denoising, Emotion Recognition | Code Available | 1 |
| VertiFormer: A Data-Efficient Multi-Task Transformer for Off-Road Robot Mobility | Feb 1, 2025 | | Code Available | 1 |
| BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution | Feb 1, 2025 | Binarization, Super-Resolution | Code Available | 1 |
| Physics-Inspired Distributed Radio Map Estimation | Feb 1, 2025 | Federated Learning, Physical Intuition | Code Available | 1 |
| Speculative Ensemble: Fast Large Language Model Ensemble via Speculation | Feb 1, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Generating crossmodal gene expression from cancer histopathology improves multimodal AI predictions | Feb 1, 2025 | Prognosis | Code Available | 1 |
| Estimating LLM Uncertainty with Evidence | Feb 1, 2025 | | Code Available | 1 |
| Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation | Feb 1, 2025 | Membership Inference Attack, RAG | Code Available | 1 |
| A Benchmark for Incremental Micro-expression Recognition | Jan 31, 2025 | Incremental Learning, Micro Expression Recognition | Code Available | 1 |
| Full-scale Representation Guided Network for Retinal Vessel Segmentation | Jan 31, 2025 | Retinal Vessel Segmentation | Code Available | 1 |
| The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training | Jan 31, 2025 | Scheduling | Code Available | 1 |
| Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior | Jan 31, 2025 | GPU | Code Available | 1 |
| SHARPIE: A Modular Framework for Reinforcement Learning and Human-AI Interaction Experiments | Jan 31, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 1 |
| Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models | Jan 31, 2025 | | Code Available | 1 |
| XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses | Jan 31, 2025 | Action Localization, Action Recognition | Code Available | 1 |
| Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them | Jan 31, 2025 | | Code Available | 1 |
| Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations | Jan 31, 2025 | | Code Available | 1 |
| Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | Jan 31, 2025 | GPU, Quantization | Code Available | 1 |
| LiDAR Loop Closure Detection using Semantic Graphs with Graph Attention Networks | Jan 31, 2025 | Graph Attention, Graph Embedding | Code Available | 1 |
| RMDM: Radio Map Diffusion Model with Physics Informed | Jan 31, 2025 | Denoising, Intelligent Communication | Code Available | 1 |
| ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation | Jan 31, 2025 | Question Answering, Video Question Answering | Code Available | 1 |
| Vintix: Action Model via In-Context Reinforcement Learning | Jan 31, 2025 | Decision Making, In-Context Reinforcement Learning | Code Available | 1 |
| Low-Rank Adapting Models for Sparse Autoencoders | Jan 31, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Fixing the Double Penalty in Data-Driven Weather Forecasting Through a Modified Spherical Harmonic Loss Function | Jan 31, 2025 | Weather Forecasting | Code Available | 1 |
| RIGNO: A Graph-based framework for robust and accurate operator learning for PDEs on arbitrary domains | Jan 31, 2025 | Diversity, Graph Neural Network | Code Available | 1 |
| Scalable-Softmax Is Superior for Attention | Jan 31, 2025 | Information Retrieval, Language Modeling | Code Available | 1 |
| Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search | Jan 31, 2025 | Denoising, Video Alignment | Code Available | 1 |
| DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits | Jan 31, 2025 | ECG Classification, Sensitivity | Code Available | 1 |
| RGB-Event ISP: The Dataset and Benchmark | Jan 31, 2025 | Denoising | Code Available | 1 |