| Rethinking Machine Unlearning in Image Generation Models | Jun 3, 2025 | Benchmarking, Image Generation | Code Available | 1 |
| ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions | Jun 3, 2025 | Benchmarking, Diversity | Code Available | 1 |
| TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression | Jun 3, 2025 | | Code Available | 1 |
| OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation | Jun 3, 2025 | Question Answering | Code Available | 1 |
| EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models | Jun 2, 2025 | Action Recognition, Action Segmentation | Code Available | 1 |
| OD3: Optimization-free Dataset Distillation for Object Detection | Jun 2, 2025 | Dataset Distillation, image-classification | Code Available | 1 |
| Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models | Jun 2, 2025 | Instruction Following, Reinforcement Learning (RL) | Code Available | 1 |
| WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue | Jun 2, 2025 | Task-Oriented Dialogue Systems | Code Available | 1 |
| Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation | Jun 2, 2025 | Misinformation, Talking Head Generation | Code Available | 1 |
| Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean | Jun 2, 2025 | Multiple-choice | Code Available | 1 |
| STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework | Jun 2, 2025 | Math | Code Available | 1 |
| EfficientFER: EfficientNetv2 Based Deep Learning Approach for Facial Expression Recognition | Jun 2, 2025 | Deep Learning, Emotion Recognition | Code Available | 1 |
| Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis | Jun 2, 2025 | | Code Available | 1 |
| scDataset: Scalable Data Loading for Deep Learning on Large-Scale Single-Cell Omics | Jun 2, 2025 | | Code Available | 1 |
| SEMNAV: A Semantic Segmentation-Driven Approach to Visual Semantic Navigation | Jun 2, 2025 | Domain Adaptation, Navigate | Code Available | 1 |
| SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost | Jun 2, 2025 | Image Segmentation, Semantic Segmentation | Code Available | 1 |
| GLoSS: Generative Language Models with Semantic Search for Sequential Recommendation | Jun 2, 2025 | Sequential Recommendation | Code Available | 1 |
| TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery | Jun 2, 2025 | Causal Discovery, Dataset Generation | Code Available | 1 |
| Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability | Jun 2, 2025 | Descriptive, Synthetic Data Generation | Code Available | 1 |
| IF-GUIDE: Influence Function-Guided Detoxification of LLMs | Jun 2, 2025 | | Code Available | 1 |
| AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions | Jun 2, 2025 | | Code Available | 1 |
| SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model | Jun 2, 2025 | Mixture-of-Experts, Unsupervised Pre-training | Code Available | 1 |
| Crowdsourcing MUSHRA Tests in the Age of Generative Speech Technologies: A Comparative Analysis of Subjective and Objective Testing Methods | Jun 1, 2025 | | Code Available | 1 |
| LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World | Jun 1, 2025 | document understanding, Entity Linking | Code Available | 1 |
| Protap: A Benchmark for Protein Modeling on Realistic Downstream Applications | Jun 1, 2025 | | Code Available | 1 |
| CODEMENV: Benchmarking Large Language Models on Code Migration | Jun 1, 2025 | Benchmarking | Code Available | 1 |
| PFMBench: Protein Foundation Model Benchmark | Jun 1, 2025 | model | Code Available | 1 |
| IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory | Jun 1, 2025 | Semantic Similarity, Semantic Textual Similarity | Code Available | 1 |
| Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs | May 31, 2025 | | Code Available | 1 |
| Con Instruction: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities | May 31, 2025 | ARC | Code Available | 1 |
| MIRROR: Cognitive Inner Monologue Between Conversational Turns for Persistent Reflection and Reasoning in Conversational LLMs | May 31, 2025 | | Code Available | 1 |
| Look mom, no experimental data! Learning to score protein-ligand interactions from simulations | May 31, 2025 | | Code Available | 1 |
| A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning for Any Atlas and Disorder | May 31, 2025 | Contrastive Learning, Meta-Learning | Code Available | 1 |
| An LLM Agent for Functional Bug Detection in Network Protocols | May 31, 2025 | | Code Available | 1 |
| AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time | May 31, 2025 | Benchmarking, Test-time Adaptation | Code Available | 1 |
| PAKTON: A Multi-Agent Framework for Question Answering in Long Legal Agreements | May 31, 2025 | Privacy Preserving, Question Answering | Code Available | 1 |
| dpmm: Differentially Private Marginal Models, a Library for Synthetic Tabular Data Generation | May 31, 2025 | Synthetic Data Generation, Tabular Data Generation | Code Available | 1 |
| SEED: A Benchmark Dataset for Sequential Facial Attribute Editing with Diffusion Models | May 31, 2025 | Attribute, Facial Editing | Code Available | 1 |
| DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments | May 31, 2025 | Large Language Model | Code Available | 1 |
| Neuro2Semantic: A Transfer Learning Framework for Semantic Reconstruction of Continuous Language from Human Intracranial EEG | May 31, 2025 | EEG, Text Generation | Code Available | 1 |
| Synergizing LLMs with Global Label Propagation for Multimodal Fake News Detection | May 31, 2025 | Fake News Detection | Code Available | 1 |
| DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis? | May 30, 2025 | Diagnostic, Medical Image Analysis | Code Available | 1 |
| Bench4KE: Benchmarking Automated Competency Question Generation | May 30, 2025 | Benchmarking, Question Generation | Code Available | 1 |
| CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning | May 30, 2025 | class-incremental learning, Class Incremental Learning | Code Available | 1 |
| Timing is Important: Risk-aware Fund Allocation based on Time-Series Forecasting | May 30, 2025 | Time Series, Time Series Forecasting | Code Available | 1 |
| Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting | May 30, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Chameleon: A MatMul-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data | May 30, 2025 | Continual Learning, Few-Shot Learning | Code Available | 1 |
| A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings | May 30, 2025 | Math | Code Available | 1 |
| Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors | May 30, 2025 | Human-Object Interaction Detection, Semantic Segmentation | Code Available | 1 |
| VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | May 30, 2025 | Question Answering, Spatial Reasoning | Code Available | 1 |