| Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models | Feb 20, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Multi-Objective Causal Bayesian Optimization | Feb 20, 2025 | Bayesian Optimization, Decision Making | Code Available | 1 |
| SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images | Feb 20, 2025 | Image Segmentation, Segmentation | Code Available | 1 |
| Exploiting Deblurring Networks for Radiance Fields | Feb 20, 2025 | Computational Efficiency, Deblurring | Code Available | 1 |
| Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning | Feb 20, 2025 | Attribute, Diagnostic | Code Available | 1 |
| CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models | Feb 20, 2025 | Blocking, Language Modeling | Code Available | 1 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| CLIPPER: Compression enables long-context synthetic data generation | Feb 20, 2025 | Claim Verification, Synthetic Data Generation | Code Available | 1 |
| Unstructured Evidence Attribution for Long Context Query Focused Summarization | Feb 20, 2025 | Query-focused Summarization | Code Available | 1 |
| SEA-HELM: Southeast Asian Holistic Evaluation of Language Models | Feb 20, 2025 | | Code Available | 1 |
| Pre-training Graph Neural Networks on Molecules by Using Subgraph-Conditioned Graph Information Bottleneck | Feb 20, 2025 | Graph Classification, Graph Neural Network | Code Available | 1 |
| Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps | Feb 20, 2025 | Question Answering | Code Available | 1 |
| Bridging Text and Vision: A Multi-View Text-Vision Registration Approach for Cross-Modal Place Recognition | Feb 20, 2025 | Cross-modal place recognition, Natural Language Understanding | Code Available | 1 |
| CDGS: Confidence-Aware Depth Regularization for 3D Gaussian Splatting | Feb 20, 2025 | 3DGS, 3D Reconstruction | Code Available | 1 |
| StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following | Feb 20, 2025 | Instruction Following | Code Available | 1 |
| MedFuncta: Modality-Agnostic Representations Based on Efficient Neural Fields | Feb 20, 2025 | Medical Image Analysis, Meta-Learning | Code Available | 1 |
| I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search | Feb 20, 2025 | AutoML, Code Generation | Code Available | 1 |
| NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization | Feb 20, 2025 | geo-localization | Code Available | 1 |
| Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs | Feb 20, 2025 | Cross-Lingual Transfer, Machine Translation | Code Available | 1 |
| Noisy Test-Time Adaptation in Vision-Language Models | Feb 20, 2025 | Test-time Adaptation | Code Available | 1 |
| LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models | Feb 20, 2025 | | Code Available | 1 |
| How to Get Your LLM to Generate Challenging Problems for Evaluation | Feb 20, 2025 | Code Completion, Math | Code Available | 1 |
| Pursuing Top Growth with Novel Loss Function | Feb 20, 2025 | | Code Available | 1 |
| H3DE-Net: Efficient and Accurate 3D Landmark Detection in Medical Imaging | Feb 20, 2025 | Computational Efficiency, Medical Image Analysis | Code Available | 1 |
| Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis | Feb 20, 2025 | Articles | Code Available | 1 |
| FlowAgent: Achieving Compliance and Flexibility for Workflow Agents | Feb 20, 2025 | | Code Available | 1 |
| Improving LLM-powered Recommendations with Personalized Information | Feb 19, 2025 | Recommendation Systems | Code Available | 1 |
| Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data | Feb 19, 2025 | Fine-Grained Visual Recognition, Pneumonia Detection | Code Available | 1 |
| PeerQA: A Scientific Question Answering Dataset from Peer Reviews | Feb 19, 2025 | answerability prediction, Answer Generation | Code Available | 1 |
| Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning | Feb 19, 2025 | Mathematical Reasoning | Code Available | 1 |
| RobustX: Robust Counterfactual Explanations Made Easy | Feb 19, 2025 | counterfactual, Decision Making | Code Available | 1 |
| MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification | Feb 19, 2025 | Multimodal Reasoning | Code Available | 1 |
| Judging the Judges: A Collection of LLM-Generated Relevance Judgements | Feb 19, 2025 | Information Retrieval | Code Available | 1 |
| Reasoning with Reinforced Functional Token Tuning | Feb 19, 2025 | Math | Code Available | 1 |
| Deep Learning for VWAP Execution in Crypto Markets: Beyond the Volume Curve | Feb 19, 2025 | | Code Available | 1 |
| Triad: Vision Foundation Model for 3D Magnetic Resonance Imaging | Feb 19, 2025 | Cancer Classification, Computed Tomography (CT) | Code Available | 1 |
| Spiking Point Transformer for Point Cloud Classification | Feb 19, 2025 | Classification, Point Cloud Classification | Code Available | 1 |
| LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization | Feb 19, 2025 | | Code Available | 1 |
| SPEX: Scaling Feature Interaction Explanations for LLMs | Feb 19, 2025 | | Code Available | 1 |
| 2.5D U-Net with Depth Reduction for 3D CryoET Object Identification | Feb 19, 2025 | Electron Tomography, Keypoint Detection | Code Available | 1 |
| Which Attention Heads Matter for In-Context Learning? | Feb 19, 2025 | In-Context Learning | Code Available | 1 |
| Latent Distribution Decoupling: A Probabilistic Framework for Uncertainty-Aware Multimodal Emotion Recognition | Feb 19, 2025 | Emotion Recognition, Multimodal Emotion Recognition | Code Available | 1 |
| Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? | Feb 19, 2025 | Sequential Recommendation | Code Available | 1 |
| Benchmarking LLMs for Political Science: A United Nations Perspective | Feb 19, 2025 | Benchmarking, Decision Making | Code Available | 1 |
| Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models | Feb 19, 2025 | | Code Available | 1 |
| AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence | Feb 19, 2025 | Code Generation, Decision Making | Code Available | 1 |
| From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions | Feb 19, 2025 | | Code Available | 1 |
| Collaborative Retrieval for Large Language Model-based Conversational Recommender Systems | Feb 19, 2025 | Collaborative Filtering, Conversational Recommendation | Code Available | 1 |
| Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling | Feb 18, 2025 | Combinatorial Optimization, Job Shop Scheduling | Code Available | 1 |
| A Cognitive Writing Perspective for Constrained Long-Form Text Generation | Feb 18, 2025 | Form, Text Generation | Code Available | 1 |