| CaseGen: A Benchmark for Multi-Stage Legal Case Documents Generation | Feb 25, 2025 | Legal Reasoning | Code Available | 1 |
| Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning | Feb 25, 2025 | | Code Available | 1 |
| Steering Language Model to Stable Speech Emotion Recognition via Contextual Perception and Chain of Thought | Feb 25, 2025 | Emotion Recognition, Language Modeling | Code Available | 1 |
| Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs | Feb 25, 2025 | Benchmarking, Chunking | Code Available | 1 |
| Multi-Perspective Data Augmentation for Few-shot Object Detection | Feb 25, 2025 | Data Augmentation, Few-Shot Object Detection | Code Available | 1 |
| LLM Knows Geometry Better than Algebra: Numerical Understanding of LLM-Based Agents in A Trading Arena | Feb 25, 2025 | | Code Available | 1 |
| Training Consistency Models with Variational Noise Coupling | Feb 25, 2025 | Image Generation | Code Available | 1 |
| FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models | Feb 25, 2025 | Fact Checking | Code Available | 1 |
| Can Multimodal LLMs Perform Time Series Anomaly Detection? | Feb 25, 2025 | Anomaly Detection, Irregular Time Series | Code Available | 1 |
| Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric | Feb 24, 2025 | Diversity | Code Available | 1 |
| ReFocus: Reinforcing Mid-Frequency and Key-Frequency Modeling for Multivariate Time Series Forecasting | Feb 24, 2025 | Multivariate Time Series Forecasting, Time Series | Code Available | 1 |
| Snoopy: Effective and Efficient Semantic Join Discovery via Proxy Columns | Feb 24, 2025 | Contrastive Learning, Graph Matching | Code Available | 1 |
| AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models | Feb 24, 2025 | Logical Reasoning, Multiple-choice | Code Available | 1 |
| Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch | Feb 24, 2025 | | Code Available | 1 |
| CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought | Feb 24, 2025 | Mathematical Reasoning, Misinformation | Code Available | 1 |
| LongAttn: Selecting Long-context Training Data via Token-level Attention | Feb 24, 2025 | Sentence | Code Available | 1 |
| Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions | Feb 24, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Training a Generally Curious Agent | Feb 24, 2025 | Decision Making, Efficient Exploration | Code Available | 1 |
| Function-Space Learning Rates | Feb 24, 2025 | | Code Available | 1 |
| Hallucination Detection in LLMs Using Spectral Features of Attention Maps | Feb 24, 2025 | Hallucination | Code Available | 1 |
| CalibRefine: Deep Learning-Based Online Automatic Targetless LiDAR-Camera Calibration with Iterative and Attention-Driven Post-Refinement | Feb 24, 2025 | Autonomous Driving, Camera Calibration | Code Available | 1 |
| HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization | Feb 24, 2025 | Diversity, Fact Verification | Code Available | 1 |
| Posterior Inference with Diffusion Models for High-dimensional Black-box Optimization | Feb 24, 2025 | Bayesian Optimization, Uncertainty Quantification | Code Available | 1 |
| COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs | Feb 24, 2025 | | Code Available | 1 |
| MAD-AD: Masked Diffusion for Unsupervised Brain Anomaly Detection | Feb 24, 2025 | Anatomy, Anomaly Detection | Code Available | 1 |
| PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance | Feb 24, 2025 | | Code Available | 1 |
| REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective | Feb 24, 2025 | | Code Available | 1 |
| Towards Hierarchical Rectified Flow | Feb 24, 2025 | | Code Available | 1 |
| SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding | Feb 24, 2025 | cross-modal alignment, Visual Grounding | Code Available | 1 |
| FADE: Why Bad Descriptions Happen to Good Features | Feb 24, 2025 | | Code Available | 1 |
| Tidiness Score-Guided Monte Carlo Tree Search for Visual Tabletop Rearrangement | Feb 24, 2025 | | Code Available | 1 |
| LongSafety: Evaluating Long-Context Safety of Large Language Models | Feb 24, 2025 | | Code Available | 1 |
| MambaFlow: A Novel and Flow-guided State Space Model for Scene Flow Estimation | Feb 24, 2025 | Autonomous Driving, Decoder | Code Available | 1 |
| Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning | Feb 24, 2025 | Self-Supervised Learning | Code Available | 1 |
| MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference | Feb 24, 2025 | | Code Available | 1 |
| LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences | Feb 24, 2025 | Hallucination, Information Retrieval | Code Available | 1 |
| Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam | Feb 24, 2025 | | Code Available | 1 |
| CipherPrune: Efficient and Scalable Private Transformer Inference | Feb 24, 2025 | Privacy Preserving | Code Available | 1 |
| JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning | Feb 24, 2025 | Legal Reasoning | Code Available | 1 |
| AeroReformer: Aerial Referring Transformer for UAV-based Referring Image Segmentation | Feb 23, 2025 | Image Segmentation, Segmentation | Code Available | 1 |
| Code Summarization Beyond Function Level | Feb 23, 2025 | Code Summarization, Few-Shot Learning | Code Available | 1 |
| A Reverse Mamba Attention Network for Pathological Liver Segmentation | Feb 23, 2025 | Computational Efficiency, Liver Segmentation | Code Available | 1 |
| OptionZero: Planning with Learned Options | Feb 23, 2025 | Atari Games | Code Available | 1 |
| CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale | Feb 23, 2025 | | Code Available | 1 |
| Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression | Feb 23, 2025 | Efficient Neural Network, Quantization | Code Available | 1 |
| FanChuan: A Multilingual and Graph-Structured Benchmark For Parody Detection and Analysis | Feb 23, 2025 | Sentence, Sentence Embedding | Code Available | 1 |
| Automatic Input Rewriting Improves Translation with Large Language Models | Feb 23, 2025 | Machine Translation, Text Simplification | Code Available | 1 |
| Towards Optimal Adversarial Robust Reinforcement Learning with Infinity Measurement Error | Feb 23, 2025 | Adversarial Robustness, Deep Reinforcement Learning | Code Available | 1 |
| BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning | Feb 23, 2025 | Benchmarking | Code Available | 1 |
| Are Sparse Autoencoders Useful? A Case Study in Sparse Probing | Feb 23, 2025 | Inductive Bias, Large Language Model | Code Available | 1 |