| Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models | Mar 30, 2025 | Benchmarking, Relational Reasoning | Unverified | 0 |
| MHTS: Multi-Hop Tree Structure Framework for Generating Difficulty-Controllable QA Datasets for RAG Evaluation | Mar 29, 2025 | Answer Generation, Benchmarking | Unverified | 0 |
| Unsupervised Anomaly Detection in Multivariate Time Series across Heterogeneous Domains | Mar 29, 2025 | Anomaly Detection, Benchmarking | Code Available | 0 |
| CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis | Mar 29, 2025 | Benchmarking, Large Language Model | Unverified | 0 |
| RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations | Mar 29, 2025 | Benchmarking, reinforcement-learning | Unverified | 0 |
| SimBank: from Simulation to Solution in Prescriptive Process Monitoring | Mar 28, 2025 | Benchmarking | Unverified | 0 |
| Generalization Bias in Large Language Model Summarization of Scientific Research | Mar 28, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| LIM: Large Interpolator Model for Dynamic Reconstruction | Mar 28, 2025 | 4D reconstruction, Benchmarking | Unverified | 0 |
| An Advanced Ensemble Deep Learning Framework for Stock Price Prediction Using VAE, Transformer, and LSTM Model | Mar 28, 2025 | Algorithmic Trading, Benchmarking | Unverified | 0 |
| Benchmarking Ultra-Low-Power μNPUs | Mar 28, 2025 | Benchmarking | Unverified | 0 |