| Similarity-Quantized Relative Difference Learning for Improved Molecular Activity Prediction | Jan 15, 2025 | Activity Prediction, Benchmarking | —Unverified | 0 | 0 |
| WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences | Jun 16, 2024 | Benchmarking, Spatial Reasoning | —Unverified | 0 | 0 |
| Simple Feedfoward Neural Networks are Almost All You Need for Time Series Forecasting | Mar 30, 2025 | All, Benchmarking | —Unverified | 0 | 0 |
| VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment | Jun 16, 2024 | Action Understanding, Benchmarking | —Unverified | 0 | 0 |
| Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries | Apr 2, 2025 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| VeriContaminated: Assessing LLM-Driven Verilog Coding for Data Contamination | Mar 17, 2025 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts | May 14, 2025 | Benchmarking, Form | —Unverified | 0 | 0 |
| Verifiable Format Control for Large Language Model Generations | Feb 6, 2025 | Benchmarking, Instruction Following | —Unverified | 0 | 0 |
| Simulation-Based Sensitivity Analysis in Optimal Treatment Regimes and Causal Decomposition with Individualized Interventions | Jun 23, 2025 | Benchmarking, Sensitivity | —Unverified | 0 | 0 |
| Simulation of Large Scale Neural Networks for Evaluation Applications | May 20, 2018 | Benchmarking | —Unverified | 0 | 0 |
| An Evolutionary Algorithm For the Vehicle Routing Problem with Drones with Interceptions | Sep 21, 2024 | Benchmarking, Scheduling | —Unverified | 0 | 0 |
| SinaTools: Open Source Toolkit for Arabic Natural Language Processing | Nov 3, 2024 | Benchmarking, Lemmatization | —Unverified | 0 | 0 |
| SINDy vs Hard Nonlinearities and Hidden Dynamics: a Benchmarking Study | Mar 1, 2024 | Benchmarking | —Unverified | 0 | 0 |
| VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity | Mar 14, 2025 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| An evaluation framework for comparing causal inference models | Aug 31, 2022 | Benchmarking, Causal Inference | —Unverified | 0 | 0 |
| Single-Cell Omics Arena: A Benchmark Study for Large Language Models on Cell Type Annotation Using Single-Cell Data | Dec 3, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Single Stage Prediction with Embedded Topic Modeling of Online Reviews for Mobile App Management | Feb 19, 2018 | Benchmarking, Management | —Unverified | 0 | 0 |
| An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models | May 23, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Site2Vec: a reference frame invariant algorithm for vector embedding of protein-ligand binding sites | Mar 18, 2020 | Benchmarking, Drug Discovery | —Unverified | 0 | 0 |
| An Empirical Study of Super-resolution on Low-resolution Micro-expression Recognition | Oct 16, 2023 | Benchmarking, Micro Expression Recognition | —Unverified | 0 | 0 |
| Six-CD: Benchmarking Concept Removals for Text-to-image Diffusion Models | Jan 1, 2025 | Benchmarking | —Unverified | 0 | 0 |
| An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction | Nov 3, 2023 | Benchmarking, Sentence | —Unverified | 0 | 0 |
| Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation | Jan 27, 2025 | Benchmarking, C++ code | —Unverified | 0 | 0 |
| VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models | May 21, 2025 | Benchmarking, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping | Oct 21, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Sketch 'n Solve: An Efficient Python Package for Large-Scale Least Squares Using Randomized Numerical Linear Algebra | Sep 22, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Sketchtopia: A Dataset and Foundational Agents for Benchmarking Asynchronous Multimodal Communication with Iconic Feedback | Jan 1, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Skills and Liquidity Barriers to Youth Employment: Medium-term Evidence from a Cash Benchmarking Experiment in Rwanda | Sep 18, 2022 | Benchmarking | —Unverified | 0 | 0 |
| SkyRover: A Modular Simulator for Cross-Domain Pathfinding | Feb 13, 2025 | Benchmarking | —Unverified | 0 | 0 |
| SlangDIT: Benchmarking LLMs in Interpretative Slang Translation | May 20, 2025 | Benchmarking, Sentence | —Unverified | 0 | 0 |
| A Case for Dataset Specific Profiling | Aug 1, 2022 | Benchmarking, Model Selection | —Unverified | 0 | 0 |
| An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets | Dec 2, 2023 | Benchmarking | —Unverified | 0 | 0 |
| An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification | Nov 24, 2023 | Benchmarking, image-classification | —Unverified | 0 | 0 |
| AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING | Dec 20, 2023 | Benchmarking | —Unverified | 0 | 0 |
| SMiCRM: A Benchmark Dataset of Mechanistic Molecular Images | Jul 25, 2024 | Benchmarking | —Unverified | 0 | 0 |
| An efficient and perceptually motivated auditory neural encoding and decoding algorithm for spiking neural networks | Sep 3, 2019 | Benchmarking, speech-recognition | —Unverified | 0 | 0 |
| Smiling Women Pitching Down: Auditing Representational and Presentational Gender Biases in Image Generative AI | May 17, 2023 | Benchmarking | —Unverified | 0 | 0 |
| SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge | May 17, 2024 | Benchmarking, Social Media Popularity Prediction | —Unverified | 0 | 0 |
| An efficiency analysis of Spanish airports | Nov 8, 2023 | Benchmarking | —Unverified | 0 | 0 |
| An EEG-based Stereoscopic Research to Reveal the Brain's Response to What Happens Before and After Watching 2D and 3D Movies | Mar 13, 2019 | Benchmarking, EEG | —Unverified | 0 | 0 |
| An Early Warning Sign of Critical Transition in The Antarctic Ice Sheet -- A Data Driven Tool for Spatiotemporal Tipping Point | Apr 21, 2020 | Benchmarking, Clustering | —Unverified | 0 | 0 |
| SMPLy Benchmarking 3D Human Pose Estimation in the Wild | Dec 4, 2020 | 3D Human Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| Absolute Ranking: An Essential Normalization for Benchmarking Optimization Algorithms | Sep 6, 2024 | Bayesian Inference, Benchmarking | —Unverified | 0 | 0 |
| VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution | May 6, 2022 | Benchmarking, Speaker Identification | —Unverified | 0 | 0 |
| SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos | Apr 14, 2022 | Benchmarking, Multiple Object Tracking | —Unverified | 0 | 0 |
| SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents | Jul 2, 2021 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Window-of-interest based Multi-objective Evolutionary Search for Satisficing Concepts | Jul 4, 2017 | Benchmarking | —Unverified | 0 | 0 |
| Social Bias Probing: Fairness Benchmarking for Language Models | Nov 15, 2023 | Benchmarking, Fairness | —Unverified | 0 | 0 |
| Sockpuppet Detection in Wikipedia: A Corpus of Real-World Deceptive Writing for Linking Identities | Oct 24, 2013 | Benchmarking | —Unverified | 0 | 0 |
| Socratic-PRMBench: Benchmarking Process Reward Models with Systematic Reasoning Patterns | May 29, 2025 | Benchmarking | —Unverified | 0 | 0 |