| Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations | Dec 23, 2024 | Benchmarking, Question Answering | —Unverified | 0 |
| Galvatron: An Automatic Distributed System for Efficient Foundation Model Training | Apr 30, 2025 | Benchmarking | —Unverified | 0 |
| Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks | May 24, 2024 | Benchmarking, Decoder | —Unverified | 0 |
| Hybrid data driven/thermal simulation model for comfort assessment | Sep 4, 2023 | Benchmarking | —Unverified | 0 |
| GANmut: Generating and Modifying Facial Expressions | Jun 16, 2024 | Benchmarking, Diversity | —Unverified | 0 |
| GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR | Apr 15, 2025 | Benchmarking | —Unverified | 0 |
| FactLens: Benchmarking Fine-Grained Fact Verification | Nov 8, 2024 | Benchmarking, Fact Verification | —Unverified | 0 |
| GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics | Mar 27, 2025 | Benchmarking, Natural Language Queries | —Unverified | 0 |
| FACT: Learning Governing Abstractions Behind Integer Sequences | Sep 20, 2022 | Benchmarking | —Unverified | 0 |
| Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy | Dec 4, 2024 | Anatomy, Benchmarking | —Unverified | 0 |
| Face Morphing Attack Generation & Detection: A Comprehensive Survey | Nov 3, 2020 | Benchmarking, Face Recognition | —Unverified | 0 |
| Face Detection on Surveillance Images | Oct 22, 2019 | Benchmarking, Face Detection | —Unverified | 0 |
| A Survey of Small Language Models | Oct 25, 2024 | Benchmarking, Model Compression | —Unverified | 0 |
| Hybrid Long Document Summarization using C2F-FAR and ChatGPT: A Practical Study | Jun 1, 2023 | Articles, Benchmarking | —Unverified | 0 |
| Hydrological time series forecasting using simple combinations: Big data testing and investigations on one-year ahead river flow predictability | Jan 2, 2020 | Benchmarking, Management | —Unverified | 0 |
| Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning | Apr 19, 2024 | Benchmarking, counterfactual | —Unverified | 0 |
| ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content | Mar 13, 2025 | Benchmarking, Image Generation | —Unverified | 0 |
| A Unified Taylor Framework for Revisiting Attribution Methods | Aug 21, 2020 | Benchmarking, Decision Making | —Unverified | 0 |
| Benchmarking the Benchmark -- Analysis of Synthetic NIDS Datasets | Apr 19, 2021 | Benchmarking, Intrusion Detection | —Unverified | 0 |
| GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing | Jun 30, 2024 | Benchmarking, counterfactual | —Unverified | 0 |
| GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases | May 25, 2024 | Benchmarking, Hallucination | —Unverified | 0 |
| Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis | Aug 22, 2024 | Benchmarking | —Unverified | 0 |
| A Survey of Predictive Maintenance Methods: An Analysis of Prognostics via Classification and Regression | Jun 25, 2025 | Benchmarking, Management | —Unverified | 0 |
| Extraction of clinical information from the non-invasive fetal electrocardiogram | May 27, 2016 | Benchmarking, Heart Rate Variability | —Unverified | 0 |
| Extensible Logging and Empirical Attainment Function for IOHexperimenter | Sep 28, 2021 | Benchmarking | —Unverified | 0 |