| FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition | Dec 16, 2023 | Benchmarking, Facial Expression Recognition | —Unverified | 0 |
| FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification | Jul 17, 2024 | Benchmarking, Continual Learning | —Unverified | 0 |
| FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding | Nov 16, 2021 | Benchmarking, Natural Language Understanding | —Unverified | 0 |
| Few-Shot Defect Segmentation Leveraging Abundant Normal Training Samples Through Normal Background Regularization and Crop-and-Paste Operation | Jul 18, 2020 | Anomaly Detection, Benchmarking | —Unverified | 0 |
| Few-Shot Learning for Industrial Time Series: A Comparative Analysis Using the Example of Screw-Fastening Process Monitoring | Jun 16, 2025 | Benchmarking, Few-Shot Learning | —Unverified | 0 |
| Fiber Bundle Morphisms as a Framework for Modeling Many-to-Many Maps | Mar 15, 2022 | Benchmarking, Sentiment Analysis | —Unverified | 0 |
| E(3)-equivariant models cannot learn chirality: Field-based molecular generation | Feb 24, 2024 | Benchmarking, Graph Neural Network | —Unverified | 0 |
| Filter Methods for Feature Selection in Supervised Machine Learning Applications -- Review and Benchmark | Nov 23, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| Finance Language Model Evaluation (FLaME) | Jun 18, 2025 | Benchmarking, Language Model Evaluation | —Unverified | 0 |
| Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging | Jun 6, 2023 | Benchmarking, Sentence | —Unverified | 0 |
| Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada | Apr 1, 2021 | Benchmarking, Language Identification | —Unverified | 0 |
| Fine-Grained Classification of Pedestrians in Video: Benchmark and State of the Art | May 20, 2016 | Benchmarking, General Classification | —Unverified | 0 |
| FineText: Text Classification via Attention-based Language Model Fine-tuning | Oct 25, 2019 | Benchmarking, Classification | —Unverified | 0 |
| Fine-tuning LLaMA 2 interference: a comparative study of language implementations for optimal efficiency | Jan 30, 2025 | Benchmarking, Language Modeling | —Unverified | 0 |
| FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets | Oct 7, 2023 | Benchmarking, named-entity-recognition | —Unverified | 0 |
| FinLoRA: Benchmarking LoRA Methods for Fine-Tuning LLMs on Financial Datasets | May 26, 2025 | Benchmarking, GPU | —Unverified | 0 |
| FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance | Mar 7, 2025 | Articles, Benchmarking | —Unverified | 0 |
| FIORD: A Fisheye Indoor-Outdoor Dataset with LIDAR Ground Truth for 3D Scene Reconstruction and Benchmarking | Apr 2, 2025 | 3D Scene Reconstruction, Benchmarking | —Unverified | 0 |
| FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures | Jan 1, 2024 | Benchmarking, Instance Segmentation | —Unverified | 0 |
| FixCLR: Negative-Class Contrastive Learning for Semi-Supervised Domain Generalization | Jun 25, 2025 | Benchmarking, Contrastive Learning | —Unverified | 0 |
| FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems | Jun 8, 2023 | Benchmarking, Edge-computing | —Unverified | 0 |
| FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning | Jan 1, 2024 | Benchmarking, Federated Learning | —Unverified | 0 |
| FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents | Jun 21, 2024 | Benchmarking | —Unverified | 0 |
| FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models | Jun 3, 2025 | Benchmarking, Domain Adaptation | —Unverified | 0 |
| FlowMind: Automatic Workflow Generation with LLMs | Mar 17, 2024 | Benchmarking, Question Answering | —Unverified | 0 |
| Fluorescent Neuronal Cells v2: Multi-Task, Multi-Format Annotations for Deep Learning in Microscopy | Jul 26, 2023 | Benchmarking, object-detection | —Unverified | 0 |
| FMBench: Benchmarking Fairness in Multimodal Large Language Models on Medical Tasks | Oct 1, 2024 | Benchmarking, Fairness | —Unverified | 0 |
| ∀uto∃∨∧L: Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks | Oct 11, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| ForamViT-GAN: Exploring New Paradigms in Deep Learning for Micropaleontological Image Analysis | Apr 9, 2023 | Benchmarking, Deep Learning | —Unverified | 0 |
| Forecasting Lithium-Ion Battery Longevity with Limited Data Availability: Benchmarking Different Machine Learning Algorithms | Dec 10, 2023 | Battery cycle life prediction, Benchmarking | —Unverified | 0 |
| Forecasting NIFTY 50 benchmark Index using Seasonal ARIMA time series models | Jan 9, 2020 | Benchmarking, Time Series | —Unverified | 0 |
| FOR-instance: a UAV laser scanning benchmark dataset for semantic and instance segmentation of individual trees | Sep 3, 2023 | Benchmarking, Instance Segmentation | —Unverified | 0 |
| FORLAPS: An Innovative Data-Driven Reinforcement Learning Approach for Prescriptive Process Monitoring | Jan 17, 2025 | Benchmarking, Data Augmentation | —Unverified | 0 |
| Formal Covariate Benchmarking to Bound Omitted Variable Bias | Jun 18, 2023 | Benchmarking, Sensitivity | —Unverified | 0 |
| FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents | Jun 2, 2025 | Benchmarking, Form | —Unverified | 0 |
| Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization | Apr 14, 2025 | Benchmarking, Earth Observation | —Unverified | 0 |
| Foundations for learning from noisy quantum experiments | Apr 28, 2022 | Benchmarking | —Unverified | 0 |
| Found in Translation: Measuring Multilingual LLM Consistency as Simple as Translate then Evaluate | May 28, 2025 | Benchmarking | —Unverified | 0 |
| FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting | Oct 15, 2024 | Benchmarking, energy management | —Unverified | 0 |
| Framework and Benchmarks for Combinatorial and Mixed-variable Bayesian Optimization | Jun 16, 2023 | Bayesian Optimization, Benchmarking | —Unverified | 0 |
| FRED: The Florence RGB-Event Drone Dataset | Jun 5, 2025 | Benchmarking, Trajectory Forecasting | —Unverified | 0 |
| Free Performance Gain from Mixing Multiple Partially Labeled Samples in Multi-label Image Classification | May 24, 2024 | Benchmarking, Data Augmentation | —Unverified | 0 |
| From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction | Mar 15, 2022 | 3D geometry, Benchmarking | —Unverified | 0 |
| From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano | Jul 5, 2024 | Attribute, Benchmarking | —Unverified | 0 |
| From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems | Oct 24, 2024 | Benchmarking, Common Sense Reasoning | —Unverified | 0 |
| From Code to Play: Benchmarking Program Search for Games Using Large Language Models | Dec 5, 2024 | Atari Games, Benchmarking | —Unverified | 0 |
| From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks | Apr 14, 2022 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 |
| From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT | May 17, 2024 | Benchmarking, Multiple-choice | —Unverified | 0 |
| From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation | May 24, 2025 | Articles, Benchmarking | —Unverified | 0 |
| From Grounding to Planning: Benchmarking Bottlenecks in Web Agents | Sep 3, 2024 | Benchmarking | —Unverified | 0 |