SOTAVerified

Benchmarking

Papers

Showing 3151-3175 of 5548 papers

Title | Status | Hype
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition | | 0
FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification | | 0
FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding | | 0
Few-Shot Defect Segmentation Leveraging Abundant Normal Training Samples Through Normal Background Regularization and Crop-and-Paste Operation | | 0
Few-Shot Learning for Industrial Time Series: A Comparative Analysis Using the Example of Screw-Fastening Process Monitoring | | 0
Fiber Bundle Morphisms as a Framework for Modeling Many-to-Many Maps | | 0
E(3)-equivariant models cannot learn chirality: Field-based molecular generation | | 0
Filter Methods for Feature Selection in Supervised Machine Learning Applications -- Review and Benchmark | | 0
Finance Language Model Evaluation (FLaME) | | 0
Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging | | 0
Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada | | 0
Fine-Grained Classification of Pedestrians in Video: Benchmark and State of the Art | | 0
FineText: Text Classification via Attention-based Language Model Fine-tuning | | 0
Fine-tuning LLaMA 2 interference: a comparative study of language implementations for optimal efficiency | | 0
FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets | | 0
FinLoRA: Benchmarking LoRA Methods for Fine-Tuning LLMs on Financial Datasets | | 0
FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance | | 0
FIORD: A Fisheye Indoor-Outdoor Dataset with LIDAR Ground Truth for 3D Scene Reconstruction and Benchmarking | | 0
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures | | 0
FixCLR: Negative-Class Contrastive Learning for Semi-Supervised Domain Generalization | | 0
FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems | | 0
FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning | | 0
FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents | | 0
FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models | | 0
FlowMind: Automatic Workflow Generation with LLMs | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified