SOTAVerified

Benchmarking

Papers

Showing 601–610 of 5548 papers

Title | Status | Hype
Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design | - | 0
Trade-offs in Privacy-Preserving Eye Tracking through Iris Obfuscation: A Benchmarking Study | Code | 0
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | - | 0
TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning | - | 0
LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs | Code | 1
SortBench: Benchmarking LLMs based on their ability to sort lists | - | 0
TorchFX: A modern approach to Audio DSP with PyTorch and GPU acceleration | Code | 2
Adaptive Shrinkage Estimation For Personalized Deep Kernel Regression In Modeling Brain Trajectories | Code | 0
Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms | Code | 0
SydneyScapes: Image Segmentation for Australian Environments | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified