SOTAVerified
Benchmarking
Papers
No papers found.
Benchmark Results
CloudEval-YAML
1 submission
↑ higher is better
#  Model        Metric  Claimed  Verified  Status
1  GPT-4 Turbo  ACC     0.56     —         Unverified