SOTAVerified

Benchmarking

Papers

Showing 991-1000 of 5548 papers

Title | Status | Hype
Improving and Benchmarking Offline Reinforcement Learning Algorithms | Code | 1
End-to-end Knowledge Retrieval with Multi-modal Queries | Code | 1
Accurate and Efficient Structural Ensemble Generation of Macrocyclic Peptides using Internal Coordinate Diffusion | Code | 1
IDToolkit: A Toolkit for Benchmarking and Developing Inverse Design Algorithms in Nanophotonics | Code | 1
SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models | Code | 1
Decoding the Underlying Meaning of Multimodal Hateful Memes | Code | 1
Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks | Code | 1
KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration | Code | 1
ReadMe++: Benchmarking Multilingual Language Models for Multi-Domain Readability Assessment | Code | 1
Exploring Large Language Models for Classical Philology | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified