
Multiple-choice

Papers


Showing 361–370 of 1107 papers

Title	Date	Tasks	Status	Hype
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do	Sep 17, 2024	Language Modeling, Language Modelling	Unverified	0
Annealed Winner-Takes-All for Motion Forecasting	Sep 17, 2024	All, Autonomous Driving	Code Available	1
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia	Sep 13, 2024	Math, Multiple-choice	Unverified	0
Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement	Sep 10, 2024	Multiple-choice, Sentence	Unverified	0
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach	Sep 9, 2024	Computational Efficiency, Continual Pretraining	Code Available	0
COLUMBUS: Evaluating COgnitive Lateral Understanding through Multiple-choice reBUSes	Sep 6, 2024	Multiple-choice, Question Answering	Code Available	0
MaterialBENCH: Evaluating College-Level Materials Science Problem-Solving Abilities of Large Language Models	Sep 5, 2024	Multiple-choice	Unverified	0
CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models	Sep 4, 2024	GSM8K, Math	Code Available	2
Training on the Benchmark Is Not All You Need	Sep 3, 2024	All, Multiple-choice	Code Available	1
The Role of Large Language Models in Musicology: Are We Ready to Trust the Machines?	Sep 3, 2024	Multiple-choice, Question Generation	Unverified	0


No leaderboard results yet.