SOTAVerified

HellaSwag

Papers

Showing 1-25 of 39 papers

Title | Status | Hype
Training Compute-Optimal Large Language Models | Code | 6
DataDecide: How to Predict Best Pretraining Data with Small Experiments | Code | 3
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Code | 3
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models | Code | 1
An Open Source Data Contamination Report for Large Language Models | Code | 1
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation | Code | 1
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization | Code | 1
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | Code | 1
You can remove GPT2's LayerNorm by fine-tuning | Code | 0
Attacks on Node Attributes in Graph Neural Networks | Code | 0
FinerWeb-10BT: Refining Web Data with LLM-Based Line-Level Filtering | Code | 0
GraDA: Graph Generative Data Augmentation for Commonsense Reasoning | Code | 0
HellaSwag: Can a Machine Really Finish Your Sentence? | Code | 0
In-Contextual Gender Bias Suppression for Large Language Models | Code | 0
On Curriculum Learning for Commonsense Reasoning | Code | 0
SaGE: Evaluating Moral Consistency in Large Language Models | Code | 0
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation | Code | 0
metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models | Code | 0
Toward Adversarial Training on Contextualized Language Representation | Code | 0
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks | Code | 0
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs |  | 0
Promises, Outlooks and Challenges of Diffusion Language Modeling |  | 0
Comparing Test Sets with Item Response Theory |  | 0
English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too |  | 0

No leaderboard results yet.