SOTAVerified

16k

Papers

Showing 51–100 of 146 papers

Title | Status | Hype
Long Context Alignment with Short Instructions and Synthesized Positions | - | 0
SnapKV: LLM Knows What You are Looking for Before Generation | Code | 3
FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Code | 0
Long-form factuality in large language models | Code | 4
RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict | Code | 0
An AI-Assisted Skincare Routine Recommendation System in XR | - | 0
Human Evaluation of English--Irish Transformer-Based NMT | - | 0
Transformers for Low-Resource Languages: Is Féidir Linn! | - | 0
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention | Code | 1
Training-Free Long-Context Scaling of Large Language Models | Code | 3
Divide-Conquer-and-Merge: Memory- and Time-Efficient Holographic Displays | - | 0
Hydragen: High-Throughput LLM Inference with Shared Prefixes | Code | 1
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K | Code | 2
Analyzing the Effectiveness of Large Language Models on Text-to-SQL Synthesis | Code | 1
Calpric: Inclusive and Fine-grain Labeling of Privacy Policies with Crowdsourcing and Active Learning | Code | 0
Detours for Navigating Instructional Videos | - | 0
Compositional Zero-Shot Learning for Attribute-Based Object Reference in Human-Robot Interaction | - | 0
Beyond Accuracy: Statistical Measures and Benchmark for Evaluation of Representation from Self-Supervised Learning | - | 0
Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers | Code | 1
Improved prompting and process for writing user personas with LLMs, using qualitative interviews: Capturing behaviour and personality traits of users | - | 0
Scaling Laws of RoPE-based Extrapolation | Code | 1
Retrieval meets Long Context Large Language Models | - | 0
Home Electricity Data Generator (HEDGE): An open-access tool for the generation of electric vehicle, residential demand, and PV generation profiles | Code | 1
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models | - | 0
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding | Code | 3
Code Llama: Open Foundation Models for Code | Code | 6
Giraffe: Adventures in Expanding Context Lengths in LLMs | Code | 2
Hadiths Classification Using a Novel Author-Based Hadith Classification Dataset (ABCD) | Code | 0
Detecting and Preventing Hallucinations in Large Vision Language Models | Code | 1
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding | Code | 2
The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks | Code | 1
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention | Code | 1
BertRLFuzzer: A BERT and Reinforcement Learning Based Fuzzer | Code | 0
AI-assisted Code Authoring at Scale: Fine-tuning, deploying, and mixed methods evaluation | - | 0
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens | - | 0
Understanding Social Media Cross-Modality Discourse in Linguistic Space | Code | 0
In-Context Learning with Many Demonstration Examples | Code | 1
Leveraging Summary Guidance on Medical Report Summarization | - | 0
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions | Code | 1
Spectrograms Are Sequences of Patches | Code | 0
COLING 2022 Shared Task: LED Finteuning and Recursive Summary Generation for Automatic Summarization of Chapters from Novels | - | 0
CIRCLe: Color Invariant Representation Learning for Unbiased Classification of Skin Lesions | Code | 1
Investigating Efficiently Extending Transformers for Long Input Summarization | Code | 3
0/1 Deep Neural Networks via Block Coordinate Descent | - | 0
Improved two-stage hate speech classification for twitter based on Deep Neural Networks | - | 0
Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction | - | 0
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Code | 6
There’s a Time and Place for Reasoning Beyond the Image | Code | 1
Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction | Code | 1
There is a Time and Place for Reasoning Beyond the Image | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1Suprime21'"1Unverified