SOTAVerified

16k

Papers

Showing 51–100 of 146 papers

Title | Status | Hype
COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval | Code | 1
Analyzing the Effectiveness of Large Language Models on Text-to-SQL Synthesis | Code | 1
DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera | Code | 1
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions | Code | 1
Denial-of-Service Poisoning Attacks against Large Language Models | Code | 1
SMYRF: Efficient Attention using Asymmetric Clustering | Code | 1
X-LRM: X-ray Large Reconstruction Model for Extremely Sparse-View Computed Tomography Recovery in One Second | Code | 0
Achieving Scalable Robot Autonomy via neurosymbolic planning using lightweight local LLM | Code | 0
An Empirical Study of Mamba-based Language Models | Code | 0
Author Profiling for Abuse Detection | Code | 0
BertRLFuzzer: A BERT and Reinforcement Learning Based Fuzzer | Code | 0
Calpric: Inclusive and Fine-grain Labeling of Privacy Policies with Crowdsourcing and Active Learning | Code | 0
CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels | Code | 0
Code-Switching Red-Teaming: LLM Evaluation for Safety and Multilingual Understanding | Code | 0
Deep Learning for Detecting Cyberbullying Across Multiple Social Media Platforms | Code | 0
Deep Learning for Hate Speech Detection in Tweets | Code | 0
Detecting Offensive Language in Tweets Using Deep Learning | Code | 0
Extending Context Window of Large Language Models from a Distributional Perspective | Code | 0
FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian | Code | 0
FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Code | 0
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge | Code | 0
Hadiths Classification Using a Novel Author-Based Hadith Classification Dataset (ABCD) | Code | 0
How Far Are We from Optimal Reasoning Efficiency? | Code | 0
ImageNet Training in Minutes | Code | 0
KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications | Code | 0
Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach | Code | 0
Leolani: a reference machine with a theory of mind for social communication | Code | 0
Model Editing for LLMs4Code: How Far are We? | Code | 0
MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling | Code | 0
Nonlinear Conjugate Gradients For Scaling Synchronous Distributed DNN Training | Code | 0
PSC: Extending Context Window of Large Language Models via Phase Shift Calibration | Code | 0
Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation | Code | 0
RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict | Code | 0
SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences | Code | 0
Spectrograms Are Sequences of Patches | Code | 0
TabFact: A Large-scale Dataset for Table-based Fact Verification | Code | 0
The Point Where Reality Meets Fantasy: Mixed Adversarial Generators for Image Splice Detection | Code | 0
Understanding Social Media Cross-Modality Discourse in Linguistic Space | Code | 0
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models | | 0
Fast and Full-Resolution Light Field Deblurring using a Deep Neural Network | | 0
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning | | 0
An AI-Assisted Skincare Routine Recommendation System in XR | | 0
Evaluating the Suitability of Different Intraoral Scan Resolutions for Deep Learning-Based Tooth Segmentation | | 0
EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts | | 0
A Multi-Task Network for Joint Specular Highlight Detection and Removal | | 0
Multilingual Visual Sentiment Concept Matching | | 0
MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences | | 0
End-to-end argumentation knowledge graph construction | | 0
Divide-Conquer-and-Merge: Memory- and Time-Efficient Holographic Displays | | 0
Transformers for Low-Resource Languages: Is Féidir Linn! | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1Suprime21'"1Unverified