SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7801–7850 of 661,570 papers

Title | Status | Hype
Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level | Code | 2
WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing | Code | 2
Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading | Code | 2
RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models | Code | 2
SynJax: Structured Probability Distributions for JAX | Code | 2
GPTopic: Dynamic and Interactive Topic Representations | Code | 2
A Length-Extrapolatable Transformer | Code | 2
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs | Code | 2
Democratizing Neural Machine Translation with OPUS-MT | Code | 2
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | Code | 2
YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection | Code | 2
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Code | 2
EEG-Deformer: A Dense Convolutional Transformer for Brain-computer Interfaces | Code | 2
An Unsupervised Approach to Achieve Supervised-Level Explainability in Healthcare Records | Code | 2
CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling | Code | 2
LogAI: A Library for Log Analytics and Intelligence | Code | 2
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model | Code | 2
ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints | Code | 2
Geometric Latent Diffusion Models for 3D Molecule Generation | Code | 2
Accelerating Self-Play Learning in Go | Code | 2
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models | Code | 2
MoEUT: Mixture-of-Experts Universal Transformers | Code | 2
LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms | Code | 2
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Code | 2
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs | Code | 2
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization | Code | 2
Cross-Image Relational Knowledge Distillation for Semantic Segmentation | Code | 2
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Code | 2
Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection | Code | 2
R3LIVE: A Robust, Real-time, RGB-colored, LiDAR-Inertial-Visual tightly-coupled state Estimation and mapping package | Code | 2
ActiveRAG: Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents | Code | 2
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | Code | 2
SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization | Code | 2
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device | Code | 2
MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities | Code | 2
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment | Code | 2
FEC: Fast Euclidean Clustering for Point Cloud Segmentation | Code | 2
PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging | Code | 2
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt | Code | 2
UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor | Code | 2
Exploring Orthogonality in Open World Object Detection | Code | 2
You Name It, I Run It: An LLM Agent to Execute Tests of Arbitrary Projects | Code | 2
Equinox: neural networks in JAX via callable PyTrees and filtered transformations | Code | 2
Deep Architectures for Content Moderation and Movie Content Rating | Code | 2
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models | Code | 2
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration | Code | 2
Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction | Code | 2
Investigating Tradeoffs in Real-World Video Super-Resolution | Code | 2
SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search | Code | 2
Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI | Code | 2