SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2201-2250 of 659,983 papers

Title | Status | Hype
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | Code | 4
Video Seal: Open and Efficient Video Watermarking | Code | 4
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization | Code | 4
TimeGPT-1 | Code | 4
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Code | 4
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Code | 4
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer | Code | 4
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving | Code | 4
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback | Code | 4
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks | Code | 4
Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert | Code | 4
Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | Code | 4
A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs | Code | 4
Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves | Code | 4
AnyDoor: Zero-shot Object-level Image Customization | Code | 4
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS | Code | 4
S^3Gaussian: Self-Supervised Street Gaussians for Autonomous Driving | Code | 4
Tracking Everything Everywhere All at Once | Code | 4
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Code | 4
sbi reloaded: a toolkit for simulation-based inference workflows | Code | 4
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations | Code | 4
Zero-shot forecasting of chaotic systems | Code | 4
One Embedder, Any Task: Instruction-Finetuned Text Embeddings | Code | 4
JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Code | 4
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models | Code | 4
A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective | Code | 4
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models | Code | 4
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding | Code | 4
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators | Code | 4
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal | Code | 4
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model | Code | 4
Generalizable Humanoid Manipulation with 3D Diffusion Policies | Code | 4
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA | Code | 4
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images | Code | 4
Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models | Code | 4
Multimodal Chain-of-Thought Reasoning in Language Models | Code | 4
Efficient Automated Deep Learning for Time Series Forecasting | Code | 4
SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM | Code | 4
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection | Code | 4
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems | Code | 4
GeoCalib: Learning Single-image Calibration with Geometric Optimization | Code | 4
ManimML: Communicating Machine Learning Architectures with Animation | Code | 4
Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving | Code | 4
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization | Code | 4
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models | Code | 4
Reasoning with Language Model is Planning with World Model | Code | 4
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Code | 4
DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code | 4
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | Code | 4
Flamingo: a Visual Language Model for Few-Shot Learning | Code | 4