SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 11601–11650 of 661,570 papers

Title | Status | Hype
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models | Code | 2
CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition | Code | 2
A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence | Code | 2
NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario | Code | 2
Enabling Large Language Models to Generate Text with Citations | Code | 2
gRNAde: Geometric Deep Learning for 3D RNA inverse design | Code | 2
A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification | Code | 2
torchgfn: A PyTorch GFlowNet library | Code | 2
ExpertPrompting: Instructing Large Language Models to be Distinguished Experts | Code | 2
Adapting Language Models to Compress Contexts | Code | 2
Lawyer LLaMA Technical Report | Code | 2
Unpaired Image-to-Image Translation via Neural Schrödinger Bridge | Code | 2
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models | Code | 2
Sparse4D v2: Recurrent Temporal Fusion with Sparse Model | Code | 2
Improving Factuality and Reasoning in Language Models through Multiagent Debate | Code | 2
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning | Code | 2
Link Prediction without Graph Neural Networks | Code | 2
SAD: Segment Any RGBD | Code | 2
DetGPT: Detect What You Need via Reasoning | Code | 2
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models | Code | 2
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach | Code | 2
Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning | Code | 2
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training | Code | 2
Efficient Multi-Scale Attention Module with Cross-Spatial Learning | Code | 2
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | Code | 2
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation | Code | 2
REC-MV: REconstructing 3D Dynamic Cloth from Monocular Videos | Code | 2
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra | Code | 2
ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models | Code | 2
Perception Test: A Diagnostic Benchmark for Multimodal Video Models | Code | 2
SMT 2.0: A Surrogate Modeling Toolbox with a focus on Hierarchical and Mixed Variables Gaussian Processes | Code | 2
MAGE: Machine-generated Text Detection in the Wild | Code | 2
Hierarchical Integration Diffusion Model for Realistic Image Deblurring | Code | 2
Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection | Code | 2
Multimodal Automated Fact-Checking: A Survey | Code | 2
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars | Code | 2
Training Diffusion Models with Reinforcement Learning | Code | 2
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation | Code | 2
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | Code | 2
VDT: General-purpose Video Diffusion Transformers via Mask Modeling | Code | 2
Boosting Knowledge Graph Generation from Tabular Data with RML Views | Code | 2
LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities | Code | 2
Mist: Towards Improved Adversarial Examples for Diffusion Models | Code | 2
LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On | Code | 2
Lion: Adversarial Distillation of Proprietary Large Language Models | Code | 2
ControlVideo: Training-free Controllable Text-to-Video Generation | Code | 2
VanillaNet: the Power of Minimalism in Deep Learning | Code | 2
Evaluating the Performance of Large Language Models on GAOKAO Benchmark | Code | 2
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning | Code | 2
Knowledge-Design: Pushing the Limit of Protein Design via Knowledge Refinement | Code | 2
Page 233 of 13232