SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2001–2050 of 659,983 papers

Title | Status | Hype
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook | Code | 4
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering | Code | 4
An Empirical Study of Instruction-tuning Large Language Models in Chinese | Code | 4
3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers | Code | 4
OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text | Code | 4
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? | Code | 4
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | Code | 4
Retrieval-Generation Synergy Augmented Large Language Models | Code | 4
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference | Code | 4
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion | Code | 4
TimeGPT-1 | Code | 4
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | Code | 4
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models | Code | 4
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | Code | 4
Guiding Instruction-based Image Editing via Multimodal Large Language Models | Code | 4
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation | Code | 4
TradeMaster: A Holistic Quantitative Trading Platform Empowered by Reinforcement Learning | Code | 4
Efficient Post-training Quantization with FP8 Formats | Code | 4
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models | Code | 4
Safurai 001: New Qualitative Approach for Code LLM Evaluation | Code | 4
Baichuan 2: Open Large-scale Language Models | Code | 4
Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees | Code | 4
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing | Code | 4
Advancing Parsimonious Deep Learning Weather Prediction using the HEALPix Mesh | Code | 4
NExT-GPT: Any-to-Any Multimodal LLM | Code | 4
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs | Code | 4
Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain | Code | 4
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese | Code | 4
Cognitive Architectures for Language Agents | Code | 4
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior | Code | 4
Prompt2Model: Generating Deployable Models from Natural Language Instructions | Code | 4
A Survey on Large Language Model based Autonomous Agents | Code | 4
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors | Code | 4
GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning | Code | 4
ChatHaruhi: Reviving Anime Character in Reality via Large Language Model | Code | 4
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis | Code | 4
Graph of Thoughts: Solving Elaborate Problems with Large Language Models | Code | 4
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Code | 4
OpenProteinSet: Training data for structural biology at scale | Code | 4
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models | Code | 4
AgentBench: Evaluating LLMs as Agents | Code | 4
TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment | Code | 4
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models | Code | 4
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion | Code | 4
LISA: Reasoning Segmentation via Large Language Model | Code | 4
Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models | Code | 4
Effective Whole-body Pose Estimation with Two-stages Distillation | Code | 4
Universal and Transferable Adversarial Attacks on Aligned Language Models | Code | 4
Guaranteed Approximation Bounds for Mixed-Precision Neural Operators | Code | 4
Turning Whisper into Real-Time Transcription System | Code | 4
Page 41 of 13,200