SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 1576–1600 of 661,570 papers

Title | Status | Hype
On the limits of agency in agent-based models | Code | 4
Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval | Code | 4
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale | Code | 4
GeoCalib: Learning Single-image Calibration with Geometric Optimization | Code | 4
RealisDance: Equip controllable character animation with realistic hands | Code | 4
One-Shot Diffusion Mimicker for Handwritten Text Generation | Code | 4
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation | Code | 4
xLAM: A Family of Large Action Models to Empower AI Agent Systems | Code | 4
iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models | Code | 4
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA | Code | 4
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark | Code | 4
Large Language Model-Based Agents for Software Engineering: A Survey | Code | 4
OLMoE: Open Mixture-of-Experts Language Models | Code | 4
IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching | Code | 4
Diffusion Policy Policy Optimization | Code | 4
CrisperWhisper: Accurate Timestamps on Verbatim Speech Transcriptions | Code | 4
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders | Code | 4
MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer | Code | 4
Text2SQL is Not Enough: Unifying AI and Databases with TAG | Code | 4
Relationships are Complicated! An Analysis of Relationships Between Datasets on the Web | Code | 4
EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Code | 4
SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition | Code | 4
RUMI: Rummaging Using Mutual Information | Code | 4
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance | Code | 4
FuseChat: Knowledge Fusion of Chat Models | Code | 4