SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6951–7000 of 661,570 papers

Title | Status | Hype
Towards Generative Ray Path Sampling for Faster Point-to-Point Ray Tracing | Code | 2
LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators | Code | 2
Language Models can Self-Lengthen to Generate Long Texts | Code | 2
The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains | Code | 2
Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs | Code | 2
ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images | Code | 2
EgoMimic: Scaling Imitation Learning via Egocentric Video | Code | 2
RSL-SQL: Robust Schema Linking in Text-to-SQL Generation | Code | 2
GPT or BERT: why not both? | Code | 2
Ada-MSHyper: Adaptive Multi-Scale Hypergraph Transformer for Time Series Forecasting | Code | 2
VecCity: A Taxonomy-guided Library for Map Entity Representation Learning | Code | 2
End-to-End Ontology Learning with Large Language Models | Code | 2
What is Wrong with Perplexity for Long-context Language Modeling? | Code | 2
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models | Code | 2
SciPIP: An LLM-based Scientific Paper Idea Proposer | Code | 2
Multi-Agent Large Language Models for Conversational Task-Solving | Code | 2
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks | Code | 2
Controlling Language and Diffusion Models by Transporting Activations | Code | 2
Consistency Diffusion Bridge Models | Code | 2
CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation | Code | 2
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation | Code | 2
MassSpecGym: A benchmark for the discovery and identification of molecules | Code | 2
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources | Code | 2
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis | Code | 2
EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models | Code | 2
Multi-Programming Language Sandbox for LLMs | Code | 2
Very fast Bayesian Additive Regression Trees on GPU | Code | 2
CHORDONOMICON: A Dataset of 666,000 Songs and their Chord Progressions | Code | 2
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications | Code | 2
PC-Gym: Benchmark Environments For Process Control Problems | Code | 2
Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | Code | 2
Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation | Code | 2
ET-Flow: Equivariant Flow-Matching for Molecular Conformer Generation | Code | 2
A Survey on RGB, 3D, and Multimodal Approaches for Unsupervised Industrial Anomaly Detection | Code | 2
ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting | Code | 2
AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts | Code | 2
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance | Code | 2
RecFlow: An Industrial Full Flow Recommendation Dataset | Code | 2
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Code | 2
Domain Adaptation with a Single Vision-Language Embedding | Code | 2
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning | Code | 2
Skinned Motion Retargeting with Dense Geometric Interaction Perception | Code | 2
ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings | Code | 2
Fast Calibrated Explanations: Efficient and Uncertainty-Aware Explanations for Machine Learning Models | Code | 2
Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks | Code | 2
BSD: a Bayesian framework for parametric models of neural spectra | Code | 2
Trajectory Flow Matching with Applications to Clinical Time Series Modeling | Code | 2
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders | Code | 2
Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval | Code | 2
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models | Code | 2
Page 140 of 13232