SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 5401-5450 of 661,570 papers

Title | Status | Hype
MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models | Code | 2
A Tutorial on Structural Identifiability of Epidemic Models Using StructuralIdentifiability.jl | Code | 2
AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection | Code | 2
VRSplat: Fast and Robust Gaussian Splatting for Virtual Reality | Code | 2
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models | Code | 2
MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning | Code | 2
Recent Advances in Medical Imaging Segmentation: A Survey | Code | 2
Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt | Code | 2
Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation | Code | 2
WavReward: Spoken Dialogue Models With Generalist Reward Evaluators | Code | 2
Reproducibility Study of "Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents" | Code | 2
BAT: Benchmark for Auto-bidding Task | Code | 2
Behind Maya: Building a Multilingual Vision Language Model | Code | 2
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement | Code | 2
CodePDE: An Inference Framework for LLM-driven PDE Solver Generation | Code | 2
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation | Code | 2
Unified Continuous Generative Models | Code | 2
Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs | Code | 2
SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models | Code | 2
Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud Anomaly Detection | Code | 2
Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent | Code | 2
Adaptive Latent-Space Constraints in Personalized FL | Code | 2
Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | Code | 2
LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning | Code | 2
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | Code | 2
Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule | Code | 2
YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models | Code | 2
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance | Code | 2
Text-to-CadQuery: A New Paradigm for CAD Generation with Scalable Large Model Capabilities | Code | 2
ReplayCAD: Generative Diffusion Replay for Continual Anomaly Detection | Code | 2
Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA | Code | 2
Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation | Code | 2
InstanceGen: Image Generation with Instance-level Instructions | Code | 2
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging | Code | 2
Foam-Agent: Towards Automated Intelligent CFD Workflows | Code | 2
SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation | Code | 2
StabStitch++: Unsupervised Online Video Stitching with Spatiotemporal Bidirectional Warps | Code | 2
Diffusion Model Quantization: A Review | Code | 2
TetWeave: Isosurface Extraction using On-The-Fly Delaunay Tetrahedral Grids for Gradient-Based Mesh Optimization | Code | 2
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World | Code | 2
Steerable Scene Generation with Post Training and Inference-Time Search | Code | 2
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning | Code | 2
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation | Code | 2
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration | Code | 2
Non-stationary Diffusion For Probabilistic Time Series Forecasting | Code | 2
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | Code | 2
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization | Code | 2
Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation | Code | 2
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing | Code | 2
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models | Code | 2
Page 109 of 13,232