SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 13001-13050 of 474,278 papers

Title | Status | Hype
PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization | Code | 2
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management | Code | 2
MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL | Code | 2
Benchmarking Agentic Workflow Generation | Code | 2
Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following | Code | 2
Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services | Code | 2
Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Code | 2
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model | Code | 2
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens | Code | 2
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | Code | 2
PointLLM: Empowering Large Language Models to Understand Point Clouds | Code | 2
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors | Code | 2
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review | Code | 2
InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation | Code | 2
Palu: Compressing KV-Cache with Low-Rank Projection | Code | 2
RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search | Code | 2
Image Segmentation in Foundation Model Era: A Survey | Code | 2
Monocular Obstacle Avoidance Based on Inverse PPO for Fixed-wing UAVs | Code | 2
SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction | Code | 2
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning | Code | 2
Generative AI for Cel-Animation: A Survey | Code | 2
AutoPatent: A Multi-Agent Framework for Automatic Patent Generation | Code | 2
ASCNet: Asymmetric Sampling Correction Network for Infrared Image Destriping | Code | 2
Torchattacks: A PyTorch Repository for Adversarial Attacks | Code | 2
Scale-Aware Modulation Meet Transformer | Code | 2
Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models | Code | 2
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding | Code | 2
Let LLMs Break Free from Overthinking via Self-Braking Tuning | Code | 2
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks | Code | 2
Towards Automatic Power Battery Detection: New Challenge, Benchmark Dataset and Baseline | Code | 2
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning | Code | 2
Mask3D: Mask Transformer for 3D Semantic Instance Segmentation | Code | 2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models | Code | 2
A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space | Code | 2
MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits | Code | 2
Test-Time Zero-Shot Temporal Action Localization | Code | 2
Equivariant Diffusion for Molecule Generation in 3D | Code | 2
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Code | 2
Neural Fields in Visual Computing and Beyond | Code | 2
RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar | Code | 2
HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning | Code | 2
UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning | Code | 2
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Code | 2
Fine-Grained Prototypes Distillation for Few-Shot Object Detection | Code | 2
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark | Code | 2
Generative Image Dynamics | Code | 2
Reasoning Language Models: A Blueprint | Code | 2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding | Code | 2
Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions | Code | 2
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training | Code | 2
Page 261 of 9486