SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5601 to 5650 of 661,570 papers

Title | Status | Hype
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization | Code | 2
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction | Code | 2
PyTopo3D: A Python Framework for 3D SIMP-based Topology Optimization | Code | 2
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance | Code | 2
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention | Code | 2
Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Code | 2
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing | Code | 2
Human Activity Recognition using RGB-Event based Sensors: A Multi-modal Heat Conduction Model and A Benchmark Dataset | Code | 2
Holistic Fusion: Task- and Setup-Agnostic Robot Localization and State Estimation with Factor Graphs | Code | 2
HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Code | 2
InteractVLM: 3D Interaction Reasoning from 2D Foundational Models | Code | 2
Machine learning interatomic potential can infer electrical response | Code | 2
Gaussian Mixture Flow Matching Models | Code | 2
SlicerNNInteractive: A 3D Slicer extension for nnInteractive | Code | 2
Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors | Code | 2
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning | Code | 2
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting | Code | 2
Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models | Code | 2
Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance | Code | 2
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models | Code | 2
SEAL: Steerable Reasoning Calibration of Large Language Models for Free | Code | 2
Content-Aware Transformer for All-in-one Image Restoration | Code | 2
One Quantizer is Enough: Toward a Lightweight Audio Codec | Code | 2
MedM-VL: What Makes a Good Medical LVLM? | Code | 2
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding | Code | 2
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Code | 2
SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation | Code | 2
VocalNet: Speech LLM with Multi-Token Prediction for Faster and High-Quality Generation | Code | 2
MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation | Code | 2
Agentic Knowledgeable Self-awareness | Code | 2
RWKVTTS: Yet another TTS based on RWKV-7 | Code | 2
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation | Code | 2
Investigating Affective Use and Emotional Well-being on ChatGPT | Code | 2
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme | Code | 2
MegaMath: Pushing the Limits of Open Math Corpora | Code | 2
Re-thinking Temporal Search for Long-Form Video Understanding | Code | 2
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models | Code | 2
Exploration-Driven Generative Interactive Environments | Code | 2
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration | Code | 2
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning | Code | 2
CrystalFormer-RL: Reinforcement Fine-Tuning for Materials Design | Code | 2
Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation | Code | 2
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing | Code | 2
ZClip: Adaptive Spike Mitigation for LLM Pre-Training | Code | 2
Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery | Code | 2
Benchmarking Synthetic Tabular Data: A Multi-Dimensional Evaluation Framework | Code | 2
Scene-Centric Unsupervised Panoptic Segmentation | Code | 2
MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits | Code | 2
AI-Newton: A Concept-Driven Physical Law Discovery System without Prior Physical Knowledge | Code | 2
SpaceR: Reinforcing MLLMs in Video Spatial Reasoning | Code | 2