SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6276–6300 of 474278 papers

Title | Status | Hype
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge | Code | 2
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting | Code | 2
NUDT4MSTAR: A Large Dataset and Benchmark Towards Remote Sensing Object Recognition in the Wild | Code | 2
Parameter-Efficient Fine-Tuning for Foundation Models | Code | 2
YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID | Code | 2
GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing | Code | 2
An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks | Code | 2
Tensor-Var: Variational Data Assimilation in Tensor Product Feature Space | Code | 2
GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting | Code | 2
Querying Databases with Function Calling | Code | 2
PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection | Code | 2
TimeFilter: Patch-Specific Spatial-Temporal Graph Filtration for Time Series Forecasting | Code | 2
Distillation Quantification for Large Language Models | Code | 2
Towards Robust Multi-tab Website Fingerprinting | Code | 2
GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting | Code | 2
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback | Code | 2
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning | Code | 2
A Survey on Multimodal Recommender Systems: Recent Advances and Future Directions | Code | 2
Supervised Learning for Analog and RF Circuit Design: Benchmarks and Comparative Insights | Code | 2
MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking | Code | 2
Automating High Quality RT Planning at Scale | Code | 2
Episodic Memories Generation and Evaluation Benchmark for Large Language Models | Code | 2
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents | Code | 2
Exploring Temporally-Aware Features for Point Tracking | Code | 2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding | Code | 2
Page 252 of 18972