SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6651-6700 of 177340 papers

Title | Status | Hype
SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs | Code | 2
High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net | Code | 2
Color Shift Estimation-and-Correction for Image Enhancement | Code | 2
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | Code | 2
Dirichlet Flow Matching with Applications to DNA Sequence Design | Code | 2
ViewFusion: Towards Multi-View Consistency via Interpolated Denoising | Code | 2
M3: 3D-Spatial MultiModal Memory | Code | 2
Sparse Instance Activation for Real-Time Instance Segmentation | Code | 2
Transformers are Sample-Efficient World Models | Code | 2
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era | Code | 2
A Judge-free LLM Open-ended Generation Benchmark Based on the Distributional Hypothesis | Code | 2
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition | Code | 2
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant | Code | 2
Fourier Neural Operator for Parametric Partial Differential Equations | Code | 2
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models | Code | 2
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning | Code | 2
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | Code | 2
GraphMAE: Self-Supervised Masked Graph Autoencoders | Code | 2
PET-MAD, a universal interatomic potential for advanced materials modeling | Code | 2
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning | Code | 2
BTS: Building Timeseries Dataset: Empowering Large-Scale Building Analytics | Code | 2
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses | Code | 2
Source-Free Domain Adaptation with Frozen Multimodal Foundation Model | Code | 2
CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers | Code | 2
TimeLMs: Diachronic Language Models from Twitter | Code | 2
string2string: A Modern Python Library for String-to-String Algorithms | Code | 2
Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite | Code | 2
Spectrally Pruned Gaussian Fields with Neural Compensation | Code | 2
BIG-Bench Extra Hard | Code | 2
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective | Code | 2
Chain of Hindsight Aligns Language Models with Feedback | Code | 2
MiraGe: Editable 2D Images using Gaussian Splatting | Code | 2
Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends | Code | 2
Vision-aided UAV navigation and dynamic obstacle avoidance using gradient-based B-spline trajectory optimization | Code | 2
Deep learning-driven pulmonary artery and vein segmentation reveals demography-associated vasculature anatomical differences | Code | 2
A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion | Code | 2
The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models | Code | 2
Spiking Diffusion Models | Code | 2
Putting People in their Place: Monocular Regression of 3D People in Depth | Code | 2
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance | Code | 2
PnLCalib: Sports Field Registration via Points and Lines Optimization | Code | 2
XHand: Real-time Expressive Hand Avatar | Code | 2
FedGraph: A Research Library and Benchmark for Federated Graph Learning | Code | 2
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation | Code | 2
ZenSVI: An Open-Source Software for the Integrated Acquisition, Processing and Analysis of Street View Imagery Towards Scalable Urban Science | Code | 2
ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design | Code | 2
Editing Models with Task Arithmetic | Code | 2
Learning Video Representations from Large Language Models | Code | 2
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives | Code | 2
Model-free quantification of completeness, uncertainties, and outliers in atomistic machine learning using information theory | Code | 2
Page 134 of 3547