SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17101–17150 of 474278 papers

Title | Status | Hype
Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models | | 0
FairDICE: Fairness-Driven Offline Multi-Objective Reinforcement Learning | | 0
Domain Switching on the Pareto Front: Multi-Objective Deep Kernel Learning in Automated Piezoresponse Force Microscopy | | 0
Unable to Forget: Proactive Interference Reveals Working Memory Limits in LLMs Beyond Context Length | | 0
A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation | | 0
Scaling Laws of Motion Forecasting and Planning -- A Technical Report | | 0
Seeing Voices: Generating A-Roll Video from Audio with Mirage | | 0
FedGA-Tree: Federated Decision Tree using Genetic Algorithm | | 0
Guideline Forest: Experience-Induced Multi-Guideline Reasoning with Stepwise Aggregation | | 0
ArchiLense: A Framework for Quantitative Analysis of Architectural Styles Based on Vision Large Language Models | | 0
Conservative Bias in Large Language Models: Measuring Relation Predictions | | 0
QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA | | 0
EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments | | 0
LLM-BT-Terms: Back-Translation as a Framework for Terminology Standardization and Dynamic Semantic Embedding | | 0
Bingo: Boosting Efficient Reasoning of LLMs via Dynamic and Significance-based Reinforcement Learning | | 0
Accelerating Spectral Clustering under Fairness Constraints | | 0
A Machine Learning Approach to Generate Residual Stress Distributions using Sparse Characterization Data in Friction-Stir Processed Parts | | 0
The Impact of Feature Scaling In Machine Learning: Effects on Regression and Classification Tasks | | 0
Mondrian: Transformer Operators via Domain Decomposition | | 0
Interpreting Agent Behaviors in Reinforcement-Learning-Based Cyber-Battle Simulation Platforms | | 0
Learning-Based Multiuser Scheduling in MIMO-OFDM Systems with Hybrid Beamforming | | 0
Generative Learning of Differentiable Object Models for Compositional Interpretation of Complex Scenes | | 0
Using Satellite Images And Self-supervised Machine Learning Networks To Detect Water Hidden Under Vegetation | | 0
Open World Scene Graph Generation using Vision Language Models | Code | 2
Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions | Code | 1
CheMatAgent: Enhancing LLMs for Chemistry and Materials Science through Tree-Search Based Tool Learning | Code | 1
Highly Compressed Tokenizer Can Generate Without Training | Code | 3
MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization | Code | 0
ETT-CKGE: Efficient Task-driven Tokens for Continual Knowledge Graph Embedding | Code | 0
Multilingual Hate Speech Detection in Social Media Using Translation-Based Approaches with Large Language Models | | 0
Surgeon Style Fingerprinting and Privacy Risk Quantification via Discrete Diffusion Models in a Vision-Language-Action Framework | Code | 0
Info-Coevolution: An Efficient Framework for Data Model Coevolution | Code | 0
Automatic Generation of Inference Making Questions for Reading Comprehension Assessments | Code | 0
Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction | Code | 2
Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions | Code | 0
CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems | Code | 0
From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information? | Code | 1
Instruction-Tuned Video-Audio Models Elucidate Functional Specialization in the Brain | Code | 0
Surgeons Awareness, Expectations, and Involvement with Artificial Intelligence: a Survey Pre and Post the GPT Era | | 0
MoE-GPS: Guidlines for Prediction Strategy for Dynamic Expert Duplication in MoE Load Balancing | | 0
Sparse Interpretable Deep Learning with LIES Networks for Symbolic Regression | Code | 0
Deep reinforcement learning for near-deterministic preparation of cubic- and quartic-phase gates in photonic quantum computing | | 0
Cognitive Weave: Synthesizing Abstracted Knowledge with a Spatio-Temporal Resonance Graph | Code | 0
SOP-Bench: Complex Industrial SOPs for Evaluating LLM Agents | | 0
GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition | Code | 0
Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles | | 0
Benchmarking Pre-Trained Time Series Models for Electricity Price Forecasting | | 0
IGraSS: Learning to Identify Infrastructure Networks from Satellite Imagery by Iterative Graph-constrained Semantic Segmentation | | 0
Ego-centric Learning of Communicative World Models for Autonomous Driving | | 0
SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense | | 0