SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 15801–15850 of 474278 papers

Title | Status | Hype
ViTaSCOPE: Visuo-tactile Implicit Representation for In-hand Pose and Extrinsic Contact Estimation | | 0
Evaluating Sensitivity Parameters in Smartphone-Based Gaze Estimation: A Comparative Study of Appearance-Based and Infrared Eye Trackers | | 0
MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space | | 0
Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs | | 0
A Fast, Reliable, and Secure Programming Language for LLM Agents with Code Actions | | 0
Today's Cat Is Tomorrow's Dog: Accounting for Time-Based Changes in the Labels of ML Vulnerability Detection Approaches | | 0
Bias Amplification in RAG: Poisoning Knowledge Retrieval to Steer LLMs | | 0
Byzantine Outside, Curious Inside: Reconstructing Data Through Malicious Updates | | 0
Large Language Models for History, Philosophy, and Sociology of Science: Interpretive Uses, Methodological Challenges, and Critical Perspectives | | 0
Bias and Identifiability in the Bounded Confidence Model | | 0
AgentSense: Virtual Sensor Data Generation Using LLM Agents in Simulated Home Environments | | 0
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs | | 0
Large Language Model-Powered Conversational Agent Delivering Problem-Solving Therapy (PST) for Family Caregivers: Enhancing Empathy and Therapeutic Alliance Using In-Context Learning | | 0
Deep Learning-based mmWave MIMO Channel Estimation using sub-6 GHz Channel Information: CNN and UNet Approaches | | 0
Investigating the Potential of Large Language Model-Based Router Multi-Agent Architectures for Foundation Design Automation: A Task Classification and Expert Selection Study | | 0
Teleoperated Driving: a New Challenge for 3D Object Detection in Compressed Point Clouds | | 0
SPLATART: Articulated Gaussian Splatting with Estimated Object Structure | | 0
SAIL: Faster-than-Demonstration Execution of Imitation Learning Policies | | 0
Enabling automatic transcription of child-centered audio recordings from real-world environments | | 0
(SimPhon Speech Test): A Data-Driven Method for In Silico Design and Validation of a Phonetically Balanced Speech Test | | 0
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis | Code | 2
Quizzard@INOVA Challenge 2025 -- Track A: Plug-and-Play Technique in Interleaved Multi-Image Model | Code | 0
Prioritizing Alignment Paradigms over Task-Specific Model Customization in Time-Series LLMs | Code | 0
From Emergence to Control: Probing and Modulating Self-Reflection in Language Models | Code | 0
Diffusion-Based Electrocardiography Noise Quantification via Anomaly Detection | Code | 1
PRO-V: An Efficient Program Generation Multi-Agent System for Automatic RTL Verification | Code | 1
Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning with Heterogeneous LoRA Allocation | Code | 0
ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing | Code | 1
Interaction, Process, Infrastructure: A Unified Architecture for Human-Agent Collaboration | | 0
LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation | | 0
CLIP the Landscape: Automated Tagging of Crowdsourced Landscape Images | Code | 0
Enter: Graduated Realism: A Pedagogical Framework for AI-Powered Avatars in Virtual Reality Teacher Training | | 0
Dr. GPT Will See You Now, but Should It? Exploring the Benefits and Harms of Large Language Models in Medical Diagnosis using Crowdsourced Clinical Cases | | 0
Learning Encodings by Maximizing State Distinguishability: Variational Quantum Error Correction | Code | 0
Spectral Estimation with Free Decompression | Code | 0
Schema-R1: A reasoning training approach for schema linking in Text-to-SQL Task | Code | 1
On the performance of multi-fidelity and reduced-dimensional neural emulators for inference of physiologic boundary conditions | | 0
Efficient Multi-Camera Tokenization with Triplanes for End-to-End Driving | | 0
A correlation-permutation approach for speech-music encoders model merging | | 0
Fidelity Isn't Accuracy: When Linearly Decodable Functions Fail to Match the Ground Truth | Code | 0
CGVQM+D: Computer Graphics Video Quality Metric and Dataset | Code | 2
CLEAN-MI: A Scalable and Efficient Pipeline for Constructing High-Quality Neurodata in Motor Imagery Paradigm | | 0
FeNN: A RISC-V vector processor for Spiking Neural Network acceleration | | 0
Recursive KalmanNet: Deep Learning-Augmented Kalman Filtering for State Estimation with Consistent Uncertainty Quantification | Code | 1
FAA Framework: A Large Language Model-Based Approach for Credit Card Fraud Investigations | | 0
A Lightweight IDS for Early APT Detection Using a Novel Feature Selection Method | | 0
Data-driven approaches to inverse problems | | 0
FocalAD: Local Motion Planning for End-to-End Autonomous Driving | | 0
Deep Learning Model Acceleration and Optimization Strategies for Real-Time Recommendation Systems | | 0
EgoPrivacy: What Your First-Person Camera Says About You? | Code | 0
Page 317 of 9486