SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 15201-15250 of 474,278 papers

Title | Status | Hype
A Systematic Replicability and Comparative Study of BSARec and SASRec for Sequential Recommendation | | 0
Vela: Scalable Embeddings with Voice Large Language Models for Multimodal Retrieval | | 0
Call To Speak To Someone At Expedia Through Various Contact Options: The Ultimate Step Guide | | 0
Light Aircraft Game : Basic Implementation and training results analysis | Code | 0
TriGuard: Testing Model Safety with Attribution Entropy, Verification, and Drift | Code | 0
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models | Code | 0
Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding | | 0
LexiMark: Robust Watermarking via Lexical Substitutions to Enhance Membership Verification of an LLM's Textual Training Data | Code | 0
Accurate and scalable exchange-correlation with deep learning | | 0
Doppelganger Method: Breaking Role Consistency in LLM Agent via Prompt-based Transferable Adversarial Attack | | 0
M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset | | 0
Uncertainty-Driven Radar-Inertial Fusion for Instantaneous 3D Ego-Velocity Estimation | | 0
Fair for a few: Improving Fairness in Doubly Imbalanced Datasets | | 0
How to Speak to a Real Person at Singapore Airlines®: 15 Easy Methods Explained | | 0
Call To Speak To Someone At Frontier™️ Airlines Through Various Contact Options: The Ultimate Step Guide | | 0
Acoustic scattering AI for non-invasive object classifications: A case study on hair assessment | | 0
Steering Robots with Inference-Time Interactions | | 0
Exploring Speaker Diarization with Mixture of Experts | | 0
RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge | Code | 1
VideoMAR: Autoregressive Video Generation with Continuous Tokens | | 0
Convergence-Privacy-Fairness Trade-Off in Personalized Federated Learning | | 0
Human-Centered Editable Speech-to-Sign-Language Generation via Streaming Conformer-Transformer and Resampling Hook | | 0
Image Segmentation with Large Language Models: A Survey with Perspectives for Intelligent Transportation Systems | | 0
RadFabric: Agentic AI System with Reasoning Capability for Radiology | | 0
Expressive Score-Based Priors for Distribution Matching with Geometry-Preserving Regularization | Code | 0
Interpreting Biomedical VLMs on High-Imbalance Out-of-Distributions: An Insight into BiomedCLIP on Radiology | Code | 0
LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops | | 0
GAMORA: A Gesture Articulated Meta Operative Robotic Arm for Hazardous Material Handling in Containment-Level Environments | | 0
Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees | | 0
Sampling from Your Language Model One Byte at a Time | Code | 1
Comprehensive Verilog Design Problems: A Next-Generation Benchmark Dataset for Evaluating Large Language Models and Agents on RTL Design and Verification | Code | 2
A Variational Framework for Improving Naturalness in Generative Spoken Language Models | Code | 1
Navigating the growing field of research on AI for software testing -- the taxonomy for AI-augmented software testing and an ontology-driven literature survey | Code | 0
SceneAware: Scene-Constrained Pedestrian Trajectory Prediction with LLM-Guided Walkability | Code | 0
WISVA: Generative AI for 5G Network Optimization in Smart Warehouses | | 0
Dynamic Graph Condensation | | 0
LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation | Code | 2
Deep Learning-Based Multi-Object Tracking: A Comprehensive Survey from Foundations to State-of-the-Art | | 0
A Comprehensive Survey on Deep Learning Solutions for 3D Flood Mapping | | 0
Leveraging In-Context Learning for Language Model Agents | | 0
STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation | | 0
A Comprehensive Survey on Continual Learning in Generative Models | Code | 2
RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis | Code | 1
SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds | Code | 0
Seq2Bind Webserver for Decoding Binding Hotspots directly from Sequences using Fine-Tuned Protein Language Models | | 0
Performance Analysis of Communication Signals for Localization in Underwater Sensor Networks | | 0
Joint Spectrum Sensing and Resource Allocation for OFDMA-based Underwater Acoustic Communications | | 0
CBTOPE2: An improved method for predicting of conformational B-cell epitopes in an antigen from its primary sequence | | 0
Beyond Black Boxes: Enhancing Interpretability of Transformers Trained on Neural Data | | 0
BlastDiffusion: A Latent Diffusion Model for Generating Synthetic Embryo Images to Address Data Scarcity in In Vitro Fertilization | | 0
Page 305 of 9486