SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3651-3700 of 661,570 papers

Title | Status | Hype
Impact of architecture on robustness and interpretability of multispectral deep neural networks | Code | 3
Are Language Models Actually Useful for Time Series Forecasting? | Code | 3
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning | Code | 3
Activating More Pixels in Image Super-Resolution Transformer | Code | 3
The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results | Code | 3
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system | Code | 3
The Manga Whisperer: Automatically Generating Transcriptions for Comics | Code | 3
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis | Code | 3
Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives | Code | 3
Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation | Code | 3
Deep Neural Networks for Rank-Consistent Ordinal Regression Based On Conditional Probabilities | Code | 3
Channel Permutations for N:M Sparsity | Code | 3
PP-MSVSR: Multi-Stage Video Super-Resolution | Code | 3
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning | Code | 3
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer | Code | 3
Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation | Code | 3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | Code | 3
Deep Learning for Trajectory Data Management and Mining: A Survey and Beyond | Code | 3
DeepCAVE: An Interactive Analysis Tool for Automated Machine Learning | Code | 3
Plotly-Resampler: Effective Visual Analytics for Large Time Series | Code | 3
MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making | Code | 3
The Common Core Ontologies | Code | 3
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks | Code | 3
PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models | Code | 3
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents | Code | 3
SEED-Bench: Benchmarking Multimodal Large Language Models | Code | 3
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Code | 3
Reasoning with Language Model Prompting: A Survey | Code | 3
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Code | 3
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Code | 3
ThoughtSource: A central hub for large language model reasoning data | Code | 3
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion | Code | 3
Foundation Models for Music: A Survey | Code | 3
Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms | Code | 3
GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting | Code | 3
Visual Causal Scene Refinement for Video Question Answering | Code | 3
TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting | Code | 3
Caravan MultiMet: Extending Caravan with Multiple Weather Nowcasts and Forecasts | Code | 3
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models | Code | 3
Conceptual Framework for Autonomous Cognitive Entities | Code | 3
NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration | Code | 3
One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization | Code | 3
Sequential Modeling Enables Scalable Learning for Large Vision Models | Code | 3
UniGS: Unified Representation for Image Generation and Segmentation | Code | 3
Physical Symbolic Optimization | Code | 3
XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library | Code | 3
Universal Time-Series Representation Learning: A Survey | Code | 3
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent | Code | 3
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment | Code | 3
Marabou 2.0: A Versatile Formal Analyzer of Neural Networks | Code | 3