SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10101 to 10150 of 661570 papers

Title | Status | Hype
Federated Learning with New Knowledge: Fundamentals, Advances, and Futures | Code | 2
Cross-view Masked Diffusion Transformers for Person Image Synthesis | Code | 2
DeepAAT: Deep Automated Aerial Triangulation for Fast UAV-based Mapping | Code | 2
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram | Code | 2
A Single Simple Patch is All You Need for AI-generated Image Detection | Code | 2
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training? | Code | 2
Improving Sequential Recommendations with LLMs | Code | 2
LitLLM: A Toolkit for Scientific Literature Review | Code | 2
TrustAgent: Towards Safe and Trustworthy LLM-based Agents | Code | 2
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback | Code | 2
Efficient and Effective Time-Series Forecasting with Spiking Neural Networks | Code | 2
InfMAE: A Foundation Model in the Infrared Modality | Code | 2
EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models | Code | 2
Towards Efficient Exact Optimization of Language Model Alignment | Code | 2
A Survey on Hallucination in Large Vision-Language Models | Code | 2
Graph Domain Adaptation: Challenges, Progress and Prospects | Code | 2
Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents | Code | 2
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning | Code | 2
On the Challenges of Fuzzing Techniques via Large Language Models | Code | 2
CapHuman: Capture Your Moments in Parallel Universes | Code | 2
Developing A Multi-Agent and Self-Adaptive Framework with Deep Reinforcement Learning for Dynamic Portfolio Risk Management | Code | 2
PAM: Prompting Audio-Language Models for Audio Quality Assessment | Code | 2
CF4J: Collaborative Filtering for Java | Code | 2
Improved Scene Landmark Detection for Camera Localization | Code | 2
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning | Code | 2
On Prompt-Driven Safeguarding for Large Language Models | Code | 2
SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks | Code | 2
Fin-GAN: forecasting and classifying financial time series via generative adversarial networks | Code | 2
AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error | Code | 2
ControlCap: Controllable Region-level Captioning | Code | 2
Local Feature Matching Using Deep Learning: A Survey | Code | 2
LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement | Code | 2
SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition | Code | 2
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval | Code | 2
EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks | Code | 2
Rethinking Channel Dependence for Multivariate Time Series Forecasting: Learning from Leading Indicators | Code | 2
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain | Code | 2
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens | Code | 2
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese | Code | 2
Weak-to-Strong Jailbreaking on Large Language Models | Code | 2
Finetuning Large Language Models for Vulnerability Detection | Code | 2
Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | Code | 2
Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models | Code | 2
Multi-granularity Correspondence Learning from Long-term Noisy Videos | Code | 2
LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation | Code | 2
An Open Software Suite for Event-Based Video | Code | 2
MF-MOS: A Motion-Focused Model for Moving Object Segmentation | Code | 2
MouSi: Poly-Visual-Expert Vision-Language Models | Code | 2
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios | Code | 2
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models | Code | 2
Page 203 of 13232