SOTAVerified

2k

Papers

Showing 1-50 of 288 papers

Title | Status | Hype
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | Code | 2
Understanding and Improving Length Generalization in Recurrent Models | - | 0
A strengthened bound on the number of states required to characterize maximum parsimony distance | - | 0
Structured Variational D-Decomposition for Accurate and Stable Low-Rank Approximation | - | 0
Latent Wavelet Diffusion: Enabling 4K Image Synthesis for Free | - | 0
Tradeoffs between Mistakes and ERM Oracle Calls in Online and Transductive Online Learning | - | 0
Test-Time Training Done Right | - | 0
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models | Code | 1
MMP-2K: A Benchmark Multi-Labeled Macro Photography Image Quality Assessment Database | Code | 1
Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions | Code | 1
PIIvot: A Lightweight NLP Anonymization Framework for Question-Anchored Tutoring Dialogues | - | 0
Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning | - | 0
UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning | - | 0
ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation | Code | 0
Calibrating Translation Decoding with Quality Estimation on LLMs | Code | 0
aiXamine: Simplified LLM Safety and Security | - | 0
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis | - | 0
Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading | - | 0
On Linear Representations and Pretraining Data Frequency in Language Models | - | 0
Seedream 3.0 Technical Report | - | 0
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration | - | 0
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution | Code | 3
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning | Code | 2
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes | Code | 2
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers | - | 0
Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior | - | 0
Ultra-Resolution Adaptation with Ease | Code | 2
REPA: Russian Error Types Annotation for Evaluating Text Generation and Judgment Capabilities | - | 0
Evaluating the Suitability of Different Intraoral Scan Resolutions for Deep Learning-Based Tooth Segmentation | - | 0
Stackelberg Game Preference Optimization for Data-Efficient Alignment of Language Models | - | 0
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks | - | 0
Exact Recovery of Sparse Binary Vectors from Generalized Linear Measurements | - | 0
Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning | - | 0
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction | Code | 3
Improved Regret in Stochastic Decision-Theoretic Online Learning under Differential Privacy | - | 0
CascadeV: An Implementation of Wurstchen Architecture for Video Generation | Code | 1
Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains | - | 0
TimeLogic: A Temporal Logic Benchmark for Video QA | - | 0
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation | - | 0
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Code | 5
Toward Corpus Size Requirements for Training and Evaluating Depression Risk Models Using Spoken Language | - | 0
Social-LLaVA: Enhancing Robot Navigation through Human-Language Reasoning in Social Spaces | - | 0
Multimodal Preference Data Synthetic Alignment with Reward Model | Code | 0
AnalogXpert: Automating Analog Topology Synthesis by Incorporating Circuit Design Expertise into Large Language Models | - | 0
Block-Based Multi-Scale Image Rescaling | - | 0
Do Large Language Models Show Biases in Causal Learning? | - | 0
Elevating Flow-Guided Video Inpainting with Reference Generation | Code | 2
MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects | - | 0
Lightweight Multiplane Images Network for Real-Time Stereoscopic Conversion from Planar Video | - | 0

No leaderboard results yet.