SOTAVerified

Attribute

Papers

Showing 1–50 of 5387 papers

Title | Status | Hype
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens | Code | 11
Yi: Open Foundation Models by 01.AI | Code | 9
aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Processing | Code | 7
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following | Code | 5
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Code | 5
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations | Code | 5
Learning Flow Fields in Attention for Controllable Person Image Generation | Code | 5
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | Code | 4
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement | Code | 4
Zero-shot forecasting of chaotic systems | Code | 4
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation | Code | 4
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation | Code | 4
Deepfake Generation and Detection: A Benchmark and Survey | Code | 4
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis | Code | 4
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations | Code | 4
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis | Code | 4
KTO: Model Alignment as Prospect Theoretic Optimization | Code | 4
HelpSteer2: Open-source dataset for training top-performing reward models | Code | 4
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation | Code | 4
DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection | Code | 4
A Closer Look at Deep Learning Methods on Tabular Datasets | Code | 4
ChangeMamba: Remote Sensing Change Detection With Spatiotemporal State Space Model | Code | 4
Sparse Tensor-based Point Cloud Attribute Compression | Code | 4
GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting | Code | 3
FaceXFormer: A Unified Transformer for Facial Analysis | Code | 3
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing | Code | 3
Taming 3DGS: High-Quality Radiance Fields with Limited Resources | Code | 3
Face Anonymization Made Simple | Code | 3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | Code | 3
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Code | 3
Scaling Diffusion Transformers to 16 Billion Parameters | Code | 3
RemoteSAM: Towards Segment Anything for Earth Observation | Code | 3
Semi-supervised Credit Card Fraud Detection via Attribute-Driven Graph Representation | Code | 3
Delay-penalized CTC implemented based on Finite State Transducer | Code | 3
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation | Code | 3
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models | Code | 3
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer | Code | 3
DisCo: Disentangled Control for Realistic Human Dance Generation | Code | 3
ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning | Code | 3
Large Spatial Model: End-to-end Unposed Images to Semantic 3D | Code | 3
Bird-Eye Transformers for Text Generation Models | Code | 3
A Comprehensive Survey on Composed Image Retrieval | Code | 3
Harnessing the Universal Geometry of Embeddings | Code | 3
Controllable Text Generation for Large Language Models: A Survey | Code | 3
HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression | Code | 3
ATPrompt: Textual Prompt Learning with Embedded Attributes | Code | 3
HAC++: Towards 100X Compression of 3D Gaussian Splatting | Code | 3
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation | Code | 3
Faceptor: A Generalist Model for Face Perception | Code | 2
GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment | Code | 2
