SOTAVerified

World Knowledge

Papers

Showing 51–100 of 818 papers

Title | Status | Hype
MeaCap: Memory-Augmented Zero-shot Image Captioning | Code | 2
WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition | Code | 2
GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering | Code | 2
Can AI Assistants Know What They Don't Know? | Code | 2
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs | Code | 2
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Code | 2
CapsFusion: Rethinking Image-Text Data at Scale | Code | 2
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | Code | 2
Grasp-Anything: Large-scale Grasp Dataset from Foundation Models | Code | 2
Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations | Code | 2
ExpeL: LLM Agents Are Experiential Learners | Code | 2
RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit | Code | 2
ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human | Code | 2
PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change | Code | 2
GreaseLM: Graph REASoning Enhanced Language Models for Question Answering | Code | 2
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents | Code | 2
Measuring Massive Multitask Language Understanding | Code | 2
Aligning AI With Shared Human Values | Code | 2
A Survey on Knowledge Graphs: Representation, Acquisition and Applications | Code | 2
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains | Code | 1
O^2-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering | Code | 1
WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion | Code | 1
F-ViTA: Foundation Model Guided Visible to Thermal Translation | Code | 1
Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors | Code | 1
Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training | Code | 1
FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification | Code | 1
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement | Code | 1
VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models | Code | 1
An Automatic Graph Construction Framework based on Large Language Models for Recommendation | Code | 1
Knowledge Editing through Chain-of-Thought | Code | 1
Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models | Code | 1
Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs | Code | 1
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token | Code | 1
Retrieval-Augmented Machine Translation with Unstructured Knowledge | Code | 1
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content | Code | 1
LLM Embeddings Improve Test-time Adaptation to Tabular Y|X-Shifts | Code | 1
CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models | Code | 1
Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Code | 1
Can OOD Object Detectors Learn from Foundation Models? | Code | 1
AgentMove: Predicting Human Mobility Anywhere Using Large Language Model based Agentic Framework | Code | 1
BLADE: Benchmarking Language Model Agents for Data-Driven Science | Code | 1
Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities | Code | 1
Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs | Code | 1
Large Scale Knowledge Washing | Code | 1
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models | Code | 1
Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models | Code | 1
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning | Code | 1
Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias | Code | 1
LEARN: Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application | Code | 1
A User-Centric Multi-Intent Benchmark for Evaluating Large Language Models | Code | 1

No leaderboard results yet.