SOTAVerified

Language Modeling

Papers

Showing 1301–1350 of 14182 papers

Title | Status | Hype
Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models | Code | 1
LLaMo: Large Language Model-based Molecular Graph Assistant | Code | 1
Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Code | 1
Interpretable Language Modeling via Induction-head Ngram Models | Code | 1
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback | Code | 1
Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Code | 1
SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types | Code | 1
f-PO: Generalizing Preference Optimization with f-divergence Minimization | Code | 1
Long-context Protein Language Modeling Using Bidirectional Mamba with Shared Projection Layers | Code | 1
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Code | 1
TrajAgent: An Agent Framework for Unified Trajectory Modelling | Code | 1
LOGO -- Long cOntext aliGnment via efficient preference Optimization | Code | 1
GCoder: Improving Large Language Model for Generalized Graph Problem Solving | Code | 1
Cross-model Control: Improving Multiple Large Language Models in One-time Training | Code | 1
GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration | Code | 1
Automated Spinal MRI Labelling from Reports Using a Large Language Model | Code | 1
Scalable Influence and Fact Tracing for Large Language Model Pretraining | Code | 1
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Code | 1
Building A Coding Assistant via the Retrieval-Augmented Language Model | Code | 1
SeisLM: a Foundation Model for Seismic Waveforms | Code | 1
A Realistic Threat Model for Large Language Model Jailbreaks | Code | 1
Residual vector quantization for KV cache compression in large language model | Code | 1
M-RewardBench: Evaluating Reward Models in Multilingual Settings | Code | 1
Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning | Code | 1
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1
MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation | Code | 1
FIRE: Fact-checking with Iterative Retrieval and Verification | Code | 1
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems | Code | 1
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine | Code | 1
HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims | Code | 1
CREAM: Consistency Regularized Self-Rewarding Language Models | Code | 1
DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models | Code | 1
TopoLM: brain-like spatio-functional organization in a topographic language model | Code | 1
FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware | Code | 1
Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses | Code | 1
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Code | 1
EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs | Code | 1
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning | Code | 1
Retraining-Free Merging of Sparse MoE via Hierarchical Clustering | Code | 1
Zeroth-Order Fine-Tuning of LLMs in Random Subspaces | Code | 1
PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Code | 1
Parameter-Efficient Fine-Tuning of State Space Models | Code | 1
Do Unlearning Methods Remove Information from Language Model Weights? | Code | 1
Bilinear MLPs enable weight-based mechanistic interpretability | Code | 1
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining | Code | 1
AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning | Code | 1
OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting | Code | 1
AuditWen:An Open-Source Large Language Model for Audit | Code | 1
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning | Code | 1
Training-free Diffusion Model Alignment with Sampling Demons | Code | 1

No leaderboard results yet.