SOTAVerified

Large Language Model

Papers

Showing 1026-1050 of 6097 papers

Title | Status | Hype
LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts | Code | 1
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models | Code | 1
LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research | Code | 1
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Code | 1
LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models | Code | 1
AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Ten Modalities via Language as a Reference Framework | Code | 1
LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins | Code | 1
Adaptive KalmanNet: Data-Driven Kalman Filter with Fast Adaptation | Code | 1
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Code | 1
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer | Code | 1
LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis | Code | 1
LLMDet: A Third Party Large Language Models Generated Text Detection Tool | Code | 1
LLM experiments with simulation: Large Language Model Multi-Agent System for Simulation Model Parametrization in Digital Twins | Code | 1
DOMINO: A Dual-System for Multi-step Visual Language Reasoning | Code | 1
Automated Spinal MRI Labelling from Reports Using a Large Language Model | Code | 1
LLMBind: A Unified Modality-Task Integration Framework | Code | 1
Do Large Language Model Benchmarks Test Reliability? | Code | 1
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Code | 1
A Benchmark for Generalizing Across Diverse Team Strategies in Competitive Pokémon | Code | 1
LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools and Self-Explanations | Code | 1
Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents | Code | 1
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Code | 1
DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints | Code | 1
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators | Code | 1
LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations | Code | 1
