SOTAVerified

Language Modeling

Papers

Showing 1201-1225 of 14182 papers

Title | Status | Hype
MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens | Code | 1
LLMs Working in Harmony: A Survey on the Technological Aspects of Building Effective LLM-Based Multi Agent Systems | | 0
NeurIPS 2023 LLM Efficiency Fine-tuning Competition | | 0
SCE: Scalable Consistency Ensembles Make Blackbox Large Language Model Generation More Reliable | | 0
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More | | 0
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding | Code | 2
MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation | | 0
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing | Code | 3
MouseGPT: A Large-scale Vision-Language Model for Mouse Behavior Analysis | | 0
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model | | 0
TacticExpert: Spatial-Temporal Graph Language Model for Basketball Tactics | | 0
Hybrid Agents for Image Restoration | | 0
PRISM: Preference Refinement via Implicit Scene Modeling for 3D Vision-Language Preference-Based Reinforcement Learning | | 0
Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search | | 0
SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation | | 0
OR-LLM-Agent: Automating Modeling and Solving of Operations Research Optimization Problem with Reasoning Large Language Model | Code | 2
Toward a method for LLM-enabled Indoor Navigation | | 0
Leveraging Knowledge Graphs and LLMs for Context-Aware Messaging | | 0
Medical Large Language Model Benchmarks Should Prioritize Construct Validity | | 0
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs | | 0
Why LLMs Cannot Think and How to Fix It | | 0
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo | | 0
Token Weighting for Long-Range Language Modeling | Code | 0
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability | | 0
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Code | 4

No leaderboard results yet.