SOTAVerified

Causal Language Modeling

Papers

Showing 1–50 of 52 papers

Title | Status | Hype
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | Code | 5
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | Code | 2
GPT or BERT: why not both? | Code | 2
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning | Code | 1
Video Pre-trained Transformer: A Multimodal Mixture of Pre-trained Experts | Code | 1
Interpretable Language Modeling via Induction-head Ngram Models | Code | 1
GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval | Code | 1
Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles | Code | 1
Self-Supervised Learning of Brain Dynamics from Broad Neuroimaging Data | Code | 1
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling | Code | 1
What's the Magic Word? A Control Theory of LLM Prompting | Code | 1
Wiki-40B: Multilingual Language Model Dataset | | 0
A Closer Look at Parameter Contributions When Training Neural Language and Translation Models | | 0
A Meta-Learning Perspective on Transformers for Causal Language Modeling | | 0
Towards the Anonymization of the Language Modeling | | 0
AntLM: Bridging Causal and Masked Language Models | | 0
A Simple, Yet Effective Approach to Finding Biases in Code Generation | | 0
Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling | | 0
ElastiFormer: Learned Redundancy Reduction in Transformer via Self-Distillation | | 0
Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning | | 0
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | | 0
IntenT5: Search Result Diversification using Causal Language Models | | 0
Multitask Finetuning for Improving Neural Machine Translation in Indian Languages | | 0
Multi-Task Learning for Situated Multi-Domain End-to-End Dialogue Systems | | 0
N-gram Prediction and Word Difference Representations for Language Modeling | | 0
NIFTY Financial News Headlines Dataset | | 0
Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning | | 0
Predictability and Causality in Spanish and English Natural Language Generation | | 0
Prix-LM: Pretraining for Multilingual Knowledge Base Construction | | 0
ProtFIM: Fill-in-Middle Protein Sequence Design via Protein Language Models | | 0
QuAILoRA: Quantization-Aware Initialization for LoRA | | 0
SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision Language Models | | 0
Trojan Detection Through Pattern Recognition for Large Language Models | | 0
Understanding Token Probability Encoding in Output Embeddings | | 0
AstroLLaMA: Towards Specialized Foundation Models in Astronomy | | 0
Learning from flowsheets: A generative transformer model for autocompletion of flowsheets | | 0
Linear Attention via Orthogonal Memory | | 0
DavIR: Data Selection via Implicit Reward for Large Language Models | | 0
Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization | | 0
Generating Synthetic Free-text Medical Records with Low Re-identification Risk using Masked Language Modeling | Code | 0
Language Models are General-Purpose Interfaces | Code | 0
Large Product Key Memory for Pretrained Language Models | Code | 0
Suffix Retrieval-Augmented Language Modeling | Code | 0
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models | Code | 0
Prix-LM: Pretraining for Multilingual Knowledge Base Construction | Code | 0
Masked Mixers for Language Generation and Retrieval | Code | 0
Cross-lingual Similarity of Multilingual Representations Revisited | Code | 0
Transformer based neural networks for emotion recognition in conversations | Code | 0
Conditional Language Learning with Context | Code | 0
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability | Code | 0

No leaderboard results yet.