SOTAVerified

Causal Language Modeling

Papers

Showing 1-50 of 52 papers

Title | Status | Hype
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | Code | 5
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | Code | 2
GPT or BERT: why not both? | Code | 2
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling | Code | 1
Self-Supervised Learning of Brain Dynamics from Broad Neuroimaging Data | Code | 1
Interpretable Language Modeling via Induction-head Ngram Models | Code | 1
Video Pre-trained Transformer: A Multimodal Mixture of Pre-trained Experts | Code | 1
What's the Magic Word? A Control Theory of LLM Prompting | Code | 1
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning | Code | 1
GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval | Code | 1
Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles | Code | 1
Cross-lingual Similarity of Multilingual Representations Revisited | Code | 0
Masked Mixers for Language Generation and Retrieval | Code | 0
A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers | Code | 0
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability | Code | 0
Generating Synthetic Free-text Medical Records with Low Re-identification Risk using Masked Language Modeling | Code | 0
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models | Code | 0
Prix-LM: Pretraining for Multilingual Knowledge Base Construction | Code | 0
Language Models are General-Purpose Interfaces | Code | 0
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling | Code | 0
Large Product Key Memory for Pretrained Language Models | Code | 0
Suffix Retrieval-Augmented Language Modeling | Code | 0
Transformer based neural networks for emotion recognition in conversations | Code | 0
Conditional Language Learning with Context | Code | 0
SERPENT-VLM: Self-Refining Radiology Report Generation Using Vision Language Models | - | 0
Trojan Detection Through Pattern Recognition for Large Language Models | - | 0
Understanding Token Probability Encoding in Output Embeddings | - | 0
AstroLLaMA: Towards Specialized Foundation Models in Astronomy | - | 0
Wiki-40B: Multilingual Language Model Dataset | - | 0
A Closer Look at Parameter Contributions When Training Neural Language and Translation Models | - | 0
A Meta-Learning Perspective on Transformers for Causal Language Modeling | - | 0
Towards the Anonymization of the Language Modeling | - | 0
AntLM: Bridging Causal and Masked Language Models | - | 0
A Simple, Yet Effective Approach to Finding Biases in Code Generation | - | 0
Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling | - | 0
ElastiFormer: Learned Redundancy Reduction in Transformer via Self-Distillation | - | 0
Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning | - | 0
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | - | 0
IntenT5: Search Result Diversification using Causal Language Models | - | 0
Learning from flowsheets: A generative transformer model for autocompletion of flowsheets | - | 0
Linear Attention via Orthogonal Memory | - | 0
DavIR: Data Selection via Implicit Reward for Large Language Models | - | 0
Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization | - | 0
Multitask Finetuning for Improving Neural Machine Translation in Indian Languages | - | 0
Multi-Task Learning for Situated Multi-Domain End-to-End Dialogue Systems | - | 0
N-gram Prediction and Word Difference Representations for Language Modeling | - | 0
NIFTY Financial News Headlines Dataset | - | 0
Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning | - | 0
Predictability and Causality in Spanish and English Natural Language Generation | - | 0
Prix-LM: Pretraining for Multilingual Knowledge Base Construction | - | 0

No leaderboard results yet.