| Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization | Jun 16, 2025 | Causal Language Modeling, Instruction Following | Unverified | 0 |
| GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval | Mar 10, 2025 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| Trojan Detection Through Pattern Recognition for Large Language Models | Jan 20, 2025 | Causal Language Modeling, In-Context Learning | Unverified | 0 |
| Towards the Anonymization of the Language Modeling | Jan 5, 2025 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models | Dec 17, 2024 | Causal Language Modeling, Language Modeling | Code Available | 0 |
| AntLM: Bridging Causal and Masked Language Models | Dec 4, 2024 | Causal Language Modeling, Decoder | Unverified | 0 |
| Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning | Dec 3, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| ElastiFormer: Learned Redundancy Reduction in Transformer via Self-Distillation | Nov 22, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Interpretable Language Modeling via Induction-head Ngram Models | Oct 31, 2024 | Causal Language Modeling, Human fMRI response prediction | Code Available | 1 |
| GPT or BERT: why not both? | Oct 31, 2024 | Causal Language Modeling, Language Modeling | Code Available | 2 |
| A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers | Oct 14, 2024 | Causal Language Modeling, Language Modeling | Code Available | 0 |
| QuAILoRA: Quantization-Aware Initialization for LoRA | Oct 9, 2024 | Causal Language Modeling, GPU | Unverified | 0 |
| Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles | Sep 16, 2024 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| Generating Synthetic Free-text Medical Records with Low Re-identification Risk using Masked Language Modeling | Sep 15, 2024 | Causal Language Modeling, De-identification | Code Available | 0 |
| N-gram Prediction and Word Difference Representations for Language Modeling | Sep 5, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Masked Mixers for Language Generation and Retrieval | Sep 2, 2024 | Causal Language Modeling, Retrieval | Code Available | 0 |
| Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning | Aug 30, 2024 | Causal Language Modeling, Continual Learning | Unverified | 0 |
| Predictability and Causality in Spanish and English Natural Language Generation | Aug 26, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Conditional Language Learning with Context | Jun 4, 2024 | Causal Language Modeling, Language Modeling | Code Available | 0 |
| Understanding Token Probability Encoding in Output Embeddings | Jun 3, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Transformer based neural networks for emotion recognition in conversations | May 18, 2024 | Causal Language Modeling, Emotion Classification | Code Available | 0 |
| NIFTY Financial News Headlines Dataset | May 16, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| SERPENT-VLM: Self-Refining Radiology Report Generation Using Vision Language Models | Apr 27, 2024 | Causal Language Modeling, Hallucination | Unverified | 0 |
| GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | Feb 26, 2024 | Causal Language Modeling, Generalized Referring Expression Segmentation | Unverified | 0 |
| Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling | Jan 25, 2024 | Causal Language Modeling, Decoder | Unverified | 0 |
| Linear Attention via Orthogonal Memory | Dec 18, 2023 | Causal Language Modeling, Computational Efficiency | Unverified | 0 |
| Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning | Dec 2, 2023 | Causal Language Modeling, Contrastive Learning | Code Available | 1 |
| DavIR: Data Selection via Implicit Reward for Large Language Models | Oct 16, 2023 | Causal Language Modeling, GSM8K | Unverified | 0 |
| Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability | Oct 12, 2023 | Causal Language Modeling, In-Context Learning | Code Available | 0 |
| A Meta-Learning Perspective on Transformers for Causal Language Modeling | Oct 9, 2023 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| What's the Magic Word? A Control Theory of LLM Prompting | Oct 2, 2023 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| AstroLLaMA: Towards Specialized Foundation Models in Astronomy | Sep 12, 2023 | Astronomy, Causal Language Modeling | Unverified | 0 |
| CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | May 3, 2023 | Causal Language Modeling, Decoder | Code Available | 5 |
| ProtFIM: Fill-in-Middle Protein Sequence Design via Protein Language Models | Mar 29, 2023 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Video Pre-trained Transformer: A Multimodal Mixture of Pre-trained Experts | Mar 24, 2023 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| Cross-lingual Similarity of Multilingual Representations Revisited | Dec 4, 2022 | Causal Language Modeling, Cross-Lingual Transfer | Code Available | 0 |
| Suffix Retrieval-Augmented Language Modeling | Nov 6, 2022 | Causal Language Modeling, Language Modeling | Code Available | 0 |
| A Simple, Yet Effective Approach to Finding Biases in Code Generation | Oct 31, 2022 | Causal Language Modeling, Code Generation | Unverified | 0 |
| A Closer Look at Parameter Contributions When Training Neural Language and Translation Models | Oct 1, 2022 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | Aug 2, 2022 | Causal Language Modeling, Common Sense Reasoning | Code Available | 2 |
| Learning from flowsheets: A generative transformer model for autocompletion of flowsheets | Aug 1, 2022 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Self-Supervised Learning of Brain Dynamics from Broad Neuroimaging Data | Jun 22, 2022 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| Language Models are General-Purpose Interfaces | Jun 13, 2022 | Causal Language Modeling, Few-Shot Learning | Code Available | 0 |
| Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling | May 25, 2022 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| Multitask Finetuning for Improving Neural Machine Translation in Indian Languages | Dec 3, 2021 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Prix-LM: Pretraining for Multilingual Knowledge Base Construction | Nov 16, 2021 | Bilingual Lexicon Induction, Causal Language Modeling | Unverified | 0 |
| Prix-LM: Pretraining for Multilingual Knowledge Base Construction | Oct 16, 2021 | Bilingual Lexicon Induction, Causal Language Modeling | Code Available | 0 |
| Multi-Task Learning for Situated Multi-Domain End-to-End Dialogue Systems | Oct 11, 2021 | Causal Language Modeling, Diversity | Unverified | 0 |
| IntenT5: Search Result Diversification using Causal Language Models | Aug 9, 2021 | Causal Language Modeling, Diversity | Unverified | 0 |
| Large Product Key Memory for Pretrained Language Models | Oct 8, 2020 | Causal Language Modeling, Language Modeling | Code Available | 0 |