| Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization | Jun 16, 2025 | Causal Language Modeling, Instruction Following | Unverified | 0 |
| GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval | Mar 10, 2025 | Causal Language Modeling, Language Modeling | Code Available | 1 |
| Trojan Detection Through Pattern Recognition for Large Language Models | Jan 20, 2025 | Causal Language Modeling, In-Context Learning | Unverified | 0 |
| Towards the Anonymization of the Language Modeling | Jan 5, 2025 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models | Dec 17, 2024 | Causal Language Modeling, Language Modeling | Code Available | 0 |
| AntLM: Bridging Causal and Masked Language Models | Dec 4, 2024 | Causal Language Modeling, Decoder | Unverified | 0 |
| Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning | Dec 3, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| ElastiFormer: Learned Redundancy Reduction in Transformer via Self-Distillation | Nov 22, 2024 | Causal Language Modeling, Language Modeling | Unverified | 0 |
| GPT or BERT: why not both? | Oct 31, 2024 | Causal Language Modeling, Language Modeling | Code Available | 2 |
| Interpretable Language Modeling via Induction-head Ngram Models | Oct 31, 2024 | Causal Language Modeling, Human fMRI response prediction | Code Available | 1 |