| BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models | Jun 17, 2025 | Benchmarking, Language Modeling | Code Available | 2 |
| GeoRecon: Graph-Level Representation Learning for 3D Molecules via Reconstruction-Based Pretraining | Jun 16, 2025 | Denoising, Language Modeling | Unverified | 0 |
| Diffusion Sequence Models for Enhanced Protein Representation and Generation | Jun 9, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Masked Language Models are Good Heterogeneous Graph Generalizers | Jun 6, 2025 | Graph Learning, Language Modeling | Code Available | 0 |
| Improving Low-Resource Morphological Inflection via Self-Supervised Objectives | Jun 5, 2025 | Decoder, Language Modeling | Unverified | 0 |
| GigaAM: Efficient Self-Supervised Learner for Speech Recognition | Jun 1, 2025 | Automatic Speech Recognition, Language Modeling | Code Available | 4 |
| HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling | May 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations | May 26, 2025 | Denoising, Language Modeling | Unverified | 0 |
| ADALog: Adaptive Unsupervised Anomaly detection in Logs with Self-attention Masked Language Model | May 15, 2025 | Anomaly Detection, Language Modeling | Unverified | 0 |
| CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization | May 5, 2025 | Diversity, Language Modeling | Code Available | 1 |
| CodeSSM: Towards State Space Models for Code Understanding | May 2, 2025 | Clone Detection, Language Modeling | Unverified | 0 |
| In-Context Learning can distort the relationship between sequence likelihoods and biological fitness | Apr 23, 2025 | In-Context Learning, Language Modeling | Unverified | 0 |
| Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them | Mar 27, 2025 | Continual Pretraining, Language Modeling | Unverified | 0 |
| Low-Resource Transliteration for Roman-Urdu and Urdu Using Transformer-Based Models | Mar 27, 2025 | Information Retrieval, Language Modeling | Unverified | 0 |
| LakotaBERT: A Transformer-based Model for Low Resource Lakota Language | Mar 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Shushing! Let's Imagine an Authentic Speech from the Silent Video | Mar 19, 2025 | cross-modal alignment, Language Modeling | Unverified | 0 |
| ASMA-Tune: Unlocking LLMs' Assembly Code Comprehension via Structural-Semantic Instruction Tuning | Mar 14, 2025 | Code Generation, Decoder | Code Available | 0 |
| Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text | Feb 18, 2025 | Authorship Attribution, Language Modeling | Code Available | 0 |
| Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More | Feb 11, 2025 | Decoder, Information Retrieval | Code Available | 0 |
| Enabling Autoregressive Models to Fill In Masked Tokens | Feb 9, 2025 | Decoder, Language Modeling | Unverified | 0 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Feb 7, 2025 | 4k, General Knowledge | Code Available | 3 |
| SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling | Jan 22, 2025 | Audio Compression, Language Modeling | Unverified | 0 |
| Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search | Dec 19, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach | Dec 16, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| A Progressive Transformer for Unifying Binary Code Embedding and Knowledge Transfer | Dec 15, 2024 | Feature Engineering, Language Modeling | Unverified | 0 |