| Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling | Jan 28, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders | Jan 28, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks | Jan 28, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Large Language Model Critics for Execution-Free Evaluation of Code Changes | Jan 28, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Multiple Abstraction Level Retrieve Augment Generation | Jan 28, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Towards Safe AI Clinicians: A Comprehensive Study on Large Language Model Jailbreaking in Healthcare | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| VLMaterial: Procedural Material Generation with Large Vision-Language Models | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Atla Selene Mini: A General Purpose Evaluation Model | Jan 27, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| BiFold: Bimanual Cloth Folding with Language Guidance | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Programming by Examples Meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction | Jan 27, 2025 | Code Generation, Inductive Bias | Unverified | 0 |
| Classification Error Bound for Low Bayes Error Conditions in Machine Learning | Jan 27, 2025 | Automatic Speech Recognition, Classification | Unverified | 0 |
| Is It Navajo? Accurate Language Detection in Endangered Athabaskan Languages | Jan 27, 2025 | Diversity, Language Identification | Code Available | 0 |
| Integration of LLM Quality Assurance into an NLG System | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| CILP-FGDI: Exploiting Vision-Language Model for Generalizable Person Re-Identification | Jan 27, 2025 | Generalizable Person Re-identification, Language Modeling | Code Available | 0 |
| PRISMe: A Novel LLM-Powered Tool for Interactive Privacy Policy Assessment | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| MEL: Legal Spanish Language Model | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Challenging Assumptions in Learning Generic Text Style Embeddings | Jan 27, 2025 | Contrastive Learning, Language Modeling | Unverified | 0 |
| SEAL: Speech Embedding Alignment Learning for Speech Large Language Model with Retrieval-Augmented Generation | Jan 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Complete Chess Games Enable LLM Become A Chess Master | Jan 26, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Improving Network Threat Detection by Knowledge Graph, Large Language Model, and Imbalanced Learning | Jan 26, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| How Green are Neural Language Models? Analyzing Energy Consumption in Text Summarization Fine-tuning | Jan 26, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer | Jan 26, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Ocean-OCR: Towards General OCR Application via a Vision-Language Model | Jan 26, 2025 | document understanding, Language Modeling | Code Available | 1 |
| Improving Estonian Text Simplification through Pretrained Language Models and Custom Datasets | Jan 26, 2025 | Language Modeling, Language Modelling | Unverified | 0 |