Gemstones: A Model Suite for Multi-Faceted Scaling Laws | Feb 7, 2025 | Experimental Design, Language Modeling
Position-aware Automatic Circuit Discovery | Feb 7, 2025 | Language Modeling, Language Modelling
ADIFF: Explaining audio difference using natural language | Feb 6, 2025 | AudioCaps, Audio captioning
Great Models Think Alike and this Undermines AI Oversight | Feb 6, 2025 | Language Modeling, Language Modelling
Robotouille: An Asynchronous Planning Benchmark for LLM Agents | Feb 6, 2025 | Language Modeling, Language Modelling
Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents | Feb 6, 2025 | Language Modeling, Language Modelling
Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Feb 5, 2025 | image-classification, Image Classification
Intent Representation Learning with Large Language Model for Recommendation | Feb 5, 2025 | Language Modeling, Language Modelling
Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Feb 5, 2025 | In-Context Learning, Language Modeling
Do Large Language Model Benchmarks Test Reliability? | Feb 5, 2025 | Language Modeling, Language Modelling
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing | Feb 4, 2025 | Collaborative Inference, Language Modeling
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods | Feb 3, 2025 | Language Modeling, Language Modelling
Polynomial, trigonometric, and tropical activations | Feb 3, 2025 | image-classification, Image Classification
Simulating Rumor Spreading in Social Networks using LLM Agents | Feb 3, 2025 | Language Modeling, Language Modelling
Speculative Ensemble: Fast Large Language Model Ensemble via Speculation | Feb 1, 2025 | Language Modeling, Language Modelling
Scalable-Softmax Is Superior for Attention | Jan 31, 2025 | Information Retrieval, Language Modeling
Low-Rank Adapting Models for Sparse Autoencoders | Jan 31, 2025 | Language Modeling, Language Modelling
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training | Jan 30, 2025 | Language Modeling, Language Modelling
2SSP: A Two-Stage Framework for Structured Pruning of LLMs | Jan 29, 2025 | Language Modeling, Language Modelling
RadioLLM: Introducing Large Language Model into Cognitive Radio via Hybrid Prompt and Token Reprogrammings | Jan 28, 2025 | Denoising, Domain Generalization
Atla Selene Mini: A General Purpose Evaluation Model | Jan 27, 2025 | Language Modeling, Language Modelling
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer | Jan 26, 2025 | Language Modeling, Language Modelling
Ocean-OCR: Towards General OCR Application via a Vision-Language Model | Jan 26, 2025 | document understanding, Language Modeling
DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing | Jan 24, 2025 | Language Modeling, Language Modelling
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques | Jan 24, 2025 | Language Modeling, Language Modelling
Enhancing Biomedical Relation Extraction with Directionality | Jan 23, 2025 | Benchmarking, Document-level Relation Extraction
PAINT: Paying Attention to INformed Tokens to Mitigate Hallucination in Large Vision-Language Model | Jan 21, 2025 | Hallucination, Image Captioning
Glinthawk: A Two-Tiered Architecture for Offline LLM Inference | Jan 20, 2025 | CPU, Language Modeling
EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery | Jan 20, 2025 | Language Modeling, Language Modelling
AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model | Jan 19, 2025 | In-Context Learning, Language Modeling
LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport | Jan 16, 2025 | AudioCaps, Audio captioning
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning | Jan 15, 2025 | cross-modal alignment, Language Modeling
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding | Jan 14, 2025 | Language Modeling, Language Modelling
Gandalf the Red: Adaptive Security for LLMs | Jan 14, 2025 | Blocking, Language Modeling
VASparse: Towards Efficient Visual Hallucination Mitigation for Large Vision-Language Model via Visual-Aware Sparsification | Jan 11, 2025 | Hallucination, Language Modeling
Merging Feed-Forward Sublayers for Compressed Transformers | Jan 10, 2025 | image-classification, Image Classification
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation | Jan 6, 2025 | Language Model Evaluation, Language Modeling
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model | Jan 6, 2025 | Language Modeling, Language Modelling
Establishing baselines for generative discovery of inorganic crystals | Jan 4, 2025 | Band Gap, Language Modeling
Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding | Jan 3, 2025 | Hallucination, Language Modeling
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding | Jan 1, 2025 | Arithmetic Reasoning, Language Modeling
LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts | Dec 31, 2024 | Language Modeling, Language Modelling
TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment | Dec 31, 2024 | Instruction Following, Language Modeling
Toward Intelligent and Secure Cloud: Large Language Model Empowered Proactive Defense | Dec 30, 2024 | Cloud Computing, Code Generation
Facilitating large language model Russian adaptation with Learned Embedding Propagation | Dec 30, 2024 | Language Modeling, Language Modelling
No Preference Left Behind: Group Distributional Preference Optimization | Dec 28, 2024 | Diversity, Language Modeling
An Engorgio Prompt Makes Large Language Model Babble on | Dec 27, 2024 | Language Modeling, Language Modelling
Learning to engineer protein flexibility | Dec 24, 2024 | Language Modeling, Language Modelling
Brain-to-Text Benchmark '24: Lessons Learned | Dec 23, 2024 | Language Modeling, Language Modelling
Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing | Dec 23, 2024 | ArabicMMLU, Dialect Identification