Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization (Jun 17, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Reframing linguistic bootstrapping as joint inference using visually-grounded grammar induction models (Jun 17, 2024). Tags: Language Acquisition, Language Modeling. Code Available.
Preserving Knowledge in Large Language Model with Model-Agnostic Self-Decompression (Jun 17, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels (Jun 17, 2024). Tags: Dataset Generation, Information Retrieval. Unverified.
STEVE Series: Step-by-Step Construction of Agent Systems in Minecraft (Jun 17, 2024). Tags: Knowledge Distillation, Language Modeling. Unverified.
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content (Jun 17, 2024). Tags: Benchmarking, General Knowledge. Code Available.
Self-training Large Language Models through Knowledge Detection (Jun 17, 2024). Tags: Hallucination, Language Modeling. Code Available.
Self-Train Before You Transcribe (Jun 17, 2024). Tags: Domain Adaptation, Language Modelling. Code Available.
A General Framework for Load Forecasting based on Pre-trained Large Language Model (Jun 17, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers (Jun 17, 2024). Tags: Diversity, Language Modeling. Unverified.
LiLiuM: eBay's Large Language Models for e-commerce (Jun 17, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Retrieval-Augmented Feature Generation for Domain-Specific Classification (Jun 17, 2024). Tags: Classification, domain classification. Unverified.
Promises, Outlooks and Challenges of Diffusion Language Modeling (Jun 17, 2024). Tags: ARC, HellaSwag. Unverified.
Reminding Multimodal Large Language Models of Object-aware Knowledge with Retrieved Tags (Jun 16, 2024). Tags: Image to text, Instruction Following. Unverified.
ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation (Jun 16, 2024). Tags: Continual Learning, GSM8K. Code Available.
RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning (Jun 16, 2024). Tags: knowledge editing, Language Modeling. Code Available.
Large Language Models for Dysfluency Detection in Stuttered Speech (Jun 16, 2024). Tags: Automatic Speech Recognition, Language Modeling. Unverified.
Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning (Jun 16, 2024). Tags: In-Context Learning, Language Modeling. Code Available.
Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions (Jun 16, 2024). Tags: Decision Making, Language Modelling. Code Available.
Optimization of Armv9 architecture general large language model inference performance based on Llama.cpp (Jun 16, 2024). Tags: Compiler Optimization, Language Modeling. Code Available.
Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens (Jun 16, 2024). Tags: Language Modeling, Language Modelling. Unverified.
WundtGPT: Shaping Large Language Models To Be An Empathetic, Proactive Psychologist (Jun 16, 2024). Tags: Language Modelling, Large Language Model. Unverified.
CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics (Jun 16, 2024). Tags: Classification, Informativeness. Code Available.
City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization (Jun 16, 2024). Tags: Language Modelling, Large Language Model. Unverified.
Avoiding Copyright Infringement via Large Language Model Unlearning (Jun 16, 2024). Tags: General Knowledge, Language Modeling. Code Available.
Augmenting Biomedical Named Entity Recognition with General-domain Resources (Jun 15, 2024). Tags: Language Modelling, Multi-Task Learning. Code Available.
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models (Jun 15, 2024). Tags: Federated Learning, Language Modelling. Unverified.
Intertwining CP and NLP: The Generation of Unreasonably Constrained Sentences (Jun 15, 2024). Tags: Combinatorial Optimization, Language Modelling. Unverified.
CancerLLM: A Large Language Model in Cancer Domain (Jun 15, 2024). Tags: GPU, Language Modeling. Unverified.
Reactor Mk.1 performances: MMLU, HumanEval and BBH test results (Jun 15, 2024). Tags: Benchmarking, HumanEval. Unverified.
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics (Jun 15, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Spuriousness-Aware Meta-Learning for Learning Robust Classifiers (Jun 15, 2024). Tags: Attribute, Language Modelling. Code Available.
Mental Disorder Classification via Temporal Representation of Text (Jun 15, 2024). Tags: Classification, Language Modelling. Unverified.
Large Language Model Enhanced Clustering for News Event Detection (Jun 15, 2024). Tags: Clustering, Event Detection. Unverified.
Task Facet Learning: A Structured Approach to Prompt Optimization (Jun 15, 2024). Tags: Language Modelling, Large Language Model. Unverified.
MALLM-GAN: Multi-Agent Large Language Model as Generative Adversarial Network for Synthesizing Tabular Data (Jun 15, 2024). Tags: Generative Adversarial Network, Language Modeling. Unverified.
VCEval: Rethinking What is a Good Educational Video and How to Automatically Evaluate It (Jun 15, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Vision Language Modeling of Content, Distortion and Appearance for Image Quality Assessment (Jun 14, 2024). Tags: Image Quality Assessment, Language Modeling. Unverified.
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages (Jun 14, 2024). Tags: Cross-Lingual Transfer, Language Modeling. Code Available.
A Probability–Quality Trade-off in Aligned Language Models and its Relation to Sampling Adaptors (Jun 14, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Group and Shuffle: Efficient Structured Orthogonal Parametrization (Jun 14, 2024). Tags: Computational Efficiency, Language Modeling. Code Available.
3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding (Jun 14, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Datasets for Multilingual Answer Sentence Selection (Jun 14, 2024). Tags: Language Modeling, Language Modelling. Unverified.
GEB-1.3B: Open Lightweight Large Language Model (Jun 14, 2024). Tags: CPU, Language Modeling. Unverified.
TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners (Jun 14, 2024). Tags: Language Modeling, Language Modelling. Unverified.
Let the Poem Hit the Rhythm: Using a Byte-Based Transformer for Beat-Aligned Poetry Generation (Jun 14, 2024). Tags: Language Modeling, Language Modelling. Code Available.
PRISM: A Design Framework for Open-Source Foundation Model Safety (Jun 14, 2024). Tags: Language Modelling. Unverified.
Precision Empowers, Excess Distracts: Visual Question Answering With Dynamically Infused Knowledge In Language Models (Jun 14, 2024). Tags: Decoder, Knowledge Graphs. Unverified.
RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model (Jun 14, 2024). Tags: Language Modeling, Language Modelling. Unverified.
OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst (Jun 14, 2024). Tags: Image Captioning, Language Modeling. Unverified.