| Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence | Feb 15, 2024 | Diversity, Language Modeling | Unverified | 0 |
| Improving Non-autoregressive Machine Translation with Error Exposure and Consistency Regularization | Feb 15, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| DE-COP: Detecting Copyrighted Content in Language Models Training Data | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Multi-Fidelity Methods for Optimization: A Survey | Feb 15, 2024 | Benchmarking, Computational Efficiency | Unverified | 0 |
| Grounding Language Model with Chunking-Free In-Context Retrieval | Feb 15, 2024 | Chunking, Language Modeling | Unverified | 0 |
| Generative Representational Instruction Tuning | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based evaluation using GPT-4 | Feb 15, 2024 | Chatbot, Language Modeling | Unverified | 0 |
| Visually Dehallucinative Instruction Generation: Know What You Don't Know | Feb 15, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Fast Vocabulary Transfer for Language Model Compression | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Quantized Embedding Vectors for Controllable Diffusion Language Models | Feb 15, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Camouflage is all you need: Evaluating and Enhancing Language Model Robustness Against Camouflage Adversarial Attacks | Feb 15, 2024 | All, Decoder | Unverified | 0 |
| Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model | Feb 14, 2024 | Descriptive, Language Modeling | Unverified | 0 |
| A Language Model for Particle Tracking | Feb 14, 2024 | Deep Learning, Language Modeling | Unverified | 0 |
| Large Language Model Simulator for Cold-Start Recommendation | Feb 14, 2024 | Collaborative Filtering, Language Modeling | Unverified | 0 |
| AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails | Feb 14, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems | Feb 14, 2024 | Decision Making, In-Context Learning | Unverified | 0 |
| Multi-Query Focused Disaster Summarization via Instruction-Based Prompting | Feb 14, 2024 | Instruction Following, Language Modeling | Unverified | 0 |
| Pretraining Vision-Language Model for Difference Visual Question Answering in Longitudinal Chest X-rays | Feb 14, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Rethinking Large Language Model Architectures for Sequential Recommendations | Feb 14, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| AgentLens: Visual Analysis for Agent Behaviors in LLM-based Autonomous Systems | Feb 14, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Switch EMA: A Free Lunch for Better Flatness and Sharpness | Feb 14, 2024 | Attribute, image-classification | Code Available | 1 |
| Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents | Feb 14, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Large Language Model with Graph Convolution for Recommendation | Feb 14, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems | Feb 14, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| FGeo-DRL: Deductive Reasoning for Geometric Problems through Deep Reinforcement Learning | Feb 14, 2024 | AI Agent, Deep Reinforcement Learning | Code Available | 0 |
| Measuring and Controlling Instruction (In)Stability in Language Model Dialogs | Feb 13, 2024 | Chatbot, Language Modeling | Code Available | 1 |
| Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs | Feb 13, 2024 | image-classification, Image Classification | Unverified | 0 |
| Punctuation Restoration Improves Structure Understanding Without Supervision | Feb 13, 2024 | Chunking, Language Modeling | Code Available | 0 |
| VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search | Feb 13, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks | Feb 13, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill | Feb 13, 2024 | Diversity, Imitation Learning | Unverified | 0 |
| The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale | Feb 13, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| Visually Dehallucinative Instruction Generation | Feb 13, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Privacy-Preserving Language Model Inference with Instance Obfuscation | Feb 13, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models | Feb 13, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Lumos: Empowering Multimodal LLMs with Scene Text Recognition | Feb 12, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Active Preference Learning for Large Language Models | Feb 12, 2024 | Active Learning, Language Modeling | Unverified | 0 |
| Careless Whisper: Speech-to-Text Hallucination Harms | Feb 12, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets | Feb 12, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation | Feb 12, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Suppressing Pink Elephants with Direct Principle Feedback | Feb 12, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| BreakGPT: A Large Language Model with Multi-stage Structure for Financial Breakout Detection | Feb 12, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Feb 12, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning | Feb 12, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model | Feb 12, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Pushing The Limit of LLM Capacity for Text Classification | Feb 12, 2024 | Classification, Language Modeling | Unverified | 0 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Feb 12, 2024 | Continual Pretraining, GSM8K | Code Available | 2 |
| Assessing Generalization for Subpopulation Representative Modeling via In-Context Learning | Feb 12, 2024 | In-Context Learning, Language Modeling | Unverified | 0 |
| SemTra: A Semantic Skill Translator for Cross-Domain Zero-Shot Policy Adaptation | Feb 12, 2024 | Autonomous Vehicles, Contrastive Learning | Unverified | 0 |