| Adaptive Reasoning and Acting in Medical Language Agents | Oct 13, 2024 | Decision Making, Diagnostic | Unverified | 0 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code | Oct 13, 2024 | Code Generation, Hallucination | Unverified | 0 |
| EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation | Oct 13, 2024 | Contrastive Learning, Language Modeling | Unverified | 0 |
| LoRE: Logit-Ranked Retriever Ensemble for Enhancing Open-Domain Question Answering | Oct 13, 2024 | Answer Generation, Language Modeling | Unverified | 0 |
| COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Oct 12, 2024 | Code Generation, Computational Efficiency | Code Available | 0 |
| Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation | Oct 12, 2024 | Code Generation, Language Modeling | Unverified | 0 |
| LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning | Oct 12, 2024 | Knowledge Graphs, Language Modeling | Code Available | 0 |
| Enterprise Benchmarks for Large Language Model Evaluation | Oct 11, 2024 | Benchmarking, Language Model Evaluation | Code Available | 0 |
| LLMD: A Large Language Model for Interpreting Longitudinal Medical Records | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| nach0-pc: Multi-task Language Model with Molecular Point Cloud Encoder | Oct 11, 2024 | Drug Discovery, Language Modeling | Unverified | 0 |
| Language-Model-Assisted Bi-Level Programming for Reward Learning from Internet Videos | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| ACER: Automatic Language Model Context Extension via Retrieval | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| The Same But Different: Structural Similarities and Differences in Multilingual Language Modeling | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Oct 11, 2024 | GSM8K, Language Modeling | Code Available | 2 |
| Can a large language model be a gaslighter? | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Oct 11, 2024 | Code Generation, Language Modeling | Code Available | 1 |
| Distributionally robust self-supervised learning for tabular data | Oct 11, 2024 | Decoder, Language Modeling | Code Available | 0 |
| Efficiently Scanning and Resampling Spatio-Temporal Tasks with Irregular Observations | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Generation with Dynamic Vocabulary | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| MedMobile: A mobile-sized language model with expert-level clinical capabilities | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Retraining-Free Merging of Sparse MoE via Hierarchical Clustering | Oct 11, 2024 | Clustering, Language Modeling | Code Available | 1 |
| Parameter-Efficient Fine-Tuning of State Space Models | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| SocialGaze: Improving the Integration of Human Social Norms in Large Language Models | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Baichuan-Omni Technical Report | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Lifelong Event Detection via Optimal Transport | Oct 11, 2024 | Event Detection, Language Modeling | Unverified | 0 |
| Zeroth-Order Fine-Tuning of LLMs in Random Subspaces | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Preferential Normalizing Flows | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective | Oct 11, 2024 | Conformal Prediction, Knowledge Graphs | Unverified | 0 |
| Do Unlearning Methods Remove Information from Language Model Weights? | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| ViT3D Alignment of LLaMA3: 3D Medical Image Report Generation | Oct 11, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| ∀uto∃∨∧L: Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks | Oct 11, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model | Oct 11, 2024 | Common Sense Reasoning, Language Modeling | Unverified | 0 |
| Emergent social conventions and collective bias in LLM populations | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning | Oct 11, 2024 | Data Poisoning, Language Modeling | Code Available | 1 |
| SimpleStrat: Diversifying Language Model Generation with Stratification | Oct 11, 2024 | Diversity, Language Modeling | Unverified | 0 |
| Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both | Oct 11, 2024 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| A Framework for Collaborating a Large Language Model Tool in Brainstorming for Triggering Creative Thoughts | Oct 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| The Large Language Model GreekLegalRoBERTa | Oct 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction | Oct 10, 2024 | Binary Classification, Citation Prediction | Code Available | 0 |
| LecPrompt: A Prompt-based Approach for Logical Error Correction with CodeBERT | Oct 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Semantic Self-Consistency: Enhancing Language Model Reasoning via Semantic Weighting | Oct 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining | Oct 10, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Bilinear MLPs enable weight-based mechanistic interpretability | Oct 10, 2024 | image-classification, Image Classification | Code Available | 1 |
| Language model developers should report train-test overlap | Oct 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Uncovering Overfitting in Large Language Model Editing | Oct 10, 2024 | Attribute, In-Context Learning | Unverified | 0 |
| Mechanistic Permutability: Match Features Across Layers | Oct 10, 2024 | Decoder, Language Modeling | Unverified | 0 |
| AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning | Oct 10, 2024 | Language Modeling, Language Modelling | Code Available | 1 |