| Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models | Nov 1, 2024 | Decision Making, Informativeness | Code Available | 1 |
| LLaMo: Large Language Model-based Molecular Graph Assistant | Oct 31, 2024 | Instruction Following, IUPAC Name Prediction | Code Available | 1 |
| Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Oct 31, 2024 | Disaster Response, Language Modeling | Code Available | 1 |
| Interpretable Language Modeling via Induction-head Ngram Models | Oct 31, 2024 | Causal Language Modeling, Human fMRI response prediction | Code Available | 1 |
| Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback | Oct 30, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Oct 30, 2024 | In-Context Learning, Language Modeling | Code Available | 1 |
| SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types | Oct 29, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| f-PO: Generalizing Preference Optimization with f-divergence Minimization | Oct 29, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Long-context Protein Language Modeling Using Bidirectional Mamba with Shared Projection Layers | Oct 29, 2024 | Drug Design, Language Modeling | Code Available | 1 |
| LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Oct 28, 2024 | Benchmarking, Language Modeling | Code Available | 1 |
| TrajAgent: An Agent Framework for Unified Trajectory Modelling | Oct 27, 2024 | Future prediction, Language Modeling | Code Available | 1 |
| LOGO -- Long cOntext aliGnment via efficient preference Optimization | Oct 24, 2024 | GPU, Language Modeling | Code Available | 1 |
| GCoder: Improving Large Language Model for Generalized Graph Problem Solving | Oct 24, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Cross-model Control: Improving Multiple Large Language Models in One-time Training | Oct 23, 2024 | Instruction Following, Language Modeling | Code Available | 1 |
| GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration | Oct 23, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Automated Spinal MRI Labelling from Reports Using a Large Language Model | Oct 22, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Scalable Influence and Fact Tracing for Large Language Model Pretraining | Oct 22, 2024 | Attribute, Language Modeling | Code Available | 1 |
| Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Oct 22, 2024 | GSM8K, Language Modeling | Code Available | 1 |
| Building A Coding Assistant via the Retrieval-Augmented Language Model | Oct 21, 2024 | Code Completion, Code Generation | Code Available | 1 |
| SeisLM: a Foundation Model for Seismic Waveforms | Oct 21, 2024 | Event Detection, Language Modeling | Code Available | 1 |
| A Realistic Threat Model for Large Language Model Jailbreaks | Oct 21, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Residual vector quantization for KV cache compression in large language model | Oct 21, 2024 | Audio Compression, Language Modeling | Code Available | 1 |
| M-RewardBench: Evaluating Reward Models in Multilingual Settings | Oct 20, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning | Oct 18, 2024 | Hallucination, Knowledge Base Question Answering | Code Available | 1 |
| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Oct 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation | Oct 17, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| FIRE: Fact-checking with Iterative Retrieval and Verification | Oct 17, 2024 | Claim Verification, Fact Checking | Code Available | 1 |
| MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems | Oct 17, 2024 | Answer Generation, Language Modeling | Code Available | 1 |
| VividMed: Vision Language Model with Versatile Visual Grounding for Medicine | Oct 16, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims | Oct 16, 2024 | Fact Checking, Language Modeling | Code Available | 1 |
| CREAM: Consistency Regularized Self-Rewarding Language Models | Oct 16, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models | Oct 15, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| TopoLM: brain-like spatio-functional organization in a topographic language model | Oct 15, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware | Oct 15, 2024 | Code Generation, Language Modeling | Code Available | 1 |
| Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses | Oct 15, 2024 | Hallucination, Language Modeling | Code Available | 1 |
| HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Oct 13, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs | Oct 13, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning | Oct 11, 2024 | Data Poisoning, Language Modeling | Code Available | 1 |
| Retraining-Free Merging of Sparse MoE via Hierarchical Clustering | Oct 11, 2024 | Clustering, Language Modeling | Code Available | 1 |
| Zeroth-Order Fine-Tuning of LLMs in Random Subspaces | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Oct 11, 2024 | Code Generation, Language Modeling | Code Available | 1 |
| Parameter-Efficient Fine-Tuning of State Space Models | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Do Unlearning Methods Remove Information from Language Model Weights? | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Bilinear MLPs enable weight-based mechanistic interpretability | Oct 10, 2024 | image-classification, Image Classification | Code Available | 1 |
| Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining | Oct 10, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning | Oct 10, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting | Oct 10, 2024 | Entity Linking, Few-Shot Learning | Code Available | 1 |
| AuditWen: An Open-Source Large Language Model for Audit | Oct 9, 2024 | Answer Generation, Language Modeling | Code Available | 1 |
| Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning | Oct 9, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Training-free Diffusion Model Alignment with Sampling Demons | Oct 8, 2024 | Denoising, Image Generation | Code Available | 1 |