| Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction | Jun 18, 2024 | Language Modeling | Code Available | 2 |
| AgentReview: Exploring Peer Review Dynamics with LLM Agents | Jun 18, 2024 | Language Modeling | Code Available | 2 |
| Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement | Jun 17, 2024 | Language Modeling | Code Available | 2 |
| mDPO: Conditional Preference Optimization for Multimodal Large Language Models | Jun 17, 2024 | Hallucination, Language Modeling | Code Available | 2 |
| Large Scale Transfer Learning for Tabular Data via Language Modeling | Jun 17, 2024 | Language Modeling | Code Available | 2 |
| GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities | Jun 17, 2024 | Audio Question Answering, Instruction Following | Code Available | 2 |
| Explore the Limits of Omni-modal Pretraining at Scale | Jun 13, 2024 | Language Modeling | Code Available | 2 |
| On Softmax Direct Preference Optimization for Recommendation | Jun 13, 2024 | Language Modeling | Code Available | 2 |
| Enhancing Diagnostic Accuracy in Rare and Common Fundus Diseases with a Knowledge-Rich Vision-Language Model | Jun 13, 2024 | Diagnostic, Image Retrieval | Code Available | 2 |
| StreamBench: Towards Benchmarking Continuous Improvement of Language Agents | Jun 13, 2024 | Benchmarking, Language Modeling | Code Available | 2 |
| Discovering Preference Optimization Algorithms with and for Large Language Models | Jun 12, 2024 | Language Modeling | Code Available | 2 |
| RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent | Jun 11, 2024 | AI Agent, Descriptive | Code Available | 2 |
| LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model | Jun 7, 2024 | Language Modeling | Code Available | 2 |
| Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data | Jun 6, 2024 | Denoising, Language Modeling | Code Available | 2 |
| BLSP-Emo: Towards Empathetic Large Speech-Language Models | Jun 6, 2024 | Emotion Recognition, Instruction Following | Code Available | 2 |
| Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis | Jun 6, 2024 | Decoder, Inductive Bias | Code Available | 2 |
| Simplified and Generalized Masked Diffusion for Discrete Data | Jun 6, 2024 | Language Modeling | Code Available | 2 |
| DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences | Jun 5, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models | Jun 5, 2024 | Few-Shot Learning, Language Modeling | Code Available | 2 |
| Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models | Jun 5, 2024 | Diversity, Language Modeling | Code Available | 2 |
| Block Transformer: Global-to-Local Language Modeling for Fast Inference | Jun 4, 2024 | Language Modeling | Code Available | 2 |
| Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow | Jun 3, 2024 | GPU, Language Modeling | Code Available | 2 |
| SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM | Jun 3, 2024 | Decoder, GPU | Code Available | 2 |
| Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer | Jun 3, 2024 | Audio Generation, In-Context Learning | Code Available | 2 |
| GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model | Jun 3, 2024 | geo-localization, Language Modeling | Code Available | 2 |