Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding May 22, 2025 Language Modeling Language Modelling
CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation May 20, 2025 Code Generation Language Modeling
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning May 19, 2025 Language Modeling Language Modelling
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space May 19, 2025 Language Modeling Language Modelling
SLOT: Sample-specific Language Model Optimization at Test-time May 18, 2025 GSM8K Language Modeling
LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners May 17, 2025 Language Modeling Language Modelling
Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents May 17, 2025 Language Modeling Language Modelling
WorldPM: Scaling Human Preference Modeling May 15, 2025 Language Modeling Language Modelling
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement May 13, 2025 Benchmarking Language Modeling
Behind Maya: Building a Multilingual Vision Language Model May 13, 2025 Language Modeling Language Modelling
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation May 12, 2025 Language Modeling Language Modelling
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance May 11, 2025 Language Modeling Language Modelling
MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents May 4, 2025 Language Modeling Language Modelling
RWKV-X: A Linear Complexity Hybrid Language Model Apr 30, 2025 Language Modeling Language Modelling
Towards Practical Second-Order Optimizers in Deep Learning: Insights from Fisher Information Analysis Apr 26, 2025 Computational Efficiency image-classification
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer Apr 14, 2025 Language Modeling Language Modelling
ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model Apr 13, 2025 Diagnostic Language Modeling
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation Apr 13, 2025 Domain Adaptation Language Modeling
SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model Apr 13, 2025 Language Modeling Language Modelling
PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models Apr 11, 2025 Clustering Language Modeling
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation Apr 10, 2025 Contrastive Learning Language Modeling
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Apr 9, 2025 Language Modeling Language Modelling
Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation Apr 3, 2025 Computational Efficiency GPU
Unicorn: Text-Only Data Synthesis for Vision Language Model Training Mar 28, 2025 Language Modeling Language Modelling
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model Mar 27, 2025 EgoSchema Language Modeling
Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector Mar 26, 2025 Binary Classification DeepFake Detection
Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis Mar 25, 2025 Contrastive Learning Image-text Retrieval
MC-LLaVA: Multi-Concept Personalized Vision-Language Model Mar 24, 2025 Language Modeling Language Modelling
CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities Mar 21, 2025 Language Modeling Language Modelling
FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models Mar 21, 2025 Language Modeling Language Modelling
Modifying Large Language Model Post-Training for Diverse Creative Writing Mar 21, 2025 Diversity Language Modeling
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model Mar 20, 2025 Language Modeling Language Modelling
VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning Mar 19, 2025 Benchmarking Language Modeling
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels Mar 18, 2025 GPU Language Modeling
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling Mar 17, 2025 GPU Language Modeling
Generative Modeling for Mathematical Discovery Mar 14, 2025 Language Modeling Language Modelling
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding Mar 13, 2025 Diversity Language Modeling
OR-LLM-Agent: Automating Modeling and Solving of Operations Research Optimization Problem with Reasoning Large Language Model Mar 13, 2025 AI Agent Language Modeling
Mellow: a small audio language model for reasoning Mar 11, 2025 Audio captioning Language Modeling
LongProLIP: A Probabilistic Vision-Language Model with Long Context Text Mar 11, 2025 Language Modeling Language Modelling
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning Mar 10, 2025 Language Modeling Language Modelling
DiffCLIP: Differential Attention Meets CLIP Mar 9, 2025 Language Modeling Language Modelling
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model Mar 8, 2025 Image Quality Assessment Language Modeling
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval Mar 7, 2025 Information Retrieval Language Modeling
PromptPex: Automatic Test Generation for Language Model Prompts Mar 7, 2025 Language Modeling Language Modelling
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM Mar 6, 2025 Anomaly Detection Language Modeling
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant Mar 6, 2025 Language Modeling Language Modelling
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model Mar 6, 2025 General Knowledge Image Captioning
Scaling Rich Style-Prompted Text-to-Speech Datasets Mar 6, 2025 Language Modeling Language Modelling
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities Mar 6, 2025 Language Modeling Language Modelling