Multimodal Large Language Models (MLLMs): From Theory to Practice
Neemias da Silva, Júlio C. W. Scholz, John Harrison, Marina Borges, Paulo Ávila, Frances A. Santos, Myriam Delgado, Rodrigo Minetto, Thiago H. Silva
Abstract
Multimodal Large Language Models (MLLMs) combine the natural language understanding and generation capabilities of LLMs with perception skills in modalities such as image and audio, representing a key advancement in contemporary AI. This chapter presents the fundamentals of MLLMs and surveys emblematic models. It also explores practical techniques for preprocessing, prompt engineering, and building multimodal pipelines with LangChain and LangGraph. For further practical study, supplementary material is publicly available online: https://github.com/neemiasbsilva/MLLMs-Teoria-e-Pratica. Finally, the chapter discusses open challenges and highlights promising trends.