| Structured Object Language Modeling (SoLM): Native Structured Objects Generation Conforming to Complex Schemas with Self-Supervised Denoising | Nov 28, 2024 | Denoising, Language Modeling | Unverified | 0 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 |
| Functionality understanding and segmentation in 3D scenes | Nov 25, 2024 | AI Agent, Language Modeling | Unverified | 0 |
| PIANIST: Learning Partially Observable World Models with LLMs for Multi-Agent Decision Making | Nov 24, 2024 | Decision Making, World Knowledge | Unverified | 0 |
| GOT4Rec: Graph of Thoughts for Sequential Recommendation | Nov 22, 2024 | General Knowledge, Sequential Recommendation | Unverified | 0 |
| LEADRE: Multi-Faceted Knowledge Enhanced LLM Empowered Display Advertisement Recommender System | Nov 21, 2024 | Learning-To-Rank, Prompt Engineering | Unverified | 0 |
| GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping | Nov 19, 2024 | Common Sense Reasoning, Human-Object Interaction Detection | Unverified | 0 |
| Past, Present, and Future of Sensor-Based Human Activity Recognition Using Wearables: A Surveying Tutorial on a Still Challenging Task | Nov 11, 2024 | Activity Recognition, Human Activity Recognition | Unverified | 0 |
| LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Nov 7, 2024 | Contrastive Learning, Image Captioning | Code Available | 4 |
| Gradient Localization Improves Lifelong Pretraining of Language Models | Nov 7, 2024 | Continual Learning, World Knowledge | Unverified | 0 |
| Vision Language Models are In-Context Value Learners | Nov 7, 2024 | In-Context Learning, World Knowledge | Unverified | 0 |
| Pre-trained Visual Dynamics Representations for Efficient Policy Learning | Nov 5, 2024 | Reinforcement Learning (RL), Video Prediction | Unverified | 0 |
| ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model | Nov 4, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| On the Exploration of LM-Based Soft Modular Robot Design | Nov 1, 2024 | World Knowledge | Unverified | 0 |
| Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation | Nov 1, 2024 | Epidemiology, Knowledge Distillation | Unverified | 0 |
| EMMA: End-to-End Multimodal Model for Autonomous Driving | Oct 30, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| GRADE: Quantifying Sample Diversity in Text-to-Image Models | Oct 29, 2024 | Attribute, Diversity | Unverified | 0 |
| ADAM: An Embodied Causal Agent in Open-World Environments | Oct 29, 2024 | Lifelong learning, Minecraft | Unverified | 0 |
| Learning and Unlearning of Fabricated Knowledge in Language Models | Oct 29, 2024 | Data Poisoning, Language Modeling | Unverified | 0 |
| ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval | Oct 24, 2024 | Image Retrieval, Retrieval | Code Available | 0 |
| All Entities are Not Created Equal: Examining the Long Tail for Fine-Grained Entity Typing | Oct 22, 2024 | All, Entity Typing | Unverified | 0 |
| Rulebreakers Challenge: Revealing a Blind Spot in Large Language Models' Reasoning with Formal Logic | Oct 21, 2024 | Formal Logic, World Knowledge | Unverified | 0 |
| Roadmap towards Superhuman Speech Understanding using Large Language Models | Oct 17, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Comprehending Knowledge Graphs with Large Language Models for Recommender Systems | Oct 16, 2024 | Knowledge-Aware Recommendation, Knowledge Graphs | Unverified | 0 |
| Understanding the Role of LLMs in Multimodal Evaluation Benchmarks | Oct 16, 2024 | Benchmarking, Large Language Model | Code Available | 0 |