| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | May 16, 2024 | In-Context Learning, Question Answering | Code Available | 7 | 5 |
| Mistral 7B | Oct 10, 2023 | answerability prediction, Arithmetic Reasoning | Code Available | 6 | 5 |
| Direct Preference Optimization: Your Language Model is Secretly a Reward Model | May 29, 2023 | Language Modeling, Language Modelling | Code Available | 6 | 5 |
| ChatDBG: Augmenting Debugging with Large Language Models | Mar 25, 2024 | C++ code, Navigate | Code Available | 5 | 5 |
| ShareGPT4Video: Improving Video Understanding and Generation with Better Captions | Jun 6, 2024 | Video Captioning, Video Generation | Code Available | 5 | 5 |
| PointVLA: Injecting the 3D World into Vision-Language-Action Models | Mar 10, 2025 | Imitation Learning, Spatial Reasoning | Code Available | 4 | 5 |
| VILA: On Pre-training for Visual Language Models | Dec 12, 2023 | In-Context Learning, Language Modelling | Code Available | 4 | 5 |
| WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation | Mar 10, 2025 | Common Sense Reasoning, Image Generation | Code Available | 4 | 5 |
| LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Nov 7, 2024 | Contrastive Learning, Image Captioning | Code Available | 4 | 5 |
| LISA: Reasoning Segmentation via Large Language Model | Aug 1, 2023 | Language Modeling, Language Modelling | Code Available | 4 | 5 |
| V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs | Jan 1, 2024 | Visual Grounding, World Knowledge | Code Available | 4 | 5 |
| Text2SQL is Not Enough: Unifying AI and Databases with TAG | Aug 27, 2024 | RAG, Retrieval-augmented Generation | Code Available | 4 | 5 |
| Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | May 22, 2020 | Fact Verification, Question Answering | Code Available | 4 | 5 |
| HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling | Sep 19, 2024 | Large Language Model, Recommendation Systems | Code Available | 4 | 5 |
| LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Jun 28, 2024 | Vision-Language-Action, World Knowledge | Code Available | 3 | 5 |
| GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views | Apr 2, 2024 | 3DGS, Novel View Synthesis | Code Available | 3 | 5 |
| Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap | Jan 3, 2025 | Recommendation Systems, World Knowledge | Code Available | 3 | 5 |
| HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation | Jan 24, 2025 | Autonomous Driving, Language Modeling | Code Available | 3 | 5 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Jun 16, 2025 | Action Generation, Autonomous Driving | Code Available | 3 | 5 |
| How Can Recommender Systems Benefit from Large Language Models: A Survey | Jun 9, 2023 | Ethics, Feature Engineering | Code Available | 3 | 5 |
| Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation | Apr 15, 2024 | Contrastive Learning, Descriptive | Code Available | 3 | 5 |
| Unified Source-Free Domain Adaptation | Mar 12, 2024 | Domain Adaptation, Language Modelling | Code Available | 3 | 5 |
| Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models | Sep 3, 2023 | Hallucination, World Knowledge | Code Available | 3 | 5 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 | 5 |
| Are We on the Right Way for Evaluating Large Vision-Language Models? | Mar 29, 2024 | World Knowledge | Code Available | 3 | 5 |
| DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge | Jul 6, 2025 | Image Generation, Multimodal Reasoning | Code Available | 3 | 5 |
| GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation | Apr 3, 2025 | Image Generation, World Knowledge | Code Available | 3 | 5 |
| Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks | Aug 7, 2024 | Attribute, In-Context Learning | Code Available | 2 | 5 |
| CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Jun 7, 2024 | Instruction Following, Math | Code Available | 2 | 5 |
| PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change | Jun 21, 2022 | Common Sense Reasoning, Diversity | Code Available | 2 | 5 |
| One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Sep 29, 2024 | All, Image Segmentation | Code Available | 2 | 5 |
| ConTextTab: A Semantics-Aware Tabular In-Context Learner | Jun 12, 2025 | In-Context Learning, World Knowledge | Code Available | 2 | 5 |
| On Softmax Direct Preference Optimization for Recommendation | Jun 13, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Agent Planning with World Knowledge Model | May 23, 2024 | model, World Knowledge | Code Available | 2 | 5 |
| MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Dec 19, 2024 | MMLU, Multiple-choice | Code Available | 2 | 5 |
| MeaCap: Memory-Augmented Zero-shot Image Captioning | Mar 6, 2024 | Caption Generation, Image Captioning | Code Available | 2 | 5 |
| Measuring Massive Multitask Language Understanding | Sep 7, 2020 | Elementary Mathematics, Multi-task Language Understanding | Code Available | 2 | 5 |
| Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models | May 24, 2024 | Common Sense Reasoning, Language Modelling | Code Available | 2 | 5 |
| RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit | Jun 8, 2023 | Answer Generation, Fact Checking | Code Available | 2 | 5 |
| LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments | Jun 24, 2024 | World Knowledge | Code Available | 2 | 5 |
| HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver | Jan 1, 2025 | Reasoning Segmentation, Segmentation | Code Available | 2 | 5 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 | 5 |
| Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents | Jan 18, 2022 | Robot Task Planning, World Knowledge | Code Available | 2 | 5 |
| KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge | May 26, 2024 | Graph Embedding, Informativeness | Code Available | 2 | 5 |
| ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human | Apr 16, 2023 | World Knowledge | Code Available | 2 | 5 |
| GreaseLM: Graph REASoning Enhanced Language Models for Question Answering | Jan 21, 2022 | Knowledge Graphs, Medical Question Answering | Code Available | 2 | 5 |
| Grasp-Anything: Large-scale Grasp Dataset from Foundation Models | Sep 18, 2023 | Diversity, Robotic Grasping | Code Available | 2 | 5 |
| Language Representations Can be What Recommenders Need: Findings and Potentials | Jul 7, 2024 | Collaborative Filtering, Contrastive Learning | Code Available | 2 | 5 |
| A Synthetic Dataset for Personal Attribute Inference | Jun 11, 2024 | Attribute, Author Profiling | Code Available | 2 | 5 |