Sailor: Open Language Models for South-East Asia | Apr 4, 2024 | Language Modeling, Language Modelling
Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code Generation, Common Sense Reasoning
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset | May 17, 2024 | 16k, Benchmarking
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Mar 5, 2024 | Image Generation
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment | Jan 23, 2024 | All, Instruction Following
Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering | Sep 3, 2023 | Data Augmentation, Domain Adaptation
Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles
Pre-Training with Whole Word Masking for Chinese BERT | Jun 19, 2019 | Document Classification, General Classification
Harmonizing Visual Text Comprehension and Generation | Jul 23, 2024 | multimodal generation, Reading Comprehension
DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems | Jul 15, 2024 | Language Modelling, Large Language Model
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy | Jun 3, 2024 | Language Modelling, Question Answering
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering | May 20, 2024 | Benchmarking, Question Answering
ST-LLM: Large Language Models Are Effective Temporal Learners | Mar 30, 2024 | MVBench, Reading Comprehension
GPT4Point: A Unified Framework for Point-Language Understanding and Generation | Dec 5, 2023 | 3D Generation, Image Generation
MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens | Oct 3, 2023 | Image Generation, multimodal generation
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants | Aug 31, 2023 | Belebele, Cross-Lingual Transfer
MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model | Apr 3, 2023 | Machine Reading Comprehension, Reading Comprehension
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development | Jan 23, 2023 | Question Answering, Reading Comprehension
PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto Debugging, Code Generation
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms
Learning Dense Representations of Phrases at Scale | Dec 23, 2020 | Open-Domain Question Answering, Question Answering
What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams | Sep 28, 2020 | MedQA, Multiple-choice
DeBERTa: Decoding-enhanced BERT with Disentangled Attention | Jun 5, 2020 | Common Sense Reasoning, Coreference Resolution
CLUE: A Chinese Language Understanding Evaluation Benchmark | Apr 13, 2020 | General Classification, Machine Reading Comprehension
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | Feb 28, 2020 | Knowledge Distillation, Reading Comprehension
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | Sep 17, 2019 | GPU, LAMBADA
DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy | Jul 2, 2025 | Data Augmentation, Generalized Referring Expression Segmentation
ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models | May 25, 2025 | Optical Character Recognition (OCR), Reading Comprehension
Training Language Models to Win Debates with Self-Play Improves Judge Accuracy | Sep 25, 2024 | Language Modeling, Language Modelling
DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning | Sep 2, 2024 | Code Completion, Combinatorial Optimization
Multi-Grained Query-Guided Set Prediction Network for Grounded Multimodal Named Entity Recognition | Jul 17, 2024 | Grounded Multimodal Named Entity Recognition, Machine Reading Comprehension
FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models | Jun 23, 2024 | Memorization, Reading Comprehension
Improving Visual Commonsense in Language Models via Multiple Image Generation | Jun 19, 2024 | Common Sense Reasoning, Image Generation
PMG : Personalized Multimodal Generation with Large Language Models | Apr 7, 2024 | multimodal generation, Reading Comprehension
Latxa: An Open Language Model and Evaluation Suite for Basque | Mar 29, 2024 | Language Modeling, Language Modelling
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages | Mar 26, 2024 | Machine Reading Comprehension, Optical Character Recognition (OCR)
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Mar 26, 2024 | Benchmarking, Machine Reading Comprehension
AC-EVAL: Evaluating Ancient Chinese Language Understanding in Large Language Models | Mar 11, 2024 | Philosophy, Reading Comprehension
PoTeC: A German Naturalistic Eye-tracking-while-reading Corpus | Mar 1, 2024 | Reading Comprehension
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction | Dec 19, 2023 | Language Model Evaluation, Language Modeling
Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions | Dec 5, 2023 | Benchmarking, Conversational Question Answering
Token-Level Adaptation of LoRA Adapters for Downstream Task Generalization | Nov 17, 2023 | ARC, GSM8K
Debate Helps Supervise Unreliable Experts | Nov 15, 2023 | Reading Comprehension
Mirror: A Universal Framework for Various Information Extraction Tasks | Nov 9, 2023 | Machine Reading Comprehension, Reading Comprehension
CreoleVal: Multilingual Multitask Benchmarks for Creoles | Oct 30, 2023 | Machine Translation, Reading Comprehension
MPrompt: Exploring Multi-level Prompt Tuning for Machine Reading Comprehension | Oct 27, 2023 | Machine Reading Comprehension, Reading Comprehension
DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading | Oct 23, 2023 | Document AI, document understanding
In-context Pretraining: Language Modeling Beyond Document Boundaries | Oct 16, 2023 | In-Context Learning, Language Modeling
Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models | Oct 8, 2023 | MMLU, Natural Language Understanding
Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation | Sep 19, 2023 | Language Model Evaluation, Language Modeling