| Tsetlin Machine Embedding: Representing Words Using Logical Expressions | Jan 2, 2023 | Document Classification, Machine Translation | Code Available | 1 |
| HDLTex: Hierarchical Deep Learning for Text Classification | Sep 24, 2017 | Classification, Deep Learning | Code Available | 1 |
| Pre-training technique to localize medical BERT and enhance biomedical BERT | May 14, 2020 | Document Classification, Transfer Learning | Code Available | 1 |
| Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem | May 4, 2022 | Document Classification, Traveling Salesman Problem | Code Available | 1 |
| DocBERT: BERT for Document Classification | Apr 17, 2019 | Classification, Document Classification | Code Available | 1 |
| A Comparative Study of Pretrained Language Models for Long Clinical Text | Jan 27, 2023 | Clinical Knowledge, Document Classification | Code Available | 1 |
| Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences | Jan 27, 2022 | Clinical Knowledge, Document Classification | Code Available | 1 |
| Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT | Apr 19, 2019 | Cross-Lingual NER, Cross-Lingual Transfer | Code Available | 1 |
| Document Classification for COVID-19 Literature | Jun 15, 2020 | Articles, Classification | Code Available | 1 |
| Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles | Mar 22, 2020 | Articles, Document Classification | Code Available | 1 |
| A Sentence-level Hierarchical BERT Model for Document Classification with Limited Labelled Data | Jun 12, 2021 | Classification, Document Classification | Code Available | 1 |
| NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents | Feb 27, 2024 | Document Classification, Language Modeling | Code Available | 1 |
| ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models | Nov 15, 2023 | Document Classification, Question Answering | Code Available | 1 |
| Aspect-based Document Similarity for Research Papers | Oct 13, 2020 | Document Classification, Recommendation Systems | Code Available | 1 |
| Data Programming by Demonstration: A Framework for Interactively Learning Labeling Functions | Sep 3, 2020 | Document Classification | Code Available | 1 |
| SPECTER: Document-level Representation Learning using Citation-informed Transformers | Apr 15, 2020 | Citation Prediction, Document Classification | Code Available | 1 |
| HEAL: Hierarchical Embedding Alignment Loss for Improved Retrieval and Representation Learning | Dec 5, 2024 | Contrastive Learning, Document Classification | Code Available | 1 |
| Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes | Oct 8, 2024 | Articles, Classification | Code Available | 1 |
| GeoGalactica: A Scientific Large Language Model in Geoscience | Dec 31, 2023 | Document Classification, General Knowledge | Code Available | 1 |
| German's Next Language Model | Oct 21, 2020 | Benchmarking, Document Classification | Code Available | 1 |
| Semi-Supervised Classification with Graph Convolutional Networks | Sep 9, 2016 | Document Classification, Drug Discovery | Code Available | 1 |
| Hierarchical Metadata-Aware Document Categorization under Weak Supervision | Oct 26, 2020 | Data Augmentation, Document Classification | Code Available | 1 |
| Hierarchical Transformers for Long Document Classification | Oct 23, 2019 | Classification, Document Classification | Code Available | 1 |
| HiPool: Modeling Long Documents Using Graph Neural Networks | May 5, 2023 | Document Classification, Sentence | Code Available | 1 |
| Bioformer: an efficient transformer language model for biomedical text mining | Feb 3, 2023 | Articles, Document Classification | Code Available | 1 |
| Improving Document Classification with Multi-Sense Embeddings | Nov 18, 2019 | Classification, Clustering | Code Available | 1 |
| Keyword Assisted Topic Models | Apr 13, 2020 | Document Classification, Topic Models | Code Available | 1 |
| L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages | Jan 4, 2024 | Articles, Classification | Code Available | 1 |
| LSD-C: Linearly Separable Deep Clusters | Jun 17, 2020 | Clustering, Data Augmentation | Code Available | 1 |
| MAGNET: Multi-Label Text Classification using Attention-based Graph Neural Network | Feb 24, 2020 | Document Classification, General Classification | Code Available | 1 |
| Massively Multilingual Sparse Word Representations | May 1, 2020 | Dependency Parsing, Document Classification | Code Available | 1 |
| Minimally Supervised Categorization of Text with Metadata | May 1, 2020 | Document Classification | Code Available | 1 |
| Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning | Oct 13, 2015 | Document Classification, Representation Learning | Code Available | 1 |
| Can a Fruit Fly Learn Word Embeddings? | Jan 18, 2021 | Document Classification, Word Embeddings | Code Available | 1 |
| Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs | Oct 15, 2020 | Document Classification, General Classification | Code Available | 1 |
| A Simple Approach to Learning Unsupervised Multilingual Embeddings | Apr 10, 2020 | Bilingual Lexicon Induction, Dependency Parsing | Unverified | 0 |
| A Multi-task Approach to Learning Multilingual Representations | Jul 1, 2018 | Cross-Lingual Document Classification, Document Classification | Unverified | 0 |
| A Semi-supervised Approach for Natural Language Call Routing | Aug 1, 2013 | Document Classification, Information Retrieval | Unverified | 0 |
| A semi-automatic method for document classification in the shipping industry | Mar 29, 2023 | Classification, Document Classification | Unverified | 0 |
| A Multiplicative Model for Learning Distributed Text-Based Attribute Representations | Jun 10, 2014 | Attribute, Authorship Attribution | Unverified | 0 |
| A Semantic Cover Approach for Topic Modeling | Jun 1, 2019 | Document Classification, General Classification | Unverified | 0 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification | Oct 25, 2023 | Classification, Cross-Lingual Transfer | Unverified | 0 |
| A Methodology for Evaluating Timeline Generation Algorithms based on Deep Semantic Units | Jul 1, 2015 | Document Classification | Unverified | 0 |
| Argument Component Classification by Relation Identification by Neural Network and TextRank | Aug 1, 2019 | Articles, Classification | Unverified | 0 |
| Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs | Nov 9, 2024 | Classification, Clustering | Unverified | 0 |
| Clustering of Multi-Word Named Entity variants: Multilingual Evaluation | May 1, 2014 | Clustering, Document Classification | Unverified | 0 |
| A Leveled Reading Corpus of Modern Standard Arabic | May 1, 2018 | Document Classification, Machine Translation | Unverified | 0 |
| A Comparative Study of Conversion Aided Methods for WordNet Sentence Textual Similarity | Aug 1, 2014 | Document Classification, Machine Translation | Unverified | 0 |
| Approximate Conditional Coverage & Calibration via Neural Model Approximations | May 28, 2022 | Classification, Document Classification | Unverified | 0 |
| Applying Naive Bayes Classification to Google Play Apps Categorization | Aug 30, 2016 | Classification, Document Classification | Unverified | 0 |