| Improving accuracy and speeding up Document Image Classification through parallel systems | Jun 16, 2020 | Document Classification, document-image-classification | Code Available | 1 | 5 |
| HEAL: Hierarchical Embedding Alignment Loss for Improved Retrieval and Representation Learning | Dec 5, 2024 | Contrastive Learning, Document Classification | Code Available | 1 | 5 |
| Pre-training technique to localize medical BERT and enhance biomedical BERT | May 14, 2020 | Document Classification, Transfer Learning | Code Available | 1 | 5 |
| Taken by Surprise: Contrast effect for Similarity Scores | Aug 18, 2023 | Classification, Document Classification | Code Available | 1 | 5 |
| Are Large Language Models Ready for Healthcare? A Comparative Study on Clinical Language Understanding | Apr 9, 2023 | Document Classification, named-entity-recognition | Code Available | 1 | 5 |
| A Comparative Study of Pretrained Language Models for Long Clinical Text | Jan 27, 2023 | Clinical Knowledge, Document Classification | Code Available | 1 | 5 |
| Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes | Oct 8, 2024 | Articles, Classification | Code Available | 1 | 5 |
| ANLS* -- A Universal Document Processing Metric for Generative Large Language Models | Feb 6, 2024 | Document Classification | Code Available | 1 | 5 |
| Bioformer: an efficient transformer language model for biomedical text mining | Feb 3, 2023 | Articles, Document Classification | Code Available | 1 | 5 |
| GeoGalactica: A Scientific Large Language Model in Geoscience | Dec 31, 2023 | Document Classification, General Knowledge | Code Available | 1 | 5 |
| Aspect-based Document Similarity for Research Papers | Oct 13, 2020 | Document Classification, Recommendation Systems | Code Available | 1 | 5 |
| German's Next Language Model | Oct 21, 2020 | Benchmarking, Document Classification | Code Available | 1 | 5 |
| Hierarchical Metadata-Aware Document Categorization under Weak Supervision | Oct 26, 2020 | Data Augmentation, Document Classification | Code Available | 1 | 5 |
| Improving Document Classification with Multi-Sense Embeddings | Nov 18, 2019 | Classification, Clustering | Code Available | 1 | 5 |
| Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning | Oct 13, 2015 | Document Classification, Representation Learning | Code Available | 1 | 5 |
| HDLTex: Hierarchical Deep Learning for Text Classification | Sep 24, 2017 | Classification, Deep Learning | Code Available | 1 | 5 |
| Three-level Hierarchical Transformer Networks for Long-sequence and Multiple Clinical Documents Classification | Apr 17, 2021 | Document Classification, General Classification | Code Available | 1 | 5 |
| Can a Fruit Fly Learn Word Embeddings? | Jan 18, 2021 | Document Classification, Word Embeddings | Code Available | 1 | 5 |
| ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths | Jun 12, 2022 | Chunking, Document Classification | Code Available | 1 | 5 |
| HiPool: Modeling Long Documents Using Graph Neural Networks | May 5, 2023 | Document Classification, Sentence | Code Available | 1 | 5 |
| Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences | Jan 27, 2022 | Clinical Knowledge, Document Classification | Code Available | 1 | 5 |
| Classification Benchmarks for Under-resourced Bengali Language based on Multichannel Convolutional-LSTM Network | Apr 11, 2020 | Articles, Classification | Code Available | 1 | 5 |
| Improving Language Understanding by Generative Pre-Training | Jun 11, 2018 | Cloze Test, Document Classification | Code Available | 1 | 5 |
| Keyword Assisted Topic Models | Apr 13, 2020 | Document Classification, Topic Models | Code Available | 1 | 5 |
| Lbl2Vec: An Embedding-Based Approach for Unsupervised Document Retrieval on Predefined Topics | Oct 12, 2022 | Document Classification, Retrieval | Code Available | 1 | 5 |
| ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models | Nov 15, 2023 | Document Classification, Question Answering | Code Available | 1 | 5 |
| Minimally Supervised Categorization of Text with Metadata | May 1, 2020 | Document Classification | Code Available | 1 | 5 |
| LSD-C: Linearly Separable Deep Clusters | Jun 17, 2020 | Clustering, Data Augmentation | Code Available | 1 | 5 |
| SPECTER: Document-level Representation Learning using Citation-informed Transformers | Apr 15, 2020 | Citation Prediction, Document Classification | Code Available | 1 | 5 |
| MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer | Sep 2, 2021 | Cross-Lingual Transfer, Document Classification | Code Available | 1 | 5 |
| MultiFiT: Efficient Multi-lingual Language Model Fine-tuning | Sep 10, 2019 | Cross-Lingual Document Classification, Document Classification | Code Available | 1 | 5 |
| Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs | Oct 15, 2020 | Document Classification, General Classification | Code Available | 1 | 5 |
| DocBERT: BERT for Document Classification | Apr 17, 2019 | Classification, Document Classification | Code Available | 1 | 5 |
| Specialized Document Embeddings for Aspect-based Similarity of Research Papers | Mar 28, 2022 | Document Classification, Recommendation Systems | Code Available | 1 | 5 |
| Multimodal Side-Tuning for Document Classification | Jan 16, 2023 | Classification, Document Classification | Code Available | 1 | 5 |
| GloVe: Global Vectors for Word Representation | Oct 1, 2014 | Document Classification, Information Retrieval | Code Available | 0 | 5 |
| Geometric deep learning on graphs and manifolds using mixture model CNNs | Nov 25, 2016 | Deep Learning, Document Classification | Code Available | 0 | 5 |
| Glyce: Glyph-vectors for Chinese Character Representations | Jan 29, 2019 | Chinese Dependency Parsing, Chinese Named Entity Recognition | Code Available | 0 | 5 |
| A Confidence-Calibrated MOBA Game Winner Predictor | Jun 28, 2020 | Document Classification | Code Available | 0 | 5 |
| A Robust Hybrid Approach for Textual Document Classification | Sep 12, 2019 | BIG-bench Machine Learning, Classification | Code Available | 0 | 5 |
| Generative Topic Embedding: a Continuous Representation of Documents | Aug 1, 2016 | Document Classification, Topic Models | Code Available | 0 | 5 |
| Generalized Sobolev Transport for Probability Measures on a Graph | Feb 7, 2024 | Document Classification, Topological Data Analysis | Code Available | 0 | 5 |
| Generative Topic Embedding: a Continuous Representation of Documents (Extended Version with Proofs) | Jun 9, 2016 | Document Classification, Variational Inference | Code Available | 0 | 5 |
| AraDIC: Arabic Document Classification using Image-Based Character Embeddings and Class-Balanced Loss | Jun 20, 2020 | Classification, Deep Learning | Code Available | 0 | 5 |
| Exploring Topic Coherence over Many Models and Many Topics | Jul 1, 2012 | Document Classification, Information Retrieval | Code Available | 0 | 5 |
| Explainable and Discourse Topic-aware Neural Language Understanding | Jun 18, 2020 | Document Classification, Language Modeling | Code Available | 0 | 5 |
| Corpus-level and Concept-based Explanations for Interpretable Document Classification | Apr 24, 2020 | Classification, Decision Making | Code Available | 0 | 5 |
| Exploring the Relationship Between Algorithm Performance, Vocabulary, and Run-Time in Text Classification | Apr 8, 2021 | Classification, Document Classification | Code Available | 0 | 5 |
| FLAG: Financial Long Document Classification via AMR-based GNN | Oct 2, 2024 | Abstract Meaning Representation, Document Classification | Code Available | 0 | 5 |
| A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors | May 14, 2018 | Document Classification, Domain Adaptation | Code Available | 0 | 5 |