| NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents | Feb 27, 2024 | Document Classification, Language Modeling | Code Available | 1 |
| Prompted Contextual Vectors for Spear-Phishing Detection | Feb 13, 2024 | Document Classification | Code Available | 1 |
| NLP for Knowledge Discovery and Information Extraction from Energetics Corpora | Feb 10, 2024 | Articles, Document Classification | Unverified | 0 |
| Efficient Models for the Detection of Hate, Abuse and Profanity | Feb 8, 2024 | Document Classification, named-entity-recognition | Unverified | 0 |
| Generalized Sobolev Transport for Probability Measures on a Graph | Feb 7, 2024 | Document Classification, Topological Data Analysis | Code Available | 0 |
| ANLS* -- A Universal Document Processing Metric for Generative Large Language Models | Feb 6, 2024 | Document Classification | Code Available | 1 |
| L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages | Jan 4, 2024 | Articles, Classification | Code Available | 1 |
| GeoGalactica: A Scientific Large Language Model in Geoscience | Dec 31, 2023 | Document Classification, General Knowledge | Code Available | 1 |
| Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs | Dec 21, 2023 | Document Classification, Knowledge Graphs | Unverified | 0 |
| A Learning oriented DLP System based on Classification Model | Dec 21, 2023 | Classification, Document Classification | Unverified | 0 |
| MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA | Dec 19, 2023 | Document Classification, Hallucination | Code Available | 0 |
| Large language models in healthcare and medical domain: A review | Dec 12, 2023 | Document Classification, named-entity-recognition | Unverified | 0 |
| Summarization-based Data Augmentation for Document Classification | Dec 1, 2023 | Classification, Data Augmentation | Code Available | 0 |
| SUT: a new multi-purpose synthetic dataset for Farsi document image analysis | Nov 27, 2023 | Document Classification, document-image-classification | Code Available | 0 |
| Learning Section Weights for Multi-Label Document Classification | Nov 26, 2023 | Articles, Classification | Unverified | 0 |
| Causality is all you need | Nov 21, 2023 | All, Document Classification | Unverified | 0 |
| ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science | Nov 21, 2023 | Document Classification, Graph Neural Network | Unverified | 0 |
| ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models | Nov 15, 2023 | Document Classification, Question Answering | Code Available | 1 |
| Explainable Text Classification Techniques in Legal Document Review: Locating Rationales without Using Human Annotated Training Text Snippets | Nov 15, 2023 | Document Classification, text-classification | Unverified | 0 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification | Oct 25, 2023 | Classification, Cross-Lingual Transfer | Unverified | 0 |
| Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents | Oct 25, 2023 | All, Document Classification | Unverified | 0 |
| Optimal Transport for Measures with Noisy Tree Metric | Oct 20, 2023 | Document Classification, Topological Data Analysis | Code Available | 0 |
| BibRank: Automatic Keyphrase Extraction Platform Using Metadata | Oct 13, 2023 | Clustering, Document Classification | Code Available | 0 |
| An Analysis on Large Language Models in Healthcare: A Case Study of BioBERT | Oct 11, 2023 | Document Classification, Information Retrieval | Unverified | 0 |
| KoBigBird-large: Transformation of Transformer for Korean Language Understanding | Sep 19, 2023 | Document Classification, Question Answering | Unverified | 0 |
| Beyond Document Page Classification: Design, Datasets, and Challenges | Aug 24, 2023 | Benchmarking, Classification | Code Available | 0 |
| Feature Extraction Using Deep Generative Models for Bangla Text Classification on a New Comprehensive Dataset | Aug 21, 2023 | Document Classification, Generative Adversarial Network | Unverified | 0 |
| Taken by Surprise: Contrast effect for Similarity Scores | Aug 18, 2023 | Classification, Document Classification | Code Available | 1 |
| Accelerated materials language processing enabled by GPT | Aug 18, 2023 | Document Classification, Extractive Question-Answering | Unverified | 0 |
| Large Language Model Prompt Chaining for Long Legal Document Classification | Aug 8, 2023 | Document Classification, In-Context Learning | Unverified | 0 |
| LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning | Jul 30, 2023 | Android Malware Detection, Classification | Unverified | 0 |
| Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs | Jul 27, 2023 | Document Classification, Knowledge Distillation | Unverified | 0 |
| UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition | Jul 20, 2023 | Document Classification, named-entity-recognition | Unverified | 0 |
| Can Model Fusing Help Transformers in Long Document Classification? An Empirical Study | Jul 18, 2023 | Classification, Document Classification | Code Available | 0 |
| Attention over pre-trained Sentence Embeddings for Long Document Classification | Jul 18, 2023 | Document Classification, Sentence | Unverified | 0 |
| MDACE: MIMIC Documents Annotated with Code Evidence | Jul 7, 2023 | Document Classification, Extreme Multi-Label Classification | Code Available | 0 |
| Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts | Jul 5, 2023 | Document Classification, Sentiment Analysis | Unverified | 0 |
| On Evaluation of Document Classification using RVL-CDIP | Jun 21, 2023 | Benchmarking, Classification | Unverified | 0 |
| Weakly-Supervised Scientific Document Classification via Retrieval-Augmented Multi-Stage Training | Jun 12, 2023 | Document Classification, Retrieval | Code Available | 1 |
| Evaluation of ChatGPT on Biomedical Tasks: A Zero-Shot Comparison with Fine-Tuned Generative Transformers | Jun 7, 2023 | Document Classification, Language Modeling | Unverified | 0 |
| Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents | Jun 5, 2023 | Denoising, Document Classification | Unverified | 0 |
| End-to-End Document Classification and Key Information Extraction using Assignment Optimization | Jun 1, 2023 | Classification, Document Classification | Unverified | 0 |
| GVdoc: Graph-based Visual Document Classification | May 26, 2023 | Classification, Document Classification | Code Available | 0 |
| Neural Natural Language Processing for Long Texts: A Survey on Classification and Summarization | May 25, 2023 | Document Classification, Document Summarization | Unverified | 0 |
| DUBLIN -- Document Understanding By Language-Image Network | May 23, 2023 | Document Classification, document understanding | Unverified | 0 |
| DLUE: Benchmarking Document Language Understanding | May 16, 2023 | Benchmarking, Document Classification | Unverified | 0 |
| CWTM: Leveraging Contextualized Word Embeddings from BERT for Neural Topic Modeling | May 16, 2023 | Document Classification, Language Modelling | Code Available | 0 |
| A General-Purpose Multilingual Document Encoder | May 11, 2023 | Cross-Lingual Transfer, Document Classification | Code Available | 0 |
| Benchmarking large language models for biomedical natural language processing applications and recommendations | May 10, 2023 | Benchmarking, Document Classification | Code Available | 1 |
| HiPool: Modeling Long Documents Using Graph Neural Networks | May 5, 2023 | Document Classification, Sentence | Code Available | 1 |