| Can Reasoning LLMs Enhance Clinical Document Classification? | Apr 10, 2025 | Classification, Document Classification | Code Available | 0 |
| Text Chunking for Document Classification for Urban System Management using Large Language Models | Mar 31, 2025 | Chunking, Document Classification | Code Available | 0 |
| Evaluating Negative Sampling Approaches for Neural Topic Models | Mar 23, 2025 | Document Classification, Topic Models | Code Available | 0 |
| Converting Transformers into DGNNs Form | Feb 1, 2025 | Computational Efficiency, Document Classification | Code Available | 0 |
| Cross-Entropy Attacks to Language Models via Rare Event Simulation | Jan 21, 2025 | Document Classification, Saliency Ranking | Code Available | 0 |
| On Importance of Layer Pruning for Smaller BERT Models and Low Resource Languages | Jan 1, 2025 | Classification, Document Classification | Unverified | 0 |
| Data-Driven Self-Supervised Graph Representation Learning | Dec 24, 2024 | Data Augmentation, Document Classification | Code Available | 0 |
| Extreme Multi-label Completion for Semantic Document Labelling with Taxonomy-Aware Parallel Learning | Dec 18, 2024 | Document Classification, Missing Labels | Unverified | 0 |
| Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models | Dec 18, 2024 | Document Classification, document-image-classification | Unverified | 0 |
| Label Errors in the Tobacco3482 Dataset | Dec 17, 2024 | Document Classification, valid | Code Available | 0 |
| WordVIS: A Color Worth A Thousand Words | Dec 13, 2024 | Document Classification | Unverified | 0 |
| Can Large Language Models Serve as Effective Classifiers for Hierarchical Multi-Label Classification of Scientific Documents at Industrial Scale? | Dec 6, 2024 | Classification, Document Classification | Unverified | 0 |
| HEAL: Hierarchical Embedding Alignment Loss for Improved Retrieval and Representation Learning | Dec 5, 2024 | Contrastive Learning, Document Classification | Code Available | 1 |
| Language Model Meets Prototypes: Towards Interpretable Text Classification Models through Prototypical Networks | Dec 4, 2024 | Classification, Contrastive Learning | Unverified | 0 |
| Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts | Nov 27, 2024 | Document AI, Document Classification | Unverified | 0 |
| Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs | Nov 9, 2024 | Classification, Clustering | Unverified | 0 |
| Weakly-supervised diagnosis identification from Italian discharge letters | Oct 19, 2024 | Document Classification, text-classification | Unverified | 0 |
| Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data | Oct 19, 2024 | Document Classification, Graph Attention | Unverified | 0 |
| ChuLo: Chunk-Level Key Information Representation for Long Document Processing | Oct 14, 2024 | Chunking, Classification | Code Available | 0 |
| Text Classification using Graph Convolutional Networks: A Comprehensive Survey | Oct 12, 2024 | Classification, Document Classification | Unverified | 0 |
| Orthogonal Nonnegative Matrix Factorization with the Kullback-Leibler divergence | Oct 10, 2024 | Document Classification | Code Available | 0 |
| Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes | Oct 8, 2024 | Articles, Classification | Code Available | 1 |
| Manual Verbalizer Enrichment for Few-Shot Text Classification | Oct 8, 2024 | Benchmarking, Classification | Unverified | 0 |
| Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification | Oct 3, 2024 | Document Classification, Graph Attention | Unverified | 0 |
| FLAG: Financial Long Document Classification via AMR-based GNN | Oct 2, 2024 | Abstract Meaning Representation, Document Classification | Code Available | 0 |
| Document Type Classification using File Names | Oct 2, 2024 | Classification, Document Classification | Unverified | 0 |
| On Importance of Pruning and Distillation for Efficient Low Resource NLP | Sep 21, 2024 | Document Classification, GPU | Unverified | 0 |
| SubRegWeigh: Effective and Efficient Annotation Weighing with Subword Regularization | Sep 10, 2024 | Document Classification, named-entity-recognition | Code Available | 0 |
| Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification | Aug 20, 2024 | Document AI, Document Classification | Code Available | 0 |
| AutoML-guided Fusion of Entity and LLM-based Representations for Document Classification | Aug 19, 2024 | AutoML, Classification | Code Available | 0 |
| Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification | Aug 13, 2024 | Classification, Document Classification | Code Available | 0 |
| Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian | Jul 30, 2024 | Document Classification, Entity Typing | Unverified | 0 |
| An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry | Jul 19, 2024 | Document Classification, Keyword Extraction | Code Available | 0 |
| Hierarchical Multi-modal Transformer for Cross-modal Long Document Classification | Jul 14, 2024 | Document Classification, Sentence | Unverified | 0 |
| Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine | Jul 14, 2024 | Decision Making, Document Classification | Unverified | 0 |
| SuperGLEBer: German Language Understanding Evaluation Benchmark | Jun 20, 2024 | Document Classification, Natural Language Understanding | Code Available | 1 |
| DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models | Jun 17, 2024 | Document Classification, Visual Grounding | Code Available | 3 |
| Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification | Jun 3, 2024 | Document Classification | Unverified | 0 |
| Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification | May 29, 2024 | Document Classification | Unverified | 0 |
| Evaluation of large language model performance on the Biomedical Language Understanding and Reasoning Benchmark | May 17, 2024 | Document Classification, Language Modeling | Unverified | 0 |
| Length-Aware Multi-Kernel Transformer for Long Document Classification | May 11, 2024 | Document Classification, Sentence | Code Available | 0 |
| Improving Long Text Understanding with Knowledge Distilled from Summarization Model | May 8, 2024 | Abstractive Text Summarization, Document Classification | Unverified | 0 |
| CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification | May 6, 2024 | Document Classification, document-image-classification | Unverified | 0 |
| Machine Unlearning for Document Classification | Apr 29, 2024 | Classification, Document Classification | Code Available | 0 |
| L3Cube-MahaNews: News-based Short Text and Long Document Classification Datasets in Marathi | Apr 28, 2024 | Articles, Document Classification | Code Available | 0 |
| GuideWalk: A Novel Graph-Based Word Embedding for Enhanced Text Classification | Apr 25, 2024 | Classification, Document Classification | Unverified | 0 |
| BuDDIE: A Business Document Dataset for Multi-task Information Extraction | Apr 5, 2024 | Document Classification, document understanding | Unverified | 0 |
| Developing Healthcare Language Model Embedding Spaces | Mar 28, 2024 | Contrastive Learning, Document Classification | Unverified | 0 |
| Visually Guided Generative Text-Layout Pre-training for Document Intelligence | Mar 25, 2024 | Document Classification, document understanding | Code Available | 2 |
| NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents | Feb 27, 2024 | Document Classification, Language Modeling | Code Available | 1 |