SOTAVerified

Chunking

Chunking, also known as shallow parsing, identifies contiguous, non-overlapping spans of tokens that form syntactic units such as noun phrases or verb phrases.

Example:

| Vinken | , | 61 | years | old |
| --- | --- | --- | --- | --- |
| B-NP | I-NP | I-NP | I-NP | I-NP |
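
In the BIO scheme used in this example, a `B-X` tag opens a chunk of type `X`, an `I-X` tag continues it, and `O` marks tokens outside any chunk. The following is a minimal sketch of how such a tag sequence could be decoded into spans. The function name `bio_to_spans` and the `(type, start, end)` span layout are illustrative choices, not part of any particular toolkit, and a stray `I-X` tag is leniently treated as opening a new chunk.

```python
from typing import List, Optional, Tuple

# (chunk type, start index, end index exclusive); layout is an assumption
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence into a list of typed chunk spans."""
    spans: List[Span] = []
    start: Optional[int] = None
    label: Optional[str] = None
    # A trailing "O" sentinel closes a chunk that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, kind = tag.partition("-")
        # I-X directly after an open chunk of the same type continues it.
        if prefix == "I" and label == kind:
            continue
        # Anything else ends the currently open chunk, if there is one.
        if label is not None:
            spans.append((label, start, i))
            start, label = None, None
        # B-X opens a chunk; a stray I-X is leniently treated the same way.
        if prefix in ("B", "I"):
            start, label = i, kind
    return spans


tokens = ["Vinken", ",", "61", "years", "old"]
tags = ["B-NP", "I-NP", "I-NP", "I-NP", "I-NP"]
for kind, begin, end in bio_to_spans(tags):
    print(kind, tokens[begin:end])
```

Here the five tokens decode to a single noun-phrase chunk covering the whole sequence.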

Papers

Showing 51-100 of 447 papers

| Title | Status | Hype |
| --- | --- | --- |
| SEER: Self-Aligned Evidence Extraction for Retrieval-Augmented Generation | Code | 0 |
| Reconstructing Context: Evaluating Advanced Chunking Strategies for Retrieval-Augmented Generation | Code | 0 |
| ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs | Code | 0 |
| Punctuation Restoration Improves Structure Understanding Without Supervision | Code | 0 |
| Opening the black box of language acquisition | Code | 0 |
| Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning | Code | 0 |
| Query-Based Keyphrase Extraction from Long Documents | Code | 0 |
| Open Information Extraction via Chunks | Code | 0 |
| Augmenting Neural Networks with First-order Logic | Code | 0 |
| Chunking Historical German | Code | 0 |
| NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering | Code | 0 |
| Neural Models for Sequence Chunking | Code | 0 |
| A Tree Search Algorithm for Sequence Labeling | Code | 0 |
| Neural Sequence Segmentation as Determining the Leftmost Segments | Code | 0 |
| NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit | Code | 0 |
| Semi-supervised sequence tagging with bidirectional language models | Code | 0 |
| Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation | Code | 0 |
| A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks | Code | 0 |
| LLM-TA: An LLM-Enhanced Thematic Analysis Pipeline for Transcripts from Parents of Children with Congenital Heart Disease | Code | 0 |
| AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation | Code | 0 |
| Named Entity Recognition in Tweets: An Experimental Study | Code | 0 |
| Large-scale image segmentation based on distributed clustering algorithms | Code | 0 |
| Keystroke dynamics as signal for shallow syntactic parsing | Code | 0 |
| Language-Agnostic Syllabification with Neural Sequence Labeling | Code | 0 |
| J2N -- Nominal Adjective Identification and its Application | Code | 0 |
| Building Odia Shallow Parser | Code | 0 |
| Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking | Code | 0 |
| Natural Language Processing (almost) from Scratch | Code | 0 |
| Boundary-based MWE segmentation with text partitioning | Code | 0 |
| FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP | Code | 0 |
| Integrating Supertag Features into Neural Discontinuous Constituent Parsing | Code | 0 |
| CAG: Chunked Augmented Generation for Google Chrome's Built-in Gemini Nano | Code | 0 |
| Financial Report Chunking for Effective Retrieval Augmented Generation | Code | 0 |
| KidneyTalk-open: No-code Deployment of a Private Large Language Model with Medical Documentation-Enhanced Knowledge Database for Kidney Disease | Code | 0 |
| A Feature-Rich Vietnamese Named-Entity Recognition Model | Code | 0 |
| Large scale visual place recognition with sub-linear storage growth | Code | 0 |
| Fine-Grained Error Analysis and Fair Evaluation of Labeled Spans | Code | 0 |
| FlexChunk: Enabling 100M×100M Out-of-Core SpMV (~1.8 min, ~1.7 GB RAM) with Near-Linear Scaling | Code | 0 |
| Experiential Explanations for Reinforcement Learning | Code | 0 |
| Evaluation of Word Vector Representations by Subspace Alignment | Code | 0 |
| Dynamic Chunking and Selection for Reading Comprehension of Ultra-Long Context in Large Language Models | Code | 0 |
| Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data? | Code | 0 |
| Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study | Code | 0 |
| ChuLo: Chunk-Level Key Information Representation for Long Document Processing | Code | 0 |
| Gated Task Interaction Framework for Multi-task Sequence Tagging | Code | 0 |
| BIRA: Improved Predictive Exchange Word Clustering | Code | 0 |
| Design Challenges and Misconceptions in Neural Sequence Labeling | Code | 0 |
| Chunking: Continual Learning is not just about Distribution Shift | Code | 0 |
| Def2Vec: Extensible Word Embeddings from Dictionary Definitions | Code | 0 |
| Discourse Sense Classification from Scratch using Focused RNNs | Code | 0 |

Benchmark Results

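| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ACE | Exact Span F1 | 97.3 | | Unverified |
| 2 | BERT-CRF (Replicated in AdaSeq) | Exact Span F1 | 97.18 | | Unverified |
| 3 | ELMo + MAT + Multi-Task | Exact Span F1 | 97.04 | | Unverified |
| 4 | CVT+Multi-Task+Large | Exact Span F1 | 96.98 | | Unverified |
| 5 | ELMo + Multi-Task | Exact Span F1 | 96.83 | | Unverified |
| 6 | Flair | Exact Span F1 | 96.72 | | Unverified |
| 7 | SeqVAT | Exact Span F1 | 95.45 | | Unverified |
| 8 | Adversarial Training | Exact Span F1 | 95.25 | | Unverified |
| 9 | BiLSTM-CRF | Exact Span F1 | 95.18 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ACE | F1 score | 97.3 | | Unverified |
| 2 | Flair embeddings | F1 score | 96.72 | | Unverified |
| 3 | JMT | F1 score | 95.77 | | Unverified |
| 4 | Low supervision | F1 score | 95.57 | | Unverified |
| 5 | IntNet + BiLSTM-CRF | F1 score | 95.29 | | Unverified |
| 6 | Suzuki and Isozaki | F1 score | 95.15 | | Unverified |
| 7 | NCRF++ | F1 score | 95.06 | | Unverified |
| 8 | BI-LSTM-CRF (Senna) (ours) | F1 score | 94.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ACE | F1 | 95 | | Unverified |
| 2 | Wang et al., 2020 | F1 | 94.4 | | Unverified |
| 3 | AIN | F1 | 94.04 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Wang et al., 2020 | F1 | 92 | | Unverified |
| 2 | AIN | F1 | 91.71 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Def2Vec | AUC | 93.07 | | Unverified |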
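
Exact Span F1, the metric in the first table, is conventionally computed over whole chunks: a predicted chunk counts as correct only if its type and both boundaries match a gold chunk exactly. The sketch below illustrates that computation under stated assumptions. The function name `exact_span_f1` and the `(sentence, type, start, end)` span tuple are hypothetical, and the toy spans are invented for illustration, not taken from any benchmark above.

```python
from typing import Set, Tuple

# (sentence index, chunk type, start, end exclusive); layout is an assumption
Span = Tuple[int, str, int, int]


def exact_span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Micro-averaged F1 where a chunk is correct only on an exact match."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: the predicted VP ends one token early, so it does not count.
gold = {(0, "NP", 0, 5), (0, "VP", 5, 7), (0, "NP", 7, 9)}
pred = {(0, "NP", 0, 5), (0, "VP", 5, 6), (0, "NP", 7, 9)}
score = exact_span_f1(gold, pred)
print(f"{100 * score:.2f}")
```

Two of the three predicted chunks match exactly, so precision and recall are both 2/3. Leaderboards such as the ones above usually report this value multiplied by 100.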