SOTAVerified

Chunking

Chunking, also known as shallow parsing, identifies contiguous spans of tokens that form syntactic units such as noun phrases or verb phrases.

Example:

| Vinken | , | 61 | years | old |
| --- | --- | --- | --- | --- |
| B-NP | I-NP | I-NP | I-NP | I-NP |
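
The tags in the example mark chunk boundaries per token: a `B-` prefix begins a chunk of the given type and an `I-` prefix continues it. As a minimal sketch of how such a tag sequence can be turned back into spans, the hypothetical helper `bio_to_spans` below (not part of any library) assumes an `O` tag for tokens outside any chunk and treats a stray `I-` tag as the start of a new chunk:

```python
from typing import List, Tuple

# (chunk type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a BIO tag sequence into a list of labeled spans."""
    spans: List[Span] = []
    start = None
    label = None
    for i, tag in enumerate(tags):
        # The open chunk ends at "O", at any "B-" tag, or at an "I-" tag
        # whose type differs from the chunk currently being built.
        closes = (
            tag == "O"
            or tag.startswith("B-")
            or (tag.startswith("I-") and tag[2:] != label)
        )
        if closes and label is not None:
            spans.append((label, start, i))
            start, label = None, None
        # A new chunk opens at "B-", or at an "I-" tag with no open chunk
        # (assumption: tolerate malformed input rather than raise).
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


tokens = ["Vinken", ",", "61", "years", "old"]
tags = ["B-NP", "I-NP", "I-NP", "I-NP", "I-NP"]
for chunk_type, begin, end in bio_to_spans(tags):
    print(chunk_type, " ".join(tokens[begin:end]))
```

For the example sentence this yields a single noun-phrase span covering all five tokens.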

Papers

Showing 51-100 of 447 papers

| Title | Status | Hype |
| --- | --- | --- |
| Robust Multilingual Part-of-Speech Tagging via Adversarial Training | Code | 0 |
| Punctuation Restoration Improves Structure Understanding Without Supervision | Code | 0 |
| ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs | Code | 0 |
| Query-Based Keyphrase Extraction from Long Documents | Code | 0 |
| Open Information Extraction via Chunks | Code | 0 |
| NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit | Code | 0 |
| NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering | Code | 0 |
| Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning | Code | 0 |
| Opening the black box of language acquisition | Code | 0 |
| Augmenting Neural Networks with First-order Logic | Code | 0 |
| Neural Models for Sequence Chunking | Code | 0 |
| Neural Sequence Segmentation as Determining the Leftmost Segments | Code | 0 |
| Chunking: Continual Learning is not just about Distribution Shift | Code | 0 |
| NCRF++: An Open-source Neural Sequence Labeling Toolkit | Code | 0 |
| A Tree Search Algorithm for Sequence Labeling | Code | 0 |
| Natural Language Processing (almost) from Scratch | Code | 0 |
| SEER: Self-Aligned Evidence Extraction for Retrieval-Augmented Generation | Code | 0 |
| ChuLo: Chunk-Level Key Information Representation for Long Document Processing | Code | 0 |
| A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks | Code | 0 |
| CUSIDE-array: A Streaming Multi-Channel End-to-End Speech Recognition System with Realistic Evaluations | Code | 0 |
| AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation | Code | 0 |
| LLM-TA: An LLM-Enhanced Thematic Analysis Pipeline for Transcripts from Parents of Children with Congenital Heart Disease | Code | 0 |
| Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation | Code | 0 |
| Large scale visual place recognition with sub-linear storage growth | Code | 0 |
| Keystroke dynamics as signal for shallow syntactic parsing | Code | 0 |
| KidneyTalk-open: No-code Deployment of a Private Large Language Model with Medical Documentation-Enhanced Knowledge Database for Kidney Disease | Code | 0 |
| Integrating Supertag Features into Neural Discontinuous Constituent Parsing | Code | 0 |
| Building Odia Shallow Parser | Code | 0 |
| GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling | Code | 0 |
| Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking | Code | 0 |
| J2N -- Nominal Adjective Identification and its Application | Code | 0 |
| CAG: Chunked Augmented Generation for Google Chrome's Built-in Gemini Nano | Code | 0 |
| Fine-Grained Error Analysis and Fair Evaluation of Labeled Spans | Code | 0 |
| Boundary-based MWE segmentation with text partitioning | Code | 0 |
| Language-Agnostic Syllabification with Neural Sequence Labeling | Code | 0 |
| Large-scale image segmentation based on distributed clustering algorithms | Code | 0 |
| A Feature-Rich Vietnamese Named-Entity Recognition Model | Code | 0 |
| Financial Report Chunking for Effective Retrieval Augmented Generation | Code | 0 |
| FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP | Code | 0 |
| Evaluation of Word Vector Representations by Subspace Alignment | Code | 0 |
| Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study | Code | 0 |
| Experiential Explanations for Reinforcement Learning | Code | 0 |
| Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data? | Code | 0 |
| Does Chinese BERT Encode Word Structure? | Code | 0 |
| Dynamic Chunking and Selection for Reading Comprehension of Ultra-Long Context in Large Language Models | Code | 0 |
| FlexChunk: Enabling 100M×100M Out-of-Core SpMV (~1.8 min, ~1.7 GB RAM) with Near-Linear Scaling | Code | 0 |
| BIRA: Improved Predictive Exchange Word Clustering | Code | 0 |
| Def2Vec: Extensible Word Embeddings from Dictionary Definitions | Code | 0 |
| Chunking Historical German | Code | 0 |
| CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR | Code | 0 |

Benchmark Results

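| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ACE | Exact Span F1 | 97.3 | | Unverified |
| 2 | BERT-CRF (Replicated in AdaSeq) | Exact Span F1 | 97.18 | | Unverified |
| 3 | ELMo + MAT + Multi-Task | Exact Span F1 | 97.04 | | Unverified |
| 4 | CVT+Multi-Task+Large | Exact Span F1 | 96.98 | | Unverified |
| 5 | ELMo + Multi-Task | Exact Span F1 | 96.83 | | Unverified |
| 6 | Flair | Exact Span F1 | 96.72 | | Unverified |
| 7 | SeqVAT | Exact Span F1 | 95.45 | | Unverified |
| 8 | Adversarial Training | Exact Span F1 | 95.25 | | Unverified |
| 9 | BiLSTM-CRF | Exact Span F1 | 95.18 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ACE | F1 score | 97.3 | | Unverified |
| 2 | Flair embeddings | F1 score | 96.72 | | Unverified |
| 3 | JMT | F1 score | 95.77 | | Unverified |
| 4 | Low supervision | F1 score | 95.57 | | Unverified |
| 5 | IntNet + BiLSTM-CRF | F1 score | 95.29 | | Unverified |
| 6 | Suzuki and Isozaki | F1 score | 95.15 | | Unverified |
| 7 | NCRF++ | F1 score | 95.06 | | Unverified |
| 8 | BI-LSTM-CRF (Senna) (ours) | F1 score | 94.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ACE | F1 | 95 | | Unverified |
| 2 | Wang et al., 2020 | F1 | 94.4 | | Unverified |
| 3 | AIN | F1 | 94.04 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Wang et al., 2020 | F1 | 92 | | Unverified |
| 2 | AIN | F1 | 91.71 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Def2Vec | AUC | 93.07 | | Unverified |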
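
Several of the benchmarks above report Exact Span F1. This page does not spell out how the score is computed, so the following is only a sketch under the conventional definition: a predicted chunk counts as correct when its type and both boundaries match a gold chunk exactly, and F1 is the harmonic mean of span-level precision and recall. The function name `exact_span_f1` and the span representation are illustrative assumptions, not taken from any of the listed systems.

```python
from typing import Set, Tuple

# (chunk type, start index, end index exclusive)
Span = Tuple[str, int, int]


def exact_span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Span-level F1 in [0, 1]; a span is correct only on an exact match."""
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        # Assumption: report 0.0 when nothing matches (including empty input).
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = {("NP", 0, 2), ("VP", 2, 3), ("NP", 3, 5)}
# The last predicted NP ends one token early, so it does not count.
pred = {("NP", 0, 2), ("VP", 2, 3), ("NP", 3, 4)}
score = exact_span_f1(gold, pred)
print(f"{100 * score:.2f}")
```

Leaderboard figures such as those in the tables are usually this value multiplied by 100 and aggregated over all chunks in the test set rather than averaged per sentence.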