WTR: A Test Collection for Web Table Retrieval
Zhiyu Chen, Shuo Zhang, Brian D. Davison
Code: github.com/Zhiyu-Chen/Web-Table-Retrieval-Benchmark (official)
Abstract
We describe the development, characteristics, and availability of a test collection for the task of Web table retrieval, built on a large-scale corpus of Web tables extracted from the Common Crawl. Since a Web table usually has rich context information, such as the page title and surrounding paragraphs, we provide relevance judgments not only for query-table pairs but also for pairs of a query and a table's context, which previous test collections ignore. To facilitate future research with this benchmark, we provide details about how the dataset is pre-processed, as well as baseline results from both traditional and recently proposed table retrieval methods. Our experimental results show that proper use of context labels can benefit previous table retrieval methods.