A large-scale multi-hop question answering dataset over heterogeneous information in both structured tabular and unstructured textual forms.
Mechanical Turk;
Strict Quality Control.
13K Wikipedia Tables;
293K hyperlinked passages;
70K natural questions.
Semantic Understanding;
Symbolic Reasoning.
Reasoning over open-domain Wikitables
@article{chen2020hybridqa,
title={HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data},
author={Chen, Wenhu and Zha, Hanwen and Chen, Zhiyu and Xiong, Wenhan and Wang, Hong and Wang, William},
journal={Findings of EMNLP 2020},
year={2020}
}