Question-Answering (QA) systems are vital for rapidly accessing and comprehending information in academic literature. However, some complex academic questions require information from multiple documents for resolution, and existing datasets are insufficient to support complex reasoning in such multi-document scenarios. To address this, we introduce a pipeline methodology for constructing a Multi-Document Academic QA (MDA-QA) dataset. By detecting communities in citation networks and leveraging Large Language Models (LLMs), we automatically generate QA pairs grounded in multi-document content. We further develop an automated filtering mechanism to ensure that the QA pairs reliably depend on multiple documents. Our resulting dataset consists of 6,804 QA pairs and serves as a benchmark for evaluating multi-document retrieval and QA systems. Our experimental results highlight that standard lexical and embedding-based retrieval methods struggle to locate all relevant documents, indicating a persistent gap in multi-document reasoning. We will release our dataset and source code for the community.
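
The abstract does not specify the pipeline's implementation details. As a rough illustration only, the sketch below assumes networkx for the citation graph, its built-in Louvain method as one possible community-detection choice, and a hypothetical generate_qa() placeholder standing in for the LLM-based QA generation step; none of these specifics are confirmed by the paper.

```python
# Illustrative sketch of the multi-document QA construction idea:
# papers linked by citations are grouped into communities, and each
# multi-paper community seeds one cross-document QA pair.
import networkx as nx


def build_citation_graph(papers):
    """papers: iterable of (paper_id, [cited_paper_ids]) pairs."""
    g = nx.Graph()
    for paper_id, citations in papers:
        g.add_node(paper_id)
        for cited in citations:
            g.add_edge(paper_id, cited)  # undirected citation link
    return g


def detect_communities(g, min_size=2):
    """Partition the graph; keep only multi-document communities."""
    communities = nx.community.louvain_communities(g, seed=0)
    return [c for c in communities if len(c) >= min_size]


def generate_qa(doc_ids):
    """Hypothetical stand-in for prompting an LLM with the documents
    in one community to produce a QA pair spanning all of them."""
    question = f"What connects documents {sorted(doc_ids)}?"
    answer = "..."  # would come from the LLM in the real pipeline
    return question, answer


if __name__ == "__main__":
    papers = [("A", ["B", "C"]), ("B", ["C"]), ("D", ["E"]), ("E", [])]
    graph = build_citation_graph(papers)
    for community in detect_communities(graph):
        q, a = generate_qa(community)
        print(q)
```

In this framing, the automated filtering step described in the abstract would sit between community detection and dataset release, discarding generated pairs that can be answered from a single document.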