Semantic Overlap Summarization (SOS) is a constrained multi-document summarization task in which the constraint is to capture the common, overlapping information between two alternative narratives. In this work, we evaluate Large Language Models (LLMs) on the SOS task and introduce the PrivacyPolicyPairs (3P) dataset, with the intention of expanding the space of SOS data in both quantity and variety. The dataset provides 135 high-quality SOS data samples sourced from privacy policy documents, a different text domain from that of the original SOS dataset. We then use the TELeR taxonomy to create and evaluate 905,216 LLM-generated summaries over SOS datasets from different domains, and we further conduct human evaluation on a subset of 540 samples. We conclude the paper by analyzing model performance and the reliability of automatic evaluation. The code and datasets used to conduct this study are available at https://anonymous.4open.science/r/llm_eval-E16D