While large language models (LLMs) have demonstrated impressive creative capabilities, research has predominantly focused on English texts, often overlooking non-English literary traditions and lacking standardized methods for assessing creativity. In this paper, we investigate the ability of various LLMs to comprehend and generate Persian literary text with culturally relevant expressions. To this end, we construct a dataset of user-generated Persian literary content spanning 20 diverse topics. Adapting the Torrance Tests of Creative Thinking to the Persian cultural context, we evaluate model outputs along four key dimensions: originality, fluency, flexibility, and elaboration. Furthermore, we assess the models' understanding of four fundamental literary devices: simile, metaphor, hyperbole, and antithesis. Combining human judgment with an LLM-as-judge approach, we evaluate the creativity of the generated texts. Our analysis reveals strong agreement between human and model assessments, highlighting the potential of LLMs to meaningfully engage with Persian literary culture. However, these models still require further refinement to fully grasp and interpret literary devices.