
Quantifying image complexity at the entity level is straightforward, but the assessment of semantic complexity has been largely overlooked. Images do differ in semantic complexity. For example, the "Cookie Theft" picture, which is widely used to assess human language and cognitive abilities, contains richer semantics than most images, allowing it to tell a vivid and engaging story. More images like "Cookie Theft" are needed to cater to people from different cultural backgrounds and eras. Semantically rich images can also benefit the development of vision models, as images with limited semantics are becoming less challenging for these models. At present, assessing semantic complexity requires human experts and empirical evidence. Automatic evaluation of how semantically rich an image is would therefore benefit both research on human cognition and the development of AI models. To address this need, we propose the Image Semantic Assessment (ISA) task. We introduce the first ISA dataset and a novel method that leverages language to solve this vision problem. Experiments on our dataset demonstrate the effectiveness of our approach.