VIDEO DOI: https://doi.org/10.48448/7ytm-ch53

poster

ACL 2024

August 22, 2024

Bangkok, Thailand

Towards a new research agenda for multimodal enterprise document understanding: What are we missing?

keywords: visually rich document understanding, vrdu, document ai, calibration, visual question answering, vqa, grounding, information extraction

The field of multimodal document understanding has produced a suite of models that have achieved stellar performance across several tasks, even coming close to human performance on certain benchmarks. Nevertheless, the application of these models to real-world enterprise datasets remains constrained by a number of limitations. In this position paper, we discuss these limitations in the context of three key aspects of research: dataset curation, model development, and evaluation on downstream tasks. By analyzing 14 datasets and 7 SotA models, we identify major gaps in their utility in the context of a real-world scenario. We demonstrate how each limitation impedes the widespread use of SotA models in enterprise settings, and present a set of research challenges that are motivated by these limitations. Lastly, we propose a research agenda that is aimed at driving the field towards higher impact in enterprise applications.
