The management and annotation of complex, multi-modal scientific data remain a major obstacle for AI-driven research because current solutions are poorly reusable and do not scale. We propose SciDataMAS, a novel LLM-powered multi-agent system (MAS) that automates scientific data management through a structured data lake with provenance-based organization and an adaptive metadata taxonomy. The system uses specialized workflows for automated dataset creation, data insertion, and data retrieval. Experiments demonstrate the system's proficiency: modern LLMs such as GPT-5 successfully generate rich metadata schemas and fill them with high accuracy. This work is a foundational step towards fully automated, reusable, and scalable scientific data organization, which may help the scientific community generate and accumulate well-annotated, AI-ready datasets.
