Saied Alshahrani

corpora quality

wikipedia

wikipedia corpora

arabic

text classification

nlp

bert

chinese

gender bias

transparency

arabic language

arabic dialects

diachronic

profession words

natural language processing

Presentations: 7

Number of views: 2

SHORT BIO

Saied Alshahrani is a Computer Science PhD candidate at Clarkson University (CU), Potsdam, New York, USA. He belongs to several research labs and groups: he is a full-time member of Clarkson Accountability and Transparency (CAT) at CU, a part-time member of ASAS AI Lab in Saudi Arabia, and a contributing member of the Arabic Machine Learning community (ARBML). With his advisor, Prof. Jeanna Matthews, he studies the representativeness of Arabic NLP corpora and datasets. His work examines the implications of using unrepresentative (auto-generated or template-translated) corpora in Arabic NLP, introduces metrics and tools that promote transparency in these corpora, and proposes systematic approaches to detect them, so that NLP practitioners and researchers can make informed decisions about whether to use such corpora when training their NLP systems.

Presentations

[PUBLISHED] Leveraging Corpus Metadata to Detect Template-based Translation: An Exploratory Case Study of the Egyptian Arabic Wikipedia Edition

Saied Alshahrani and 1 other author

CIDAR: Culturally Relevant Instruction Dataset For Arabic

Zaid Alyafeai and 11 other authors

Arabic Synonym BERT-based Adversarial Examples for Text Classification

Norah Alshahrani and 3 other authors

Performance Implications of Using Unrepresentative Corpora in Arabic Natural Language Processing

Saied Alshahrani and 3 other authors

DEPTH+: An Enhanced Depth Metric for Wikipedia Corpora Quality

Saied Alshahrani and 2 other authors

Learning From Arabic Corpora But Not Always From Arabic Speakers: A Case Study of the Arabic Wikipedia Editions

Saied Alshahrani and 2 other authors

Roadblocks in Gender Bias Measurement for Diachronic Corpora

Saied Alshahrani and 4 other authors
