
7 presentations
2 views
SHORT BIO
Saied Alshahrani is a Computer Science PhD candidate at Clarkson University (CU), Potsdam, New York, USA. He belongs to several research labs and groups: he is a full-time member of Clarkson Accountability and Transparency (CAT) at CU, a part-time member of ASAS AI Lab in Saudi Arabia, and a contributing member of the Arabic Machine Learning community (ARBML). He works with his advisor, Prof. Jeanna Matthews, on the representativeness of Arabic NLP corpora and datasets. His research studies the implications of using unrepresentative (auto-generated or template-translated) corpora in Arabic NLP, introduces metrics and tools to promote transparency in these corpora, and proposes systematic approaches to detect them, all to help NLP practitioners and researchers make informed decisions about whether to use such corpora when training their NLP tasks and systems.
Presentations

[PUBLISHED] Leveraging Corpus Metadata to Detect Template-based Translation: An Exploratory Case Study of the Egyptian Arabic Wikipedia Edition
Saied Alshahrani and 1 other author

CIDAR: Culturally Relevant Instruction Dataset For Arabic
Zaid Alyafeai and 11 other authors

Arabic Synonym BERT-based Adversarial Examples for Text Classification
Norah Alshahrani and 3 other authors

Performance Implications of Using Unrepresentative Corpora in Arabic Natural Language Processing
Saied Alshahrani and 3 other authors

DEPTH+: An Enhanced Depth Metric for Wikipedia Corpora Quality
Saied Alshahrani and 2 other authors

Learning From Arabic Corpora But Not Always From Arabic Speakers: A Case Study of the Arabic Wikipedia Editions
Saied Alshahrani and 2 other authors

Roadblocks in Gender Bias Measurement for Diachronic Corpora
Saied Alshahrani and 4 other authors