
Mehwish Fatima

Graduate student @ Heidelberg Institute for Theoretical Studies

simplification

summarization

cross-lingual science journalism


SHORT BIO

I am a Guest NLP Scientist at the Heidelberg Institute for Theoretical Studies (HITS), Germany, working under Prof. Dr. Michael Strube, and a Ph.D. student in Computational Linguistics at Universität Heidelberg. My work is part of an R&D industry project for SPEKTRUM der Wissenschaft. The working title of the project and thesis is "Single Document Cross-lingual Abstractive Summarization for Scientific Texts".

During my Ph.D. project, I have focused on:


1- Data Collection and Analysis

  • Data collection from online resources with Python libraries such as Wiki API, Beautiful Soup, Tika, NLTK and pandas.
  • Verification and analysis of the curated datasets using linguistic and statistical features, implemented in Python with spaCy, NLTK, pandas, Matplotlib and Seaborn.

2- Abstractive Summarization Model Development

  • Traditional recurrent neural networks - PyTorch-CUDA - server deployment
  • Vanilla transformer summarizer - PyTorch-CUDA - server deployment
  • Hugging Face library for pre-trained language models such as BERT, mBART, mT5, Pegasus, Longformer Encoder-Decoder, XLSum, BigBird, etc. - PyTorch-CUDA - server deployment
  • Simplification model based on reinforcement learning - PyTorch-CUDA, Apex
  • Multi-task learning model - PyTorch-CUDA-DeepSpeed - server deployment


3- Evaluation

  • Automatic evaluation using ROUGE, BERTScore and Flesch-Kincaid Reading Ease - Python - PyTorch
  • Statistical testing of automatic results with the Mann-Whitney U test - Python
  • Human judgments and their verification with Fleiss' kappa
  • In-depth linguistic analysis with Python
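
As a rough illustration of the agreement check in the evaluation list above, Fleiss' kappa can be computed from a table that counts, for each rated item, how many annotators chose each category. The sketch below is a minimal pure-Python version under stated assumptions: the function name `fleiss_kappa` and the toy ratings are hypothetical and are not taken from the project, and every item is assumed to be rated by the same number of annotators.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of rating counts (hypothetical helper).

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    """
    if not counts:
        raise ValueError("need at least one rated item")
    n_items = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    for row in counts:
        if len(row) != n_categories or sum(row) != n_raters:
            raise ValueError("every item needs the same categories and rater count")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    pairs = n_raters * (n_raters - 1)
    p_items = [(sum(c * c for c in row) - n_raters) / pairs for row in counts]
    p_observed = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    p_categories = [
        sum(row[j] for row in counts) / total for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in p_categories)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Toy data (made up for illustration): 4 summaries, 3 annotators,
    # 3 quality categories. Each row sums to the number of annotators.
    toy_counts = [
        [3, 0, 0],
        [0, 3, 0],
        [1, 2, 0],
        [0, 1, 2],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(toy_counts):.3f}")
```

A value of 1 indicates perfect agreement, and values near 0 indicate agreement no better than chance. In practice a library implementation would normally be preferred over a hand-written one.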

Tools and Technologies: Python, PyTorch, CUDA, DeepSpeed for model parallelization on GPU servers, WandB for computation analysis, Google Colab, AWS, TensorFlow
