Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.
Large Language Models (LLMs) often don't perform as expected under Domain Shift or after Instruct-tuning. A reliable indicator of LLM performance in these settings could assist in decision-making. We present a method that uses the known performance in high-resource domains and fine-tuning settings to predict performance in low-resource domains or base models respectively. In our paper, we formulate the task of performance prediction, construct a dataset for and train regression models to predict the said change in performance. Our proposed methodology is lightweight and, in practice, can help researchers & practitioners decide if resources should be allocated for data labeling and LLM Instruct-tuning.