keywords:
datasets for low-resource languages
large language models
benchmarking
We present JPharmatron, a Japanese domain-specific large language model (LLM) for the pharmaceutical field, developed through continual pre-training on two billion Japanese pharmaceutical tokens and eight billion English biomedical tokens. For rigorous evaluation, we introduce JPharmaBench, a benchmark suite consisting of three new benchmarks: YakugakuQA, based on national pharmacist licensing exams; NayoseQA, which tests cross-lingual synonym and terminology normalization; and SogoCheck, a novel task involving cross-document consistency checking. We evaluate our model against open-source medical LLMs and commercial models, including GPT-4o. Experimental results show that JPharmatron outperforms existing open models and achieves performance competitive with commercial ones. Interestingly, even GPT-4o performs poorly on SogoCheck, suggesting that cross-sentence consistency reasoning remains an open challenge. JPharmatron enables secure and local model deployment for pharmaceutical tasks, where privacy and legal constraints limit the use of closed models. In addition, JPharmaBench offers a reproducible framework for evaluating Japanese pharmaceutical natural language processing. Together, they demonstrate the feasibility of practical and cost-efficient language models for the Japanese healthcare and pharmaceutical sectors. Our model, code, and datasets are available on HuggingFace: https://huggingface.co/collections/EQUES/jpharmatron and https://huggingface.co/collections/EQUES/jpharmabench.
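For readers who want to try the released model locally, the minimal sketch below shows one way to load it with the Hugging Face transformers library and generate a completion. The specific repository id (EQUES/JPharmatron-7B) and the example prompt are assumptions, not taken from the abstract; consult the collection pages above for the exact model cards.

    # Minimal sketch: load the model via the standard transformers API.
    # The repo id below is an assumption; verify it on the EQUES collection page.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "EQUES/JPharmatron-7B"  # hypothetical repo id within the collection

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Example Japanese pharmaceutical prompt ("What are the main side effects of aspirin?")
    prompt = "アスピリンの主な副作用は何ですか？"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because the model weights are openly available, this kind of local inference is what enables the secure, on-premises deployment the abstract describes for privacy- and regulation-sensitive pharmaceutical settings.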