Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.
Integrating knowledge from scientific literature is essential in biomedical research. However, the rapid growth of scientific literature makes staying up to date increasingly challenging. Retrieval-Augmented Generation (RAG) offers a promising framework, but its effectiveness in specialized biomedical domains remains unclear. In this work, we propose a two-stage retrieval pipeline for RAG, with a focus on Bordetella pertussis as a case study. Our method first applies hard filtering with synonym expansion to eliminate irrelevant passages, and then performs hybrid search, followed by reranking. We evaluate our approach using a dataset of 58 pertussis-related queries with automatic relevance judgments from multiple large language models (LLMs). Experimental results show that our pipeline improves MAP@10 by 13.4-20.4 points compared with existing methods and achieves the highest MRR@10. Furthermore, consistent improvements across different LLMs highlight the effectiveness of our approach.