
Premium content
Access to this content requires a subscription. You must be a premium user to view this content.

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.
Large Language Models (LLMs) demonstrate impressive capabilities but exhibit inconsistent performance across diverse domains. We propose DFPE (Diverse Fingerprint Ensemble), a novel training-free method that systematically constructs subject-adaptive ensembles by balancing model diversity and competence. DFPE introduces three key innovations: (1) semantic fingerprinting using averaged response embeddings to capture distinct problem-solving patterns, (2) DBSCAN-based clustering with quantile-based competence filtering to ensure diverse yet capable model selection, and (3) exponentially-weighted aggregation adapted to subject-specific performance. Our method's effectiveness is highlighted on the challenging MMLU-pro benchmark, where DFPE achieves a striking 17.1 percentage point gain over the best single model, reaching 71.4\% accuracy. This strong performance is consistent across other standard benchmarks, with significant accuracy improvements of 4.4 points on AGIEval and 2.7 points on MMLU. Our results underscore that a systematic approach to ensemble construction - one that balances diversity, subject-specific competence, and adaptive weighting, can substantially enhance the generalization and robustness of LLMs on multifaceted language understanding tasks.
