AAAI 2026

June 08, 2026

Singapore, Singapore

Self-hosting large language models (LLMs) is increasingly appealing for organizations seeking privacy, cost control, and customization. Yet deploying and maintaining in-house models poses challenges in GPU utilization, workload routing, and reliability. We introduce Pick and Spin, a practical framework that makes self-hosted LLM orchestration scalable and economical. Built on Kubernetes, it integrates a unified Helm-based deployment system, adaptive scale-to-zero automation, and a hybrid routing module that balances cost, latency, and accuracy using both keyword heuristics and a lightweight DistilBERT classifier. We evaluate four models, Llama 3 (90B), Gemma 3 (27B), Qwen 3 (235B), and DeepSeek R1 (685B), across eight public benchmark datasets with five inference strategies and two routing variants, encompassing 3,200 prompts and 160,000 inference runs. Pick and Spin achieves up to 10% higher accuracy, 30% lower latency, and 33% lower GPU cost per query compared with static deployments. These results show that intelligent orchestration and efficient scaling enable enterprise-grade LLM performance on self-hosted infrastructure, bringing high-capacity AI within practical and affordable reach.
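
The hybrid routing idea in the abstract (keyword heuristics first, a lightweight classifier as fallback, then a choice that trades accuracy against cost and latency) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from Pick and Spin: the names `ModelProfile`, `keyword_category`, and `route`, the keyword lists, the weights, and the per-model numbers are all hypothetical. The classifier is a plain callable standing in for a fine-tuned DistilBERT model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-model figures; not measurements from the paper."""
    name: str
    cost: float                 # relative GPU cost per query
    latency: float              # relative latency per query
    accuracy: Dict[str, float]  # expected accuracy per prompt category


# Hypothetical keyword lists for the heuristic stage.
KEYWORDS = {
    "code": ("python", "function", "compile", "bug"),
    "math": ("prove", "integral", "equation", "solve"),
}


def keyword_category(prompt: str) -> Optional[str]:
    """Return a category if any keyword matches, else None."""
    text = prompt.lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return None


def route(
    prompt: str,
    models: List[ModelProfile],
    classifier: Callable[[str], str],
    w_acc: float = 2.0,
    w_cost: float = 0.5,
    w_lat: float = 0.5,
) -> ModelProfile:
    """Pick the model with the best accuracy/cost/latency trade-off."""
    # Stage 1: cheap keyword heuristic. Stage 2: classifier fallback.
    category = keyword_category(prompt) or classifier(prompt)

    def score(model: ModelProfile) -> float:
        expected_acc = model.accuracy.get(category, 0.0)
        return w_acc * expected_acc - w_cost * model.cost - w_lat * model.latency

    return max(models, key=score)


# Two made-up deployments: a cheap small model and an expensive large one.
small = ModelProfile("small-model", cost=0.1, latency=0.1,
                     accuracy={"general": 0.7, "math": 0.4, "code": 0.6})
large = ModelProfile("large-model", cost=0.8, latency=0.7,
                     accuracy={"general": 0.8, "math": 0.9, "code": 0.85})


def fallback_classifier(prompt: str) -> str:
    """Stand-in for a learned classifier; always answers 'general'."""
    return "general"


choice = route("Solve the integral of x^2", [small, large], fallback_classifier)
print(choice.name)
```

With these made-up numbers, the math prompt is routed to the larger model because its accuracy advantage outweighs its cost and latency penalty, while a general-knowledge prompt would stay on the smaller one. Changing the weights shifts that balance.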
