Lecture image placeholder

Premium content

Access to this content requires a subscription. You must be a premium user to view this content.

Monthly subscription - $9.99Pay per view - $4.99Access through your institutionLogin with Underline account
Need help?
Contact us
Lecture placeholder background
VIDEO DOI: https://doi.org/10.48448/smfs-dc20

poster

ACL 2024

August 14, 2024

Bangkok, Thailand

Cache & Distil: Optimising API Calls to Large Language Models

keywords:

llm calls optimisation

knowledge distillation

active learning

Large-scale deployment of generative AI tools often depends on costly API calls to a Large Language Model (LLM) to fulfil user queries, a process that also exposes the request stream to external providers. To curtail the frequency of these calls, one can employ a local smaller language model -a student- which is continuously trained on the responses of the LLM. This student gradually gains proficiency in independently handling an increasing number of user requests, a process we term neural caching. The crucial element in neural caching is a policy that decides which requests should be processed by the student alone and which should be redirected to the LLM, subsequently aiding the student’s learning. In this study, we focus on classification tasks, and we consider a range of classic Active Learning-based selection criteria as the policy. Our experiments suggest that Margin Sampling and Query by Committee bring consistent benefits over other policies and baselines across tasks and budgets.

Downloads

SlidesTranscript English (automatic)

Next from ACL 2024

Dual Prompt Tuning based Contrastive Learning for Hierarchical Text Classification
poster

Dual Prompt Tuning based Contrastive Learning for Hierarchical Text Classification

ACL 2024

+4
Sishi Xiong and 6 other authors

14 August 2024

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Lectures
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2023 Underline - All rights reserved