Once language models (LMs) are deployed, they interact with users over the long term, ideally evolving continuously based on user feedback. Asking users for direct feedback can be costly and disruptive, motivating our work on harvesting implicit feedback from user interaction logs. In this work, we study implicit user feedback in two user-LM interaction datasets (WildChat and LMSYS). First, we analyze user feedback in human-LLM conversation trajectories, providing insights into the patterns of implicit user feedback. Second, we study harvesting learning signals from such implicit feedback. We find that the content of user feedback (e.g., the user wanted clarification), not just its polarity (e.g., the user was unhappy with the previous model response), can provide helpful signals for improving model performance in some settings, but not universally. We also find that the usefulness of user feedback is largely tied to the quality of the user's initial prompt. Together, we provide an in-depth study of implicit user feedback, showing both its potential and its limitations.