Presentation
CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks
Description: Large Language Models (LLMs) are transforming multiple fields, yet verifying their answers remains challenging, especially for complex tasks like summarization, consolidation, and knowledge extraction. We introduce CheckEmbed, a scalable and straightforward approach to LLM verification. CheckEmbed leverages a simple idea: to compare LLM-generated answers with each other or a ground truth, use their answer-level embeddings obtained from models like GPT Text Embedding Large. This method reduces complex textual answers to single embeddings, enabling fast and meaningful verification. Our comprehensive pipeline implements the CheckEmbed methodology, including metrics like embedding heatmaps to assess answer truthfulness. We demonstrate how these metrics can be used to determine whether an LLM answer is satisfactory. Applied to real-world tasks, such as term extraction and document summarization, the CheckEmbed pipeline shows notable improvements in accuracy, cost-effectiveness, and runtime compared to existing methods like BERTScore or SelfCheckGPT.
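The core idea described above, comparing answer-level embeddings of repeated LLM samples via a pairwise similarity heatmap, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the embedding vectors are toy stand-ins (in practice they would come from an embedding model such as GPT Text Embedding Large), and cosine similarity is assumed as the comparison metric.

```python
import numpy as np

# Hypothetical answer-level embeddings for three sampled answers to the
# same prompt. In a real setup, each vector would be produced by embedding
# the full LLM answer with a model like GPT Text Embedding Large.
answers = {
    "sample_1": np.array([0.9, 0.1, 0.2]),
    "sample_2": np.array([0.88, 0.12, 0.18]),  # close to sample_1
    "sample_3": np.array([0.1, 0.9, 0.3]),     # an outlier answer
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pairwise similarity matrix ("heatmap"): high mutual similarity across
# repeated samples suggests a stable, likely satisfactory answer; low
# similarity flags inconsistency and possible hallucination.
names = list(answers)
heatmap = np.array(
    [[cosine_similarity(answers[i], answers[j]) for j in names] for i in names]
)
print(np.round(heatmap, 2))
```

Because each answer is collapsed to a single vector, the comparison is one dot product per answer pair, which is what makes this cheaper than token-level schemes like BERTScore.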
Event Type: Workshop
Time: Friday, 22 November 2024, 9am - 9:15am EST
Location: B206
Artificial Intelligence/Machine Learning

