Get Quantitative and Comprehensive Insights into your Search Quality with Auto-Evaluations
It’s a warm July here at Objective HQ, and the team is shipping another feature out this morning, with Auto-Evaluations moving into Beta! Auto-Evaluations give you the ability to quickly gain confidence in the quality of your search results by measuring their relevance with Objective’s built-in intelligent search quality ratings. These evaluations also have powerful implications for Finetuning, serving as the foundation from which to train new Finetuned indexes. If you already have an account, get started measuring your Indexes today! If you don’t have an account yet, go get one! We’ll add you as fast as we can.
Understanding & Iterating on Relevance
Improvement starts with measurement. Before deploying any search to production, our customers need to ensure that the results meet their high standards to delight their users. However, knowing where to start can be challenging. Many resort to spot-checking and “looks good to me” evaluations. That's where we saw an opportunity to supercharge their workflows. Auto-Evaluation provides a scalable, quantitatively driven starting point. It combines the ability to automatically generate example search queries with the capability to pull the top search results for specified queries and provide relevance ratings in the form of “GREAT,” “OK,” or “BAD.” This combination offers a comprehensive way to evaluate the relevance of your Index and identify your strengths and weaknesses. Further, you can use those insights to teach a new index about the subtleties of your business with Finetuning.
Step 1: Establish your Queries of Importance
Jump into the new Quality tab of your Index to get started. You’ll see that Auto-Evaluation uses the Queries you have in your Query Store to automatically create grades for use in evaluation.
You can evaluate relevance with queries you’ve loaded into your account. They can take the form of the Top K most common queries, common queries from a particularly desirable user base, or any set of queries that are important to get right for the success of your business.
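If you are starting from raw search logs, picking the Top K queries can be sketched in a few lines of Python. This is only an illustration and not part of the Objective platform or its API: the `top_k_queries` helper, the normalization rule (trim and lowercase), and the sample log are all assumptions made for the example.

```python
from collections import Counter


def top_k_queries(query_log, k):
    """Return the k most frequent queries after light normalization.

    Hypothetical helper: queries are trimmed and lowercased so that
    "Red Shoes " and "red shoes" count as the same query.
    """
    normalized = (q.strip().lower() for q in query_log if q.strip())
    return [query for query, _ in Counter(normalized).most_common(k)]


# Made-up search log, for illustration only.
log = ["red shoes", "Red Shoes ", "blue jacket", "red shoes", "blue jacket", "hat"]
print(top_k_queries(log, 2))  # ['red shoes', 'blue jacket']
```

The resulting list is the kind of query set you would then load into your Query Store.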
Alternatively, you can use our Query Generation functionality to have Objective create sample search queries for you. Generating Queries takes a few minutes and references the Objects in your Object Store to simulate potential queries from your users.
Step 2: Start Evaluating your Index Quality
After establishing your desired query set in the Query Store, head back to your Index’s Quality tab and kick off the Auto-Evaluation process. It’ll generally take a few minutes, depending on the size of your Object Store and the number of queries you included in the Evaluation. Feel free to explore the rest of the platform as the evaluation runs in the background.
Step 3: Explore Strengths and Areas of Opportunity
Once the evaluation is complete, you'll see a snapshot view of how your index is performing across the defined queries in your Query Store. The platform simulates a ‘human-like’ assessment of the top 10 results for each query. These individual query evaluations are summarized at the top, showing the percentage of results graded as GREAT, OK, or BAD.
Dig into the quality of each result by hovering over any of the 10 results for each query to see object details and the platform’s explanation for that individual grade.
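To make the summary concrete, here is a minimal sketch of how per-result grades could roll up into the percentages shown at the top. It assumes the grades are available as a plain mapping from query to a list of "GREAT" / "OK" / "BAD" labels; that data shape, the `summarize_grades` name, and the sample grades are assumptions for illustration, not the platform's actual export format or implementation.

```python
from collections import Counter

GRADES = ("GREAT", "OK", "BAD")


def summarize_grades(evaluations):
    """Return the percentage of results per grade across all queries.

    evaluations: dict mapping a query string to the list of grades
    given to its top results (hypothetical data shape).
    """
    counts = Counter(g for grades in evaluations.values() for g in grades)
    total = sum(counts[g] for g in GRADES)
    if total == 0:
        return {g: 0.0 for g in GRADES}
    return {g: round(100 * counts[g] / total, 1) for g in GRADES}


# Made-up grades for two queries (4 results each, to keep the example short).
sample = {
    "red shoes": ["GREAT", "GREAT", "OK", "BAD"],
    "blue jacket": ["GREAT", "OK", "OK", "BAD"],
}
print(summarize_grades(sample))  # {'GREAT': 37.5, 'OK': 37.5, 'BAD': 25.0}
```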
Step 4: Refine a New Index with Auto-Evaluations & Compare
Auto-Evaluations help you assess the quality of your search results at scale, enabling frequent reviews and deep insights into areas where the search experience may need improvement. With automation, this process becomes quicker and more efficient. Once you identify opportunities for enhancement, you can use Finetuning to address them and automatically improve your search performance.
Creating a new Finetuned index with the queries in your Query Store will automatically trigger evaluations for both the new Finetuned index and your Base index for easy comparison. Simply click Compare to view side-by-side Evaluations. You can also compare any two indexes using the Analyze tool under the Relevance tab in the global navigation.
You can also dive into the details of a single query, comparing the search results between the Base and Finetuned index. Unlike the partial snapshot provided with Evaluations, this functionality will pull all results as they would appear in production.
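If you want to reason about a Base versus Finetuned comparison outside the UI, one simple approach is to look at how the share of GREAT results changes per query. The sketch below assumes you have grades for both indexes as query-to-grades mappings; the `compare_indexes` and `grade_share` helpers, the data shape, and the sample values are hypothetical and do not reflect how the Compare view is computed.

```python
def grade_share(grades, target="GREAT"):
    """Fraction of results with the target grade (0.0 for an empty list)."""
    return sum(1 for g in grades if g == target) / len(grades) if grades else 0.0


def compare_indexes(base, finetuned):
    """Per-query change in GREAT share, Finetuned minus Base.

    Only queries graded on both indexes are compared. The result is
    sorted so the largest regressions come first.
    """
    deltas = {
        query: grade_share(finetuned[query]) - grade_share(base[query])
        for query in base.keys() & finetuned.keys()
    }
    return dict(sorted(deltas.items(), key=lambda item: item[1]))


# Made-up grades for illustration only.
base = {
    "red shoes": ["GREAT", "OK", "BAD", "BAD"],
    "hat": ["GREAT", "GREAT", "OK", "OK"],
}
finetuned = {
    "red shoes": ["GREAT", "GREAT", "OK", "BAD"],
    "hat": ["GREAT", "OK", "OK", "OK"],
}
print(compare_indexes(base, finetuned))  # {'hat': -0.25, 'red shoes': 0.25}
```

Sorting regressions first is a design choice: queries that got worse after Finetuning are usually the ones worth inspecting in the single-query view.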
So what are you waiting for? Try running some evaluations of your own. We can’t wait to see what you find!