Overview
Metric Lab is a tool that allows you to create and manage custom metrics for your AI agents.
Advanced Features of Metric Lab
Metric Lab provides powerful tools for defining, testing, and optimizing custom metrics for your AI assistant. This guide walks you through refining metrics so that they accurately reflect real-world performance.
Defining Custom Metrics
Metric Lab allows you to define custom metrics tailored to your specific use cases. For example, you might have a metric like “Time Zone Confirmation” that checks if your AI assistant confirms the time zone before scheduling a call.
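To make this concrete, the core of such a metric is a plain-language definition the evaluator applies to each call. A hypothetical definition for this metric might read as follows (the wording and format below are illustrative, not Metric Lab's required syntax):

```python
# Hypothetical metric definition, for illustration only; Metric Lab's
# actual prompt format may differ.
TIME_ZONE_CONFIRMATION = (
    "Return true if the assistant explicitly confirms the caller's time zone "
    "before scheduling the call; otherwise return false."
)
```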
Identifying Metric Performance Issues
After deploying your metrics, you may notice discrepancies between expected and actual results:
- The AI doesn't confirm the time zone, yet the metric marks the call as successful
- The AI confirms the time zone, but the metric marks the call as unsuccessful
These inconsistencies indicate that your metric definition needs refinement.
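One way to see both failure modes at a glance is to compare the metric's labels against your own judgment for each call. The sketch below is illustrative; the field names and data shapes are assumptions, not Metric Lab's export format:

```python
# Illustrative comparison of metric labels vs. human judgment.
# Field names and data shapes are hypothetical.
calls = [
    {"id": "call-1", "metric_label": True,  "human_label": False},  # false positive
    {"id": "call-2", "metric_label": False, "human_label": True},   # false negative
    {"id": "call-3", "metric_label": True,  "human_label": True},   # agreement
]

false_positives = [c["id"] for c in calls if c["metric_label"] and not c["human_label"]]
false_negatives = [c["id"] for c in calls if not c["metric_label"] and c["human_label"]]

print("Marked successful, but the AI did not confirm:", false_positives)
print("Marked unsuccessful, but the AI did confirm:", false_negatives)
```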
Creating Test Sets for Optimization
When you identify calls where metrics aren’t performing as expected, you can add them to a test set directly from the interface.
Step 1: Add problematic calls to a test set
Step 2: Annotate the calls
After adding calls to your test set, annotate them with the correct outcomes:
- Review each call
- Determine what actually happened (e.g., “AI did not confirm the time zone”)
- Mark the correct outcome (e.g., set to “false” if the AI didn’t confirm)
- Save your annotation
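Conceptually, each annotation pairs a call with the outcome the metric should have produced. A hypothetical annotated entry might look like this (the field names are ours, for illustration):

```python
# Hypothetical annotated test-set entry; field names are illustrative.
annotation = {
    "call_id": "call-42",
    "metric": "Time Zone Confirmation",
    "observed": "AI did not confirm the time zone",
    "correct_outcome": False,  # the label the metric should produce for this call
}
```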
Optimizing Your Metrics
Once you’ve built a test set with correct annotations, you can refine your metric definition:
Step 1: Navigate to the optimization section
Go to the “Optimize” section and select the metric you want to improve.
Step 2: Review your current accuracy
The interface shows your current accuracy score (e.g., “6 out of 8”), indicating how many of your annotated test cases the current metric evaluates correctly.
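In other words, the score is simple agreement between the metric and your annotations:

```python
# Accuracy = calls the metric evaluates correctly / total annotated calls.
correct, total = 6, 8
accuracy = correct / total  # 0.75, displayed as "6 out of 8"
```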
Step 3: Refine the metric definition
Edit the metric prompt to be more precise. For example, you might specify that the user must mention a specific time zone for the metric to pass.
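As a sketch of what such a tightening might look like, compare a loose definition with a stricter one (both prompts are hypothetical wording, not taken from the product):

```python
# Hypothetical before/after metric prompts; the wording is illustrative.
before = "Did the assistant confirm the time zone before scheduling?"
after = (
    "Did the conversation name a specific time zone (e.g., 'Eastern Time', "
    "'PST') before the call was scheduled? Vague phrases like 'your local "
    "time' do not count as confirmation."
)
```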
Step 4: Test and iterate
After editing:
- Save your changes
- Run the updated metric against your test set (see the sketch below)
- Observe the improvement in accuracy (e.g., from 6/8 to 7/8 or 8/8)
- Continue refining until you reach your desired accuracy
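To illustrate the loop end to end, here is a self-contained toy version. The `run_metric` function below is a stand-in for the judge Metric Lab runs for you, not a real API, and the sample transcripts are invented:

```python
# Toy end-to-end sketch of the test-and-iterate loop. run_metric is a
# stand-in for Metric Lab's judge, not a real API; transcripts are invented.

def run_metric(prompt: str, transcript: str) -> bool:
    """Return the metric's verdict for one call under a given definition."""
    if "specific time zone" in prompt.lower():
        # Refined definition: a concrete time zone must be named.
        return any(tz in transcript for tz in ("Eastern", "Pacific", "EST", "PST"))
    # Loose original definition: any mention of "time" passes.
    return "time" in transcript.lower()

test_set = [
    {"transcript": "Let's do 3pm Eastern Time.", "correct_outcome": True},
    {"transcript": "Let's do 3pm your time.", "correct_outcome": False},
]

for prompt in (
    "Did the assistant confirm the time before scheduling?",
    "Did the conversation name a specific time zone before scheduling?",
):
    correct = sum(
        run_metric(prompt, c["transcript"]) == c["correct_outcome"] for c in test_set
    )
    print(f"{prompt!r}: {correct}/{len(test_set)} correct")
```

Running this prints 1/2 correct for the loose definition and 2/2 for the stricter one, mirroring the kind of improvement you should see in the interface as you refine your prompt.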
Benefits of Metric Optimization
This iterative optimization process allows you to:
- Improve metric accuracy from as low as 50% to 95% or higher
- Ensure the labels your metrics assign accurately reflect your AI assistant’s real performance
- Make data-driven decisions based on reliable metrics
By following this process, you can be confident that your metrics are providing accurate insights into your AI assistant’s performance.