Run View Test
User Documentation for Viewing Test Results
Overview
After you run a test for a copilot agent, the results screen shows detailed information about the test execution: the instructions, the interactions between the user agent and the copilot, and the overall outcome of the test. This documentation explains how to read and interpret those results.
Screen Elements
- Header Section:
  - Run Test Button:
    - Location: Top-right corner of the screen.
    - Function: Allows you to run the test again.
- Test Runs List:
  - Columns:
    - Ran At: The date and time when the test was run.
    - Duration: The total time taken to execute the test.
    - Iterations: The number of times the test instructions were executed.
    - Result: The overall outcome of the test (e.g., Success, Failure).
  - Function: Expandable items to view detailed messages exchanged during the test.
- Detailed Messages:
  - Sections: Expandable sections for each test run showing the interaction flow.
  - Elements:
    - Role: Indicates the entity involved in the interaction (e.g., instruction, userAgent, copilot, qaAnalysisCopilot).
    - Content: The actual message or response content for each interaction step.
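If you think of the results as data rather than as a screen, the elements above map onto a simple record structure. The sketch below is illustrative only: the class names `RunRecord` and `Message`, the field names, the types, and the sample values are assumptions chosen to mirror the labels on the screen, not a documented export format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

# Role names as they appear in the Detailed Messages section.
KNOWN_ROLES = ("instruction", "userAgent", "copilot", "qaAnalysisCopilot")


@dataclass
class Message:
    role: str     # one of KNOWN_ROLES
    content: str  # the message or response text for this step


@dataclass
class RunRecord:
    # Hypothetical field names mirroring the Test Runs List columns.
    ran_at: datetime     # "Ran At"
    duration: timedelta  # "Duration"
    iterations: int      # "Iterations"
    result: str          # "Result", e.g. "Success" or "Failure"
    messages: List[Message] = field(default_factory=list)

    def by_role(self, role: str) -> List[str]:
        """Return the content of every message with the given role, in order."""
        return [m.content for m in self.messages if m.role == role]


# Sample values below are made up for illustration.
run = RunRecord(
    ran_at=datetime(2024, 6, 26, 11, 7),
    duration=timedelta(seconds=42),
    iterations=1,
    result="Success",
    messages=[
        Message("userAgent", "What time of day is it in New York City?"),
        Message("copilot", "The current time in New York City is 11:07 AM."),
    ],
)
print(run.by_role("copilot"))
```

Filtering by role in this way corresponds to reading only the Copilot entries, or only the User Agent entries, in an expanded test run.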
How to Use
- Viewing Recent Test Runs:
  - In the "Test Runs List" section, find the recent test runs displayed with their respective details.
  - Click on the arrow next to a test run to expand and view the detailed messages.
- Understanding the Interaction Flow:
  - Instruction:
    - This section shows the test instructions provided to the copilot agent.
  - User Agent Messages:
    - Displays the queries or commands sent by the user agent during the test.
  - Copilot Responses:
    - Shows the responses from the copilot agent to the user agent's queries.
  - QA Analysis:
    - Provides an analysis of the copilot's performance based on the test instructions and responses.
- Interpreting Detailed Messages:
  - Review the interaction flow to understand how the copilot agent processed the instructions and responded to the user agent.
  - Each message is labeled with the role and content to help you track the conversation.
  - Check the responses against the instructions to ensure that the copilot agent followed the test scenarios correctly.
- Analyzing Test Results:
  - Look for any discrepancies or errors in the copilot's responses compared to the expected outcomes described in the instructions.
  - Use the QA Analysis section to get a summary of the copilot's performance, which can highlight areas for improvement.
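One kind of discrepancy that lends itself to a mechanical check is arithmetic inside a response, such as unit conversions. The sketch below is a minimal illustration of that idea. The helper names (`kelvin_to_celsius`, `celsius_to_fahrenheit`, `conversions_agree`) and the tolerance are assumptions made for this example; the input values are the temperatures quoted in the weather response in the Example Test Results section of this document.

```python
def kelvin_to_celsius(k: float) -> float:
    return k - 273.15


def celsius_to_fahrenheit(c: float) -> float:
    return c * 9 / 5 + 32


def conversions_agree(kelvin: float, celsius: float, fahrenheit: float,
                      tol: float = 0.05) -> bool:
    """Check that quoted Celsius and Fahrenheit values match the Kelvin value.

    The tolerance is an arbitrary choice that allows for two-decimal rounding.
    """
    c = kelvin_to_celsius(kelvin)
    f = celsius_to_fahrenheit(c)
    return abs(c - celsius) <= tol and abs(f - fahrenheit) <= tol


# Temperature: 302.68 K quoted as 29.53°C / 85.15°F
print(conversions_agree(302.68, 29.53, 85.15))  # True
# Feels like: 302.47 K quoted as 31.12°C / 88.02°F
print(conversions_agree(302.47, 31.12, 88.02))  # False
```

The second check fails because 302.47 K is about 29.32°C, whereas 31.12°C corresponds to 304.27 K. A mismatch like this is worth noting when you review a run, even when the overall result is Success.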
Example Test Results
- Instruction:
  Ask the copilot what time of day it is in New York City.
  If the response back is in the morning, ask for the weather in New York City.
  If the response back is in the afternoon, ask for the weather in Los Angeles.
  If the response back is at night, ask for the weather in Boston.
  If not raining, ask for the stock for Apple.
  If raining, ask for the stock of Nvidia.
- User Agent:
  - "What time of day is it in New York City?"
- Copilot:
  - "The current time in New York City is 11:07 AM on Wednesday, June 26, 2024."
- User Agent:
  - "What is the weather like in New York City?"
- Copilot:
  - "The current weather in New York City is clear with a temperature of 302.68 K (approximately 29.53°C or 85.15°F). It feels like 302.47 K (approximately 31.12°C or 88.02°F). The humidity is at 55%, and the wind is blowing from the west-southwest at 6.69 m/s (approximately 15 mph) with gusts up to 9.77 m/s (approximately 22 mph). The sky is clear with no clouds."
- User Agent:
  - "What is the stock price for Apple?"
- Copilot:
  - "The current stock price for Apple Inc. (AAPL) is $213.52 USD. Here are some additional details: Day High: $213.64, Day Low: $210.64, Open: $211.44, Previous Close: $209.07, Market Cap: $3.27 trillion, Fifty Day Average: $188.12, Two Hundred Day Average: $183.18, Year High: $220.20, Year Low: $164.08, Ten Day Average Volume: 118,103,520, Three Month Average Volume: 68,093,346. Here is the stock price plot for the past month:" (Graph displayed)
- QA Analysis:
  - The copilot was able to follow the instructions correctly. First, it provided the current time in New York City, which was 11:07 AM. Based on this, the next query was about the weather in New York City. The copilot responded with detailed weather information including clear skies. Since it was not raining, the final query was about the stock price for Apple, which the copilot provided accurately along with additional stock details.
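The branching in the example instruction can be written out as a small decision function, which makes it easier to see which path a given run should take. This is only a sketch: the function name `expected_queries`, the labels `morning`, `afternoon`, and `night`, and the idea of passing the observed conditions in as arguments are assumptions made for illustration. The instruction itself is written in natural language.

```python
from typing import List

# Which city's weather to ask about, keyed by the time of day in New York City.
WEATHER_CITY = {
    "morning": "New York City",
    "afternoon": "Los Angeles",
    "night": "Boston",
}


def expected_queries(time_of_day: str, raining: bool) -> List[str]:
    """List the topics the user agent should ask about, in order."""
    city = WEATHER_CITY[time_of_day]
    stock = "Nvidia" if raining else "Apple"
    return [
        "time of day in New York City",
        f"weather in {city}",
        f"stock for {stock}",
    ]


# The run shown above: 11:07 AM (morning) and clear skies (not raining).
print(expected_queries("morning", raining=False))
```

For the run shown above, this yields the New York City weather query followed by the Apple stock query, which matches the sequence of User Agent messages in the example.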
Notes
- Regularly review and analyze test results to ensure the copilot agent performs accurately and meets the expected standards.
- Use the insights gained from the test results to make necessary adjustments and improvements to the agent's configuration.
This documentation should help you effectively view and interpret the results of tests run on your copilot agents. If you have any further questions or need assistance, please refer to the help section or contact support.