Answer AI
Answer AI Overview
Answer AI is a Retrieval-Augmented Generation (RAG) integration within the ai12z platform, designed to deliver accurate, context-based answers to user queries. It works by combining advanced reasoning (ReAct), a vector database for deep semantic search, and robust guardrails to ensure answers stay grounded in verified content.
How It Works
- User Query: The user submits a question via the AI search or chatbot interface.
- ReAct Engine: The ReAct (Reasoning & Action with Context, History, and Vision) module acts as the intelligent brain, orchestrating the retrieval and reasoning steps and creating the vector query.
- Vector DB Search: The system performs a semantic search against the vector database, which stores content and metadata (including page text, tags, images, and source URLs) ingested from multiple sources—CMS, web scrapers, document repositories, knowledge bases, and cloud platforms.
- Context Filtering: Only the most relevant documents (typically the top 20) are selected and supplied as context to the LLM.
- System Prompt: The system prompt defines strict behavior for the LLM, ensuring that answers are generated strictly based on the provided context and with all required safety and compliance guardrails in place.
- Image AI Match: If relevant, Answer AI can also match images from the source content to further enrich answers.
- Streaming Results: The final answer—potentially including relevant images—is streamed back to the user in real time.
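The steps above can be sketched as a single retrieval-and-answer function. This is an illustrative outline only, not the actual ai12z implementation: `vector_db.search`, `llm.stream`, and the prompt wording are hypothetical stand-ins.

```python
def answer_query(query, vector_db, llm, top_k=20):
    """Hypothetical sketch of the Answer AI RAG flow."""
    # 1. ReAct creates the vector query (passed through unchanged here).
    vector_query = query
    # 2. Semantic search against the vector database.
    hits = vector_db.search(vector_query)
    # 3. Context filtering: keep only the most relevant documents.
    context = hits[:top_k]
    # 4. The system prompt grounds the LLM strictly in the retrieved context.
    system_prompt = (
        "Answer strictly from the provided context. "
        "If the answer is not in the context, say you don't know.\n\n"
        + "\n".join(doc["text"] for doc in context)
    )
    # 5. Stream the answer back to the user chunk by chunk.
    for chunk in llm.stream(system_prompt, query):
        yield chunk
```

In practice the platform handles each of these stages internally; the sketch only shows how the pieces relate.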
Key Features
- Contextual RAG: Answers are always grounded in the retrieved content—no hallucinations.
- Guardrails: Strict controls are enforced through the system prompt to ensure safety, compliance, and factual accuracy.
- Integrated with ReAct: Leverages contextual history, user actions, and multimodal content.
- Default Integration: Answer AI is the default integration. ReAct evaluates which integration to use; if the answer cannot come from another integration, it falls back to Answer AI.
Comparison Feature in Answer AI
When comparing multiple products, ReAct enables a parameter called requiresReasoning. In this mode, ReAct sends separate vector queries for each product, calling Answer AI N times in parallel. The results are then aggregated by ReAct to generate a comprehensive comparison, delivering fast and accurate responses.
If requiresReasoning is false, Answer AI streams the response directly to the user without waiting for additional reasoning steps, resulting in faster output for standard questions.
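The fan-out behavior when requiresReasoning is true could be sketched with a thread pool, one Answer AI call per product. The `answer_ai` callable and the aggregation step below are illustrative assumptions, not the actual ai12z implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def compare_products(products, answer_ai):
    """Sketch: call Answer AI once per product, in parallel, then aggregate."""
    # requiresReasoning=True: one vector query per product, N calls in parallel.
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(answer_ai, products))
    # ReAct then aggregates the N answers into a single comparison.
    return dict(zip(products, answers))
```

The key point is that the N retrievals are independent, so running them concurrently keeps the comparison fast even as the number of products grows.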
Use Cases
- Knowledge base Q&A for websites and internal portals
- Document retrieval and compliance
- Product or support search, with live content grounding
- Any scenario where you need trustworthy, explainable answers based on your own data
Dynamic Tokens
Dynamic tokens are placeholders within a prompt that are replaced with dynamic content when the prompt is processed. Here is a list of dynamic tokens you can use:
- {query}: Inserts the user's question.
- {vector-query}: Inserts the vector query derived from the user's question.
- {history}: Inserts the conversation history.
- {title}: Inserts the title of the page where the search is conducted.
- {origin}: Inserts the URL of the page.
- {language}: Inserts the language of the page.
- {referrer}: Inserts the referring URL that directed the user to the current page.
- {attributes}: Inserts additional attributes that JavaScript adds to the search control.
- {org_name}: Inserts the name of the organization for which the Agent is configured.
- {purpose}: Inserts the stated purpose of the AI bot.
- {org_url}: Inserts the domain URL of the organization the bot is configured for.
- {context_data}: Inserts the content retrieved from the vector database, used by Answer AI as context.
- {tz=America/New_York}: Inserts the current date and time in the specified timezone for the LLM to use.
- {image_upload_description}: Inserts descriptions, generated by vision AI, of images uploaded by the site visitor into the search box.
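Conceptually, dynamic token substitution works like simple template replacement. The platform resolves tokens internally; the sketch below only illustrates the idea, leaving unknown tokens untouched:

```python
import re

def fill_tokens(prompt, values):
    """Replace each {token} with its value; unknown tokens are left as-is."""
    def repl(match):
        token = match.group(1)
        return str(values.get(token, match.group(0)))
    return re.sub(r"\{([^{}]+)\}", repl, prompt)
```

For example, `fill_tokens("Q: {query} on {origin}", {"query": "hours?", "origin": "https://example.com"})` fills both tokens, while a token with no value stays literal.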
Models that ai12z supports for RAG
- GPT-4o: A next-generation GPT variant focused on robust reasoning and versatile language understanding.
- GPT-4o-mini (default): A scaled-down version of GPT-4o providing faster responses and reduced resource usage, while still maintaining good quality.
- Llama 3.2 Instruct (1B): A small, instruction-tuned Llama model optimized for task-following on a modest scale.
- Llama 3.2 Instruct (3B): A mid-sized instruct model that balances efficiency with improved fluency and accuracy.
- Llama 3.2 Instruct (11B): A larger Llama instruct model providing more coherent, context-aware responses than smaller counterparts.
- Llama 3.2 Instruct (90B): A highly capable, large-scale Llama instruct model designed for complex tasks and in-depth reasoning.
- Claude 3.5 Sonnet: A creative variant of Claude 3.5 tuned for expressive, structured writing like poetry or stylized prose.
- Claude 3 Opus: A versatile Claude 3 model offering enhanced context management and refined long-form content generation.
- Claude 3 Haiku: A succinct Claude 3 variant focusing on generating short, elegant, and poetic responses.
- Gemini-1.5-Flash: A fast, nimble generation model that emphasizes quick, responsive text outputs at version 1.5.
- Gemini-1.5-Pro: An upgraded Gemini model with professional-level language capabilities for more demanding tasks.
- Gemini-1.5-Flash-8B: An 8-billion-parameter Flash variant of Gemini 1.5, offering rapid output at lower cost.
Defining System and User Prompts
You can re-edit the Agent information and check the "Recreate Prompts" checkbox; the AI will then use the information you entered when creating the Agent to recreate the system prompt. You can always go back and re-edit those properties. Your existing system prompt is stored in History, so you can go back and compare versions.
Editing System Prompts
When editing the System prompt, consider the following guidelines:
- Keep the instructions clear and concise so the AI can understand them.
Best Practices
- Test the System prompts to see how the AI interprets them with different types of user queries.
- Update prompts regularly to align with changes in Agent goals or organization strategies.
- Monitor the AI's responses to ensure they meet the expected quality and adjust the prompts accordingly.
Saving Changes
After making your changes:
- Review the prompt to ensure accuracy and completeness.
- Click the "Save" button to apply the changes.
- Test the updated prompt with a few queries to ensure it functions as expected.
History
Every time you update the Answer AI System Prompt, a version of it is automatically saved to History so you can track changes over time or revert if needed.
In addition, a new version is saved to History whenever you edit the Agent and select the “Recreate Prompts” option. This ensures that any regenerated prompt versions tied to agent settings are also preserved.
Use this history to maintain version control and transparency over how your AI assistant’s behavior and tone evolve.
Conclusion
Managing prompts effectively is crucial for the optimal performance of your AI interface. By following the guidelines outlined in this document, you can ensure that your AI provides relevant, accurate responses in line with your organization's objectives.