Answer AI
Answer AI Overview
Answer AI is a Retrieval-Augmented Generation (RAG) integration within the ai12z platform, designed to deliver accurate, context-based answers to user queries. It works by combining advanced reasoning (ReAct), a vector database for deep semantic search, and robust guardrails to ensure answers stay grounded in verified content.
How It Works
- User Query: The user submits a question via the AI search or chatbot interface.
- ReAct Engine: The ReAct (Reasoning & Action with Context, History, and Vision) module acts as the intelligent brain, orchestrating the retrieval and reasoning steps and creating the vector query.
- Vector DB Search: The system performs a semantic search against the vector database, which stores content and metadata (including page text, tags, images, and source URLs) ingested from multiple sources—CMS, web scrapers, document repositories, knowledge bases, and cloud platforms.
- Context Filtering: Only the most relevant documents (typically the top 20) are selected and supplied as context to the LLM.
- System Prompt: The system prompt defines strict behavior for the LLM, ensuring that answers are generated strictly based on the provided context and with all required safety and compliance guardrails in place.
- Image AI Match: If relevant, Answer AI can also match images from the source content to further enrich answers.
- Streaming Results: The final answer—potentially including relevant images—is streamed back to the user in real time.
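The steps above can be sketched as a single retrieval-and-answer function. This is an illustrative outline only, not the actual ai12z implementation: `vector_db.search`, `llm.stream`, and the prompt wording are hypothetical stand-ins.

```python
def answer_query(query, vector_db, llm, top_k=20):
    """Hypothetical sketch of the Answer AI RAG flow."""
    # 1. ReAct creates the vector query (passed through unchanged here).
    vector_query = query
    # 2. Semantic search against the vector database.
    hits = vector_db.search(vector_query)
    # 3. Context filtering: keep only the most relevant documents.
    context = hits[:top_k]
    # 4. The system prompt grounds the LLM strictly in the retrieved context.
    system_prompt = (
        "Answer strictly from the provided context. "
        "If the answer is not in the context, say you don't know.\n\n"
        + "\n".join(doc["text"] for doc in context)
    )
    # 5. Stream the answer back to the user chunk by chunk.
    for chunk in llm.stream(system_prompt, query):
        yield chunk
```

In practice the platform handles each of these stages internally; the sketch only shows how the pieces relate.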
Key Features
- Contextual RAG: Answers are always grounded in the retrieved content—no hallucinations.
- Guardrails: Strict controls are enforced through the system prompt to ensure safety, compliance, and factual accuracy.
- Integrated with ReAct: Leverages contextual history, user actions, and multimodal content.
- Default Integration: Answer AI is the default integration. ReAct evaluates which integration to use; if the answer cannot come from another integration, it falls back to Answer AI.
Comparison Feature in Answer AI
When comparing multiple products, ReAct enables a parameter called requiresReasoning. In this mode, ReAct sends separate vector queries for each product, calling Answer AI N times in parallel. The results are then aggregated by ReAct to generate a comprehensive comparison, delivering fast and accurate responses.
If requiresReasoning is false, Answer AI streams the response directly to the user without waiting for additional reasoning steps, resulting in faster output for standard questions.
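The fan-out behavior when requiresReasoning is true could be sketched with a thread pool, one Answer AI call per product. The `answer_ai` callable and the aggregation step below are illustrative assumptions, not the actual ai12z implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def compare_products(products, answer_ai):
    """Sketch: call Answer AI once per product, in parallel, then aggregate."""
    # requiresReasoning=True: one vector query per product, N calls in parallel.
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(answer_ai, products))
    # ReAct then aggregates the N answers into a single comparison.
    return dict(zip(products, answers))
```

The key point is that the N retrievals are independent, so running them concurrently keeps the comparison fast even as the number of products grows.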
Use Cases
- Knowledge base Q&A for websites and internal portals
- Document retrieval and compliance
- Product or support search, with live content grounding
- Any scenario where you need trustworthy, explainable answers based on your own data
Dynamic Tokens
Dynamic tokens are placeholders within a prompt that are replaced with dynamic content when the prompt is processed. Here is a list of dynamic tokens you can use:
- {query}: Inserts the user's question.
- {vector-query}: Inserts the vector query derived from the user's question.
- {history}: Inserts the conversation history.
- {title}: Inserts the title of the page where the search is conducted.
- {origin}: Inserts the URL of the page.
- {language}: Inserts the language of the page.
- {referrer}: Inserts the referring URL that directed the user to the current page.
- {attributes}: Inserts additional attributes that JavaScript adds to the search control.
- {org_name}: Inserts the name of the organization for which the Agent is configured.
- {purpose}: Inserts the stated purpose of the AI bot.
- {org_url}: Inserts the domain URL of the organization the bot is configured for.
- {context_data}: Inserts the content retrieved from the vector database, used by Answer AI as context.
- {tz=America/New_York}: Inserts the current date and time in the specified timezone for the LLM to use.
- {image_upload_description}: Inserts descriptions, generated by vision AI, of images uploaded by the site visitor into the search box.
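Conceptually, dynamic token substitution works like simple template replacement. The platform resolves tokens internally; the sketch below only illustrates the idea, leaving unknown tokens untouched:

```python
import re

def fill_tokens(prompt, values):
    """Replace each {token} with its value; unknown tokens are left as-is."""
    def repl(match):
        token = match.group(1)
        return str(values.get(token, match.group(0)))
    return re.sub(r"\{([^{}]+)\}", repl, prompt)
```

For example, `fill_tokens("Q: {query} on {origin}", {"query": "hours?", "origin": "https://example.com"})` fills both tokens, while a token with no value stays literal.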
Models that ai12z supports for RAG
- GPT-4o: A next-generation GPT variant focused on robust reasoning and versatile language understanding.
- GPT-4o-mini (default): A scaled-down version of GPT-4o providing faster responses and reduced resource usage, while still maintaining good quality.
- Llama 3.2 Instruct (1B): A small, instruction-tuned Llama model optimized for task-following on a modest scale.
- Llama 3.2 Instruct (3B): A mid-sized instruct model that balances efficiency with improved fluency and accuracy.
- Llama 3.2 Instruct (11B): A larger Llama instruct model providing more coherent, context-aware responses than smaller counterparts.
- Llama 3.2 Instruct (90B): A highly capable, large-scale Llama instruct model designed for complex tasks and in-depth reasoning.
- Claude 3.5 Sonnet: A creative variant of Claude 3.5 tuned for expressive, structured writing like poetry or stylized prose.
- Claude 3 Opus: A versatile Claude 3 model offering enhanced context management and refined long-form content generation.
- Claude 3 Haiku: A succinct Claude 3 variant focusing on generating short, elegant, and poetic responses.
- Gemini-1.5-Flash: A fast, nimble generation model that emphasizes quick, responsive text outputs at version 1.5.
- Gemini-1.5-Pro: An upgraded Gemini model with professional-level language capabilities for more demanding tasks.
- Gemini-1.5-Flash-8B: An 8-billion-parameter Flash variant of Gemini 1.5, offering rapid output at lower cost.
Defining System and User Prompts
You can re-edit the Agent information and check the "Recreate Prompts" checkbox; the AI will then use the information you entered when creating the Agent to recreate the system prompt. You can always go back and re-edit those properties. Your existing system prompt is stored in History, so you can go back and compare versions.
Editing System Prompts
When editing the System prompt, consider the following guidelines:
- Keep the instructions clear and concise so the AI can understand them.
Best Practices
- Test the System prompts to see how the AI interprets them with different types of user queries.
- Update prompts regularly to align with changes in Agent goals or organization strategies.
- Monitor the AI's responses to ensure they meet the expected quality and adjust the prompts accordingly.
Saving Changes
After making your changes:
- Review the prompt to ensure accuracy and completeness.
- Click the "Save" button to apply the changes.
- Test the updated prompt with a few queries to ensure it functions as expected.
History
Every time you update the Answer AI System Prompt, a version of it is automatically saved to History so you can track changes over time or revert if needed.
In addition, a new version is saved to History whenever you edit the Agent and select the “Recreate Prompts” option. This ensures that any regenerated prompt versions tied to agent settings are also preserved.
Use this history to maintain version control and transparency over how your AI assistant’s behavior and tone evolve.
Conclusion
Managing prompts effectively is crucial for the optimal performance of your AI interface. By following the guidelines outlined in this document, you can ensure that your AI provides relevant, accurate responses in line with your organization's objectives.