Content Ingestion

Overview

The Documents section serves as the central hub for all content that your AI Agent uses to answer user queries. By uploading and managing comprehensive documentation here, you equip your Agent to deliver accurate, relevant, and helpful responses to customers, prospects, and employees.

How Content Ingestion Works

When you add content, the ingestion process automatically:

  • Extracts images and generates detailed descriptions for use by Image AI.
  • Splits documents into chunks for vectorization, storing each chunk's embedding, text, and metadata (including associated images).
  • Uses JSON-LD structured data, when present, to improve understanding of web content.
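
The chunking step can be pictured with a minimal sketch. This is not the platform's implementation: the `Chunk` shape, the `split_into_chunks` name, the word-window strategy, and the `max_words` and `overlap` defaults are all assumptions chosen for illustration. Embedding generation is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One unit prepared for embedding: text plus metadata (hypothetical shape)."""
    text: str
    metadata: dict = field(default_factory=dict)


def split_into_chunks(text, source, max_words=200, overlap=20):
    """Split text into overlapping word windows.

    max_words and overlap are illustrative defaults, not platform settings.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(Chunk(
            text=" ".join(window),
            metadata={"source": source, "start_word": start, "word_count": len(window)},
        ))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlapping windows are one common way to keep a sentence that straddles a boundary retrievable from either neighboring chunk; the actual chunk boundaries the platform uses may differ.
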

Best Practice for Websites: Use a CMS Connector

Connect your CMS directly for automated, ongoing sync of your site’s content. Enable CMS connectors in organization settings, then select the connector in your Agent configuration.

ai12z Documents tab showing options to upload files, add URLs, or ingest entire websites for AI knowledge base.

Ways to Add Content

Adding Files

  • Click Add a new file to upload documents (e.g., product guides, PDFs, sales decks).
  • Supported file types include: .pdf, .docx, .pptx, .xlsx, .csv, .txt, .json, .markdown, .md, and more.
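
Before a bulk upload, it can help to sort local files by extension. The sketch below checks only the extensions named in the list above; because the platform accepts more types than are listed, an "unlisted" result means "verify in the UI", not "unsupported". The function names are hypothetical.

```python
from pathlib import Path

# Extensions named in the list above. The platform accepts more ("and more"),
# so treat this set as a partial, illustrative allowlist.
LISTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".csv",
    ".txt", ".json", ".markdown", ".md",
}


def is_listed_type(filename):
    """Return True if the file's extension is one of the listed types."""
    return Path(filename).suffix.lower() in LISTED_EXTENSIONS


def partition_files(filenames):
    """Split filenames into (listed, unlisted) for a pre-upload review."""
    listed = [f for f in filenames if is_listed_type(f)]
    unlisted = [f for f in filenames if not is_listed_type(f)]
    return listed, unlisted
```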

Adding URLs

  • Use Add URL to ingest specific web pages or resources (including YouTube videos and public documents).
  • Paste the URL; the content is fetched and added to your knowledge base.

Adding Entire Websites

  • Select Add Website to crawl and ingest a complete site. This option suits corporate sites, knowledge bases, and large catalogs.
  • The crawler automatically discovers and ingests linked, relevant pages.
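
Link discovery during a crawl can be sketched as follows. This is an illustration of the general technique, not the platform's crawler: the `discover_links` function, the same-host rule used as a stand-in for "relevant", and the fragment stripping are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_links(page_url, html):
    """Return same-host absolute URLs linked from the page, deduplicated in order."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    seen, links = set(), []
    for href in collector.hrefs:
        # Resolve relative links, then drop any #fragment so anchors do not
        # produce duplicate entries for the same page.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links
```

A full crawler would call this on each fetched page and queue the results until no new URLs appear.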

Adding and Managing Documents

  • In the Documents section, click Add Document and choose whether to add a file, URL, or website.
  • Each uploaded document appears in the list with its name, description, and last modified date.
  • Use the Action menu (three dots) for each document to:
    • Info: See document details (type, size, upload date, etc.).
    • Continue Ingest: Complete ingestion steps if you enabled features like histogram analysis.
    • Sync: Check for changes and update ingested websites automatically.
    • Delete: Remove obsolete or irrelevant content.
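
How Sync decides what changed is not documented here. One common approach, shown below as a sketch under that caveat, is to store a content fingerprint per URL at ingest time and compare it against freshly fetched content. The `fingerprint` and `plan_sync` names and the dictionary shapes are hypothetical.

```python
import hashlib


def fingerprint(content):
    """Return a SHA-256 hex digest of the page text."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_sync(stored, fetched):
    """Compare stored fingerprints with freshly fetched content.

    stored:  dict mapping URL -> fingerprint from the previous ingest
    fetched: dict mapping URL -> current page text
    """
    plan = {"added": [], "changed": [], "removed": [], "unchanged": []}
    for url, content in fetched.items():
        if url not in stored:
            plan["added"].append(url)
        elif stored[url] != fingerprint(content):
            plan["changed"].append(url)
        else:
            plan["unchanged"].append(url)
    # Anything ingested before but missing from the new fetch was removed.
    plan["removed"] = [url for url in stored if url not in fetched]
    return plan
```

Only the "added" and "changed" URLs would need re-chunking and re-embedding, which is why hash comparison is a popular way to keep repeat syncs cheap.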

Document list view with action menu for managing individual documents.

Processing Multi-Step Ingestions

  • For some sites and large documents, ingestion may be multi-step:
    1. Select Continue Ingest when prompted (if, for example, you opted for advanced features like histograms).
    2. Complete the required steps to ensure all content is fully processed and searchable.

Document Status & Insights

Every document or asset has a Document Information panel with:

  • Basic details (IDs, upload status, last sync).
  • Tabs for Vector Documents (all processed chunks stored in the vector DB) and Settings (ingestion rules, language filters, etc.).

Document status and vector tab UI, showing document info, settings, and processed vector chunks.

Vector tab with columns for Title, Description, Page Content, URL, Word Count, and action menu.

Settings tab with include/exclude patterns and selected languages.
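
The include/exclude patterns in the Settings tab can be reasoned about with a small filter. The pattern syntax the platform uses is not specified here, so this sketch assumes shell-style wildcards, assumes that exclude rules take precedence, and assumes that an empty include list admits everything. The `url_allowed` name is hypothetical.

```python
from fnmatch import fnmatchcase


def url_allowed(url, include=(), exclude=()):
    """Decide whether a URL passes the filters.

    Assumed semantics: any matching exclude pattern rejects the URL;
    otherwise an empty include list admits it, and a non-empty include
    list requires at least one match.
    """
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    if not include:
        return True
    return any(fnmatchcase(url, pattern) for pattern in include)
```

For example, with `include=["https://example.com/docs/*"]` and `exclude=["*/archive/*"]`, a page under `/docs/` passes unless its path also contains `/archive/`.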

Best Practices for Document Management

  • Relevance: Only upload content that aligns with the questions your Agent will receive.
  • Organization: Tag and categorize documents for easier management and retrieval.
  • Keep Updated: Regularly sync and update content to ensure users always receive the latest information.
  • Review Frequently: Use the Info action to verify content types and details; remove outdated or duplicate files.

A well-maintained Documents section gives your AI Agent access to up-to-date, high-quality information, which improves accuracy and user satisfaction.