Image AI
Overview
Image AI is a feature that extends Large Language Model (LLM) responses by adding visual context to text-based processing. With Image AI, users receive image recommendations that are relevant to the LLM's textual response, which makes the experience multimodal. This document describes how to use Image AI to match images with text-based answers.
Use Case
Image AI is particularly useful when a user needs a visual representation or confirmation of concepts discussed in the LLM's answers. For example, if the LLM describes a historical event, Image AI can supply relevant images that give the description visual context.
How It Works
Upon receiving a query and the LLM's answer, Image AI analyzes the available images using their alt text and src attributes. It then selects the image that best matches the answer.
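The selection step described above can be illustrated with a minimal sketch. The source does not say how relevance is computed, so this example substitutes a simple word-overlap score between the answer and each image's alt text and src; the Image class, tokenize, and best_match are hypothetical names used only for illustration and are not part of the product's API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Image:
    """Hypothetical record holding the two attributes the text mentions."""
    src: str
    alt: str


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_match(answer: str, images: List[Image]) -> Optional[Image]:
    """Return the image whose alt text and src share the most words with
    the answer, or None when no image shares any word.

    Word overlap is a stand-in assumption; the real feature may use a
    different relevance measure.
    """
    answer_tokens = tokenize(answer)
    best, best_score = None, 0
    for image in images:
        score = len(answer_tokens & tokenize(image.alt + " " + image.src))
        if score > best_score:
            best, best_score = image, score
    return best
```

In this sketch, best_match returns None when nothing overlaps, which leaves room for showing no image at all rather than a poor match.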
- Image Match AI:
  - When enabled, this feature allows the system to display related images alongside bot responses. This can enhance the interaction by adding visual context to the bot's textual information. Note that additional costs may apply when this feature is enabled.
- Image Description AI:
  - When enabled, this feature automatically generates descriptions for images with OpenAI Vision as they are ingested. The detailed descriptions improve understanding and accessibility. Additional costs may also apply when this feature is enabled.
- Minimum Image Width:
  - This setting defines the smallest width (in pixels) an image must have to be processed. Images narrower than this threshold are ignored. The default is 100 pixels, which helps filter out thumbnails and icons that typically do not contain useful information for analysis.
- Minimum Image Size in Bytes:
  - This setting defines the minimum file size (in bytes) an image must have to be processed. The default is 2000 bytes, which helps exclude very low-quality images that are unlikely to contribute valuable insights.
- Image Quality:
  - Users can select the desired image quality from three options: Low Resolution, Auto Resolution, and High Resolution.
    - Low Resolution: Processes images at a lower resolution (512px x 512px), using fewer tokens for representation. Suitable for quick overview tasks.
    - Auto Resolution: The AI system decides the best detail level based on the input image size and content complexity. Balances detail and performance automatically.
    - High Resolution: Provides the highest level of detail. It consumes more tokens but is necessary for fine-grained analysis.
- Create Image Descriptions Button:
  - Once the "Enable Image Description AI" feature is activated, newly added images in PDFs and web documents are processed automatically to generate descriptions. This button triggers the same processing for images in documents that were uploaded before the feature was activated, so that previously uploaded documents also benefit from image descriptions. The job only processes images that have not been processed before.
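The settings above can be read as a filter plus a detail level, as in the following sketch. The class and function names (ImageSettings, IngestedImage, should_process, vision_image_part) are hypothetical and do not come from the product. The mapping of the three quality labels to the "low", "auto", and "high" values of the detail field in OpenAI's image input format is an assumption based on the mention of OpenAI Vision; the source does not confirm it, nor does it state a default quality.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from the UI labels to the `detail` values that the
# OpenAI Chat Completions API accepts for image inputs.
DETAIL_BY_QUALITY = {
    "Low Resolution": "low",
    "Auto Resolution": "auto",
    "High Resolution": "high",
}


@dataclass
class ImageSettings:
    min_width_px: int = 100            # default stated in the text
    min_size_bytes: int = 2000         # default stated in the text
    quality: str = "Auto Resolution"   # assumed default, not stated in the text


@dataclass
class IngestedImage:
    src: str
    width_px: int
    size_bytes: int
    description: Optional[str] = None  # None means not yet processed


def should_process(image: IngestedImage, settings: ImageSettings) -> bool:
    """Return True if the image still needs a description and passes both thresholds."""
    if image.description is not None:
        return False  # the job skips images that were already processed
    return (
        image.width_px >= settings.min_width_px
        and image.size_bytes >= settings.min_size_bytes
    )


def vision_image_part(image: IngestedImage, settings: ImageSettings) -> dict:
    """Build an image content part in the Chat Completions message format."""
    return {
        "type": "image_url",
        "image_url": {
            "url": image.src,
            "detail": DETAIL_BY_QUALITY[settings.quality],
        },
    }
```

Under these assumptions, a 32-pixel-wide icon of 900 bytes would be skipped, while a 1200-pixel-wide photo without a description would be passed on for processing.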