11 KiB
Analyze Images with OpenAI Vision while Preserving Binary Data for Reuse
Analyze Images with OpenAI Vision while Preserving Binary Data for Reuse
1. Workflow Overview
This workflow enables users to upload an image document via a web form and perform AI-powered analysis on the image using OpenAI Vision capabilities. It preserves the original binary data for reuse in subsequent AI processing steps, ensuring both the original image and the analysis results are available for further operations. The workflow is structured into three main logical blocks:
- 1.1 Input Reception: Captures image uploads via an n8n Form Trigger node.
- 1.2 AI Image Analysis: Uses OpenAI Vision to analyze the uploaded image in base64 format.
- 1.3 Data Merging and Further AI Processing: Merges the original image binary data with the AI analysis result, then sends both to an AI Agent for additional examination.
This structure supports use cases requiring image content analysis while retaining access to the original image data, such as document processing, quality control, or content verification workflows.
2. Block-by-Block Analysis
2.1 Input Reception
Overview:
This block receives an image upload from a user through a web form, collecting the binary data to be used downstream.
Nodes Involved:
- Form Trigger1
Node Details:
- Form Trigger1
- Type & Role: n8n’s Form Trigger node, acts as a webhook endpoint that presents a form to the user for data input.
- Configuration:
- Path: Unique webhook path generated by n8n for form access.
- Form Title: "Image Document Upload"
- Form Description: "Upload a image document for AI analysis"
- Form Fields: One file upload field labeled "data".
- Key Expressions/Variables: None; captures the uploaded file under the field name "data".
- Input/Output: No input nodes; outputs the uploaded file data in binary form.
- Potential Failures: User upload errors, file size limits, unsupported file types, webhook trigger failures.
- Version: TypeVersion 2.
2.2 AI Image Analysis
Overview:
Processes the uploaded image with OpenAI’s Vision model to analyze the content and generate textual insights.
Nodes Involved:
- Analyze image
Node Details:
- Analyze image
- Type & Role: LangChain OpenAI node specialized for image analysis.
- Configuration:
- Model: "gpt-4o" (GPT-4O) selected from a cached list.
- Resource: "image" to specify image processing.
- Operation: "analyze" to extract visual information.
- Input Type: "base64" — the image is passed as a base64-encoded string.
- Text parameter: Evaluates to
=data, referencing the uploaded file’s base64 data.
- Credentials: Uses OpenAI API credentials named "OpenAi account 4".
- Input/Output: Input from Form Trigger1; outputs analyzed content as text.
- Potential Failures: API authentication errors, rate limits, invalid base64 encoding, unsupported image formats, timeouts.
- Version: TypeVersion 1.8.
2.3 Data Merging and Further AI Processing
Overview:
Merges the original binary image data with the analysis results, allowing the next AI Agent node to access both data sets simultaneously. The AI Agent then re-analyzes the image content to validate or extend the initial insights.
Nodes Involved:
- Merge1
- AI Agent
- OpenAI Chat Model
Node Details:
-
Merge1
- Type & Role: Merge node combining two input streams.
- Configuration:
- Mode: "combine"
- CombineBy: "combineByPosition" — merges items based on their order in each input stream, preserving the binary data alongside the analysis text.
- Input/Output:
- Inputs:
- From "Analyze image" (analysis result)
- From "Form Trigger1" (original binary upload)
- Output: Combined data with both original binary (
data) and analysis (content).
- Inputs:
- Potential Failures: Misalignment of inputs causing data mismatch, missing inputs, or data corruption.
- Version: TypeVersion 3.2.
-
AI Agent
- Type & Role: LangChain AI Agent node for advanced language model interactions.
- Configuration:
- Text parameter: Combines the original binary data and the analysis content with the expression
=data\n {{ $json.content }}. - System Message: "analyze the image again and see if you get the same result." to prompt re-examination.
- Prompt Type: "define" — uses a defined prompt structure.
- Text parameter: Combines the original binary data and the analysis content with the expression
- Input/Output: Input from Merge1; outputs refined AI analysis results.
- Potential Failures: Expression errors if data fields are missing, API errors, response delays.
- Version: TypeVersion 2.2.
-
OpenAI Chat Model
- Type & Role: Language model node that provides chat completions for the AI Agent.
- Configuration:
- Model: "gpt-4.1-mini" selected from a list.
- Options: Default.
- Credentials: Uses the same OpenAI API credential ("OpenAi account 4").
- Input/Output: Connected as the AI language model backend for the AI Agent node.
- Potential Failures: API key issues, rate limits, model unavailability.
- Version: TypeVersion 1.2.
Sticky Notes Context:
- One large sticky note explains the importance and mechanism of preserving the binary data after the OpenAI Vision analysis by merging it back using the Merge node (
combineByPosition). - Another sticky note provides contact information for customization assistance.
3. Summary Table
| Node Name | Node Type | Functional Role | Input Node(s) | Output Node(s) | Sticky Note |
|---|---|---|---|---|---|
| Form Trigger1 | Form Trigger | Receives image upload via form | None | Analyze image, Merge1 | 📬 Contact details for customization in Sticky Note3 (duplicated here for visibility) |
| Analyze image | LangChain OpenAI (Image) | Analyzes uploaded image content | Form Trigger1 | Merge1 | Explains preservation of binary data via Merge1 in Sticky Note |
| Merge1 | Merge | Combines original image data and analysis result | Form Trigger1, Analyze image | AI Agent | Explains binary preservation strategy |
| AI Agent | LangChain AI Agent | Re-analyzes image with combined data | Merge1 | OpenAI Chat Model | Explains binary preservation strategy |
| OpenAI Chat Model | LangChain OpenAI Chat Model | Provides chat completions for AI Agent | AI Agent (ai_languageModel) | None |
4. Reproducing the Workflow from Scratch
-
Create Form Trigger Node
- Type: Form Trigger
- Set webhook path (auto-generated or custom)
- Configure form title: "Image Document Upload"
- Configure form description: "Upload a image document for AI analysis"
- Add one field: File upload labeled "data"
- Save and activate webhook.
-
Create Analyze Image Node
- Type: LangChain OpenAI (OpenAi node specialized for images)
- Set resource to "image"
- Set operation to "analyze"
- Set model ID to "gpt-4o" (or equivalent GPT-4O model)
- Set inputType to "base64"
- Set text parameter to reference input data:
=data(the base64 content from Form Trigger) - Assign OpenAI API credentials (create or select existing, e.g., "OpenAi account 4").
- Connect Form Trigger's output to this node.
-
Create Merge Node
- Type: Merge
- Set mode to "combine"
- Set combineBy to "combineByPosition"
- Connect Analyze Image node output to Merge Node input 1.
- Connect Form Trigger output to Merge Node input 2 (ensuring original binary data is preserved).
-
Create AI Agent Node
- Type: LangChain AI Agent
- Set prompt type to "define"
- Set system message to: "analyze the image again and see if you get the same result."
- Set text parameter combining original data and analysis:
=data\n {{ $json.content }} - Connect Merge node output to AI Agent input.
-
Create OpenAI Chat Model Node
- Type: LangChain OpenAI Chat Model
- Set model to "gpt-4.1-mini" (or equivalent)
- Assign the same OpenAI API credentials as used above.
- Connect AI Agent node's ai_languageModel input to this node.
-
Workflow Connections Summary
- Form Trigger1 → Analyze image (main)
- Form Trigger1 → Merge1 (second input)
- Analyze image → Merge1 (first input)
- Merge1 → AI Agent
- AI Agent (ai_languageModel) → OpenAI Chat Model
-
Activate the workflow to allow image uploads and trigger the analysis pipeline.
5. General Notes & Resources
| Note Content | Context or Link |
|---|---|
| This workflow demonstrates how to preserve and reuse an uploaded file (binary/base64) after a downstream step by merging the outputs, enabling complex AI workflows on images. | Sticky Note in workflow explaining binary preservation |
| Contact for customization help: Robert Breen, email: rbreen@ynteractive.com, LinkedIn: https://www.linkedin.com/in/robert-breen-29429625/, Website: https://ynteractive.com | Sticky Note3 |
Disclaimer: The provided text is exclusively derived from an automated n8n workflow. It adheres strictly to current content policies and contains no illegal or offensive material. All handled data is legal and public.