mirror of
https://github.com/khoaliber/n8nworkflows.xyz.git
synced 2026-04-19 17:14:37 +00:00
creation
This commit is contained in:
@@ -0,0 +1,705 @@
|
||||
Generate 8-second product ad videos from Drive images with Gemini and Veo
|
||||
|
||||
https://n8nworkflows.xyz/workflows/generate-8-second-product-ad-videos-from-drive-images-with-gemini-and-veo-13920
|
||||
|
||||
|
||||
# Generate 8-second product ad videos from Drive images with Gemini and Veo
|
||||
|
||||
# 1. Workflow Overview
|
||||
|
||||
This workflow generates an 8-second advertising video from a single product image stored in Google Drive. It analyzes the image with Gemini, converts that analysis into a structured short-form ad prompt, sends the prompt plus the source image to Google Veo as a long-running video generation request, polls until the video is ready, downloads the MP4, and uploads the final result back to Google Drive.
|
||||
|
||||
Typical use cases:
|
||||
- Rapid creation of short product ad videos from static product shots
|
||||
- AI-assisted ad concept generation for e-commerce or social media
|
||||
- Automated creative production pipelines using Google Drive as input/output storage
|
||||
|
||||
## 1.1 Input Reception and Image Preparation
|
||||
|
||||
The workflow starts manually, downloads a selected image from Google Drive, and prepares two parallel forms of the same asset:
|
||||
- binary image data for Gemini image analysis
|
||||
- base64-encoded image data for the Veo API request
|
||||
|
||||
## 1.2 Image Understanding and Prompt Creation
|
||||
|
||||
Gemini first analyzes the uploaded product image and produces a structured visual brief. A second LLM chain then converts that brief into an advertising-oriented video prompt, with a structured output parser enforcing a JSON field called `video_prompt`.
|
||||
|
||||
## 1.3 Prompt/Image Merge and Video Generation Request
|
||||
|
||||
The workflow merges:
|
||||
- the base64 image from the extraction branch
|
||||
- the structured video prompt from the AI branch
|
||||
|
||||
It then sends both to the Veo long-running prediction endpoint with generation parameters such as aspect ratio, resolution, and duration.
|
||||
|
||||
## 1.4 Long-Running Job Polling
|
||||
|
||||
Because video generation is asynchronous, the workflow polls the operation URL returned by the Veo request. It waits 30 seconds between checks and loops until a generated video URI becomes available.
|
||||
|
||||
## 1.5 Video Download and Drive Upload
|
||||
|
||||
Once the video URI exists, the workflow downloads the generated MP4 and uploads it into a target Google Drive folder with a timestamped filename.
|
||||
|
||||
---
|
||||
|
||||
# 2. Block-by-Block Analysis
|
||||
|
||||
## 2.1 Input Reception and Image Preparation
|
||||
|
||||
### Overview
|
||||
This block triggers the workflow, retrieves the source image from Google Drive, and prepares the file for downstream AI and API usage. It splits into two branches: one for image analysis and one for base64 encoding.
|
||||
|
||||
### Nodes Involved
|
||||
- When clicking 'Test workflow'
|
||||
- Download ad image
|
||||
- Extract Model Image
|
||||
- Section – Input image
|
||||
|
||||
### Node Details
|
||||
|
||||
#### 1) When clicking 'Test workflow'
|
||||
- **Type and role:** `Manual Trigger`; starts the workflow manually from the editor.
|
||||
- **Configuration choices:** No custom parameters.
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:** No input; outputs to **Download ad image**.
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:** None beyond normal manual execution constraints.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 2) Download ad image
|
||||
- **Type and role:** `Google Drive`; downloads the source product image.
|
||||
- **Configuration choices:**
|
||||
- Operation: `download`
|
||||
- File selected explicitly by file ID
|
||||
- Downloaded binary stored under property `ad_img`
|
||||
- **Key expressions or variables used:** None in expressions; file chosen from Drive picker.
|
||||
- **Input and output connections:**
|
||||
- Input from **When clicking 'Test workflow'**
|
||||
- Outputs to **Extract Model Image** and **Creative Visualiser**
|
||||
- **Version-specific requirements:** Type version `3`.
|
||||
- **Edge cases / failures:**
|
||||
- Missing or revoked Google Drive OAuth2 credentials
|
||||
- File no longer exists or access revoked
|
||||
- Wrong file type relative to downstream hardcoded MIME assumption (`image/png` later)
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 3) Extract Model Image
|
||||
- **Type and role:** `Extract From File`; converts binary image content into a property for reuse.
|
||||
- **Configuration choices:**
|
||||
- Operation: binary to property
|
||||
- Source binary property: `ad_img`
|
||||
- Destination field: `ad_img_base64`
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:**
|
||||
- Input from **Download ad image**
|
||||
- Output to **Merge**
|
||||
- **Version-specific requirements:** Type version `1.1`.
|
||||
- **Edge cases / failures:**
|
||||
- Missing binary property `ad_img`
|
||||
- Non-binary or corrupted binary input
|
||||
- Large images may increase payload size to Veo
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 4) Section – Input image
|
||||
- **Type and role:** `Sticky Note`; documentation only.
|
||||
- **Configuration choices:** Comment: “Download the input image from Drive and convert it to base64.”
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:** None.
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:** None.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
---
|
||||
|
||||
## 2.2 Image Understanding and Prompt Creation
|
||||
|
||||
### Overview
|
||||
This block uses Gemini to analyze the product image, then passes the resulting description into a second AI chain that turns the analysis into a concise advertising-oriented video script/prompt. The structured parser ensures the chain returns a predictable `video_prompt` field.
|
||||
|
||||
### Nodes Involved
|
||||
- Creative Visualiser
|
||||
- Google Gemini Chat Model1
|
||||
- Product Video Prompt
|
||||
- Structured Output Parser
|
||||
- Section – Create video prompt
|
||||
- Sample Promt
|
||||
|
||||
### Node Details
|
||||
|
||||
#### 1) Creative Visualiser
|
||||
- **Type and role:** `Google Gemini` image analysis node; performs visual analysis on the source image.
|
||||
- **Configuration choices:**
|
||||
- Resource: `image`
|
||||
- Operation: `analyze`
|
||||
- Input type: `binary`
|
||||
- Binary property: `ad_img`
|
||||
- Model: `models/gemini-2.5-flash`
|
||||
- Prompt instructs Gemini to return a “Visual Technical Brief” with strict sections:
|
||||
- ENTITY
|
||||
- VISUAL ATTRIBUTES
|
||||
- LIGHTING SETUP
|
||||
- BACKGROUND & VIBE
|
||||
- TECHNICAL SPEC
|
||||
- **Key expressions or variables used:** Binary input `ad_img`.
|
||||
- **Input and output connections:**
|
||||
- Input from **Download ad image**
|
||||
- Output to **Product Video Prompt**
|
||||
- **Version-specific requirements:** Type version `1.1`; requires Gemini-compatible credentials in n8n.
|
||||
- **Edge cases / failures:**
|
||||
- Gemini credential/auth issues
|
||||
- Unsupported or damaged image
|
||||
- Model output may still vary despite strict formatting instructions
|
||||
- If binary property name changes, node fails
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 2) Google Gemini Chat Model1
|
||||
- **Type and role:** `Google Gemini Chat Model`; acts as the LLM backend for the chain node.
|
||||
- **Configuration choices:** Default options; no explicit model override shown here.
|
||||
- **Key expressions or variables used:** None directly.
|
||||
- **Input and output connections:**
|
||||
- AI language model connection into **Product Video Prompt**
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:**
|
||||
- Credential issues
|
||||
- Model/provider rate limiting
|
||||
- Output inconsistency if model defaults change
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 3) Product Video Prompt
|
||||
- **Type and role:** `LangChain LLM Chain`; transforms the visual analysis into a short ad-oriented prompt.
|
||||
- **Configuration choices:**
|
||||
- Prompt type: define
|
||||
- Input text: `Image description: {{ $json.content.parts[0].values() }}`
|
||||
- Uses a long system-style instruction asking the model to behave as a Creative Director and Copywriter
|
||||
- Requires English output
|
||||
- Constrains pacing for an 8-second ad
|
||||
- Asks for tone, visuals, dialogue, and audio/SFX
|
||||
- Has an output parser attached
|
||||
- **Key expressions or variables used:**
|
||||
- `{{ $json.content.parts[0].values() }}`
|
||||
- **Input and output connections:**
|
||||
- Main input from **Creative Visualiser**
|
||||
- AI language model input from **Google Gemini Chat Model1**
|
||||
- AI output parser input from **Structured Output Parser**
|
||||
- Main output to **Merge**
|
||||
- **Version-specific requirements:** Type version `1.6`.
|
||||
- **Edge cases / failures:**
|
||||
- Expression fragility: `content.parts[0].values()` assumes a specific Gemini output shape
|
||||
- Prompt text contains contradictory commentary: sticky note says “8s Vietnamese script” but node prompt says “Use natural English”
|
||||
- If parsed output does not match parser schema, chain may fail
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 4) Structured Output Parser
|
||||
- **Type and role:** `Structured Output Parser`; forces chain output into a schema.
|
||||
- **Configuration choices:**
|
||||
- Manual schema
|
||||
- Requires JSON object with:
|
||||
- `video_prompt` string
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:**
|
||||
- AI output parser connection to **Product Video Prompt**
|
||||
- **Version-specific requirements:** Type version `1.3`.
|
||||
- **Edge cases / failures:**
|
||||
- The chain prompt asks for multi-section output, but the parser requires only `video_prompt`
|
||||
- If the model returns non-JSON or extraneous text, parsing can fail
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 5) Section – Create video prompt
|
||||
- **Type and role:** `Sticky Note`; explains the function of this block.
|
||||
- **Configuration choices:** Comment says Gemini turns the image brief into an 8s Vietnamese script.
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:** None.
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:** None.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 6) Sample Promt
|
||||
- **Type and role:** `Set`; disabled helper node containing sample prompts in pinned data.
|
||||
- **Configuration choices:** Disabled; no effective runtime role.
|
||||
- **Key expressions or variables used:** None in active config.
|
||||
- **Input and output connections:** None.
|
||||
- **Version-specific requirements:** Type version `3.4`.
|
||||
- **Edge cases / failures:** None, because disabled.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
---
|
||||
|
||||
## 2.3 Prompt/Image Merge and Video Generation Request
|
||||
|
||||
### Overview
|
||||
This block combines the AI-generated prompt and the base64-encoded image into one item, then sends them to the Veo long-running video generation endpoint.
|
||||
|
||||
### Nodes Involved
|
||||
- Merge
|
||||
- Generate Video
|
||||
- Section – Generate video
|
||||
|
||||
### Node Details
|
||||
|
||||
#### 1) Merge
|
||||
- **Type and role:** `Merge`; combines both branches into a single item.
|
||||
- **Configuration choices:**
|
||||
- Mode: `combine`
|
||||
- Combine by: `position`
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:**
|
||||
- Input 1 from **Extract Model Image**
|
||||
- Input 2 from **Product Video Prompt**
|
||||
- Output to **Generate Video**
|
||||
- **Version-specific requirements:** Type version `3.1`.
|
||||
- **Edge cases / failures:**
|
||||
- Position-based merge assumes both branches emit matching item counts in matching order
|
||||
- If one branch fails or returns no item, merge result may be empty or malformed
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 2) Generate Video
|
||||
- **Type and role:** `HTTP Request`; calls Veo long-running generation API.
|
||||
- **Configuration choices:**
|
||||
- Method: `POST`
|
||||
- URL: `https://generativelanguage.googleapis.com/v1beta/models/veo-3.1-fast-generate-preview:predictLongRunning`
|
||||
- Authentication: generic header auth
|
||||
- Header: `Content-Type: application/json`
|
||||
- JSON body includes:
|
||||
- `image.bytesBase64Encoded` from `{{ $json.ad_img_base64 }}`
|
||||
- `image.mimeType` hardcoded to `image/png`
|
||||
- `prompt` from `{{ JSON.stringify($json.output.video_prompt) }}`
|
||||
- parameters:
|
||||
- `aspectRatio`: `16:9`
|
||||
- `resolution`: `720p`
|
||||
- `durationSeconds`: `8`
|
||||
- `sampleCount`: `1`
|
||||
- **Key expressions or variables used:**
|
||||
- `{{ $json.ad_img_base64 }}`
|
||||
- `{{ JSON.stringify($json.output.video_prompt) }}`
|
||||
- **Input and output connections:**
|
||||
- Input from **Merge**
|
||||
- Output to **Get URL Download**
|
||||
- **Version-specific requirements:** Type version `4.4`.
|
||||
- **Edge cases / failures:**
|
||||
- Invalid or expired API key in header auth credential
|
||||
- Request body mismatch with current Veo API requirements
|
||||
- Hardcoded `image/png` may be wrong if Drive file is JPEG/WebP
|
||||
- Large base64 payloads may hit API size limits
|
||||
- Preview model names may change or be region-restricted
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 3) Section – Generate video
|
||||
- **Type and role:** `Sticky Note`; explains the generation step.
|
||||
- **Configuration choices:** Comment: “Start the Veo job using the image + prompt.”
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:** None.
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:** None.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
---
|
||||
|
||||
## 2.4 Long-Running Job Polling
|
||||
|
||||
### Overview
|
||||
This block repeatedly checks the Veo operation endpoint until the generated video URI becomes available. It uses a wait loop and an IF condition to branch between “ready” and “not ready yet”.
|
||||
|
||||
### Nodes Involved
|
||||
- Get URL Download
|
||||
- Wait
|
||||
- If
|
||||
|
||||
### Node Details
|
||||
|
||||
#### 1) Get URL Download
|
||||
- **Type and role:** `HTTP Request`; polls the operation resource returned by Veo.
|
||||
- **Configuration choices:**
|
||||
- URL expression: `https://generativelanguage.googleapis.com/v1beta/{{ $json.name }}`
|
||||
- Method defaults to GET
|
||||
- Authentication: generic header auth
|
||||
- Header includes `Content-Type: application/json`
|
||||
- **Key expressions or variables used:**
|
||||
- `{{ $json.name }}`
|
||||
- **Input and output connections:**
|
||||
- Input from **Generate Video**
|
||||
- Also input from **If** false branch to continue polling
|
||||
- Output to **Wait**
|
||||
- **Version-specific requirements:** Type version `4.4`.
|
||||
- **Edge cases / failures:**
|
||||
- Assumes initial Veo response contains operation name in `name`
|
||||
- If later loop payload shape differs, URL expression may break
|
||||
- Auth failures or quota limits on repeated polling
|
||||
- Polling interval may be too short or too long for production usage
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 2) Wait
|
||||
- **Type and role:** `Wait`; pauses execution between polling attempts.
|
||||
- **Configuration choices:**
|
||||
- Amount: `30`
|
||||
- Default unit in this node/version is typically seconds unless otherwise set in UI
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:**
|
||||
- Input from **Get URL Download**
|
||||
- Output to **If**
|
||||
- **Version-specific requirements:** Type version `1.1`.
|
||||
- **Edge cases / failures:**
|
||||
- Long-running executions may exceed environment-level limits
|
||||
- Wait node resumes through n8n’s internal wait/resume mechanism; misconfigured instance URLs or webhook settings can affect resumptions in some setups
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 3) If
|
||||
- **Type and role:** `If`; checks whether the generated video URI exists yet.
|
||||
- **Configuration choices:**
|
||||
- Condition type: string `notEmpty`
|
||||
- Checked value: `{{ $json.response.generateVideoResponse.generatedSamples[0].video.uri }}`
|
||||
- **Key expressions or variables used:**
|
||||
- `{{ $json.response.generateVideoResponse.generatedSamples[0].video.uri }}`
|
||||
- **Input and output connections:**
|
||||
- Input from **Wait**
|
||||
- True output to **Get Dowload Video**
|
||||
- False output back to **Get URL Download**
|
||||
- **Version-specific requirements:** Type version `2.3`.
|
||||
- **Edge cases / failures:**
|
||||
- Expression may fail if intermediate objects are missing
|
||||
- If API response format changes, readiness check breaks
|
||||
- Infinite loop possible if job never completes and no timeout guard is added
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
---
|
||||
|
||||
## 2.5 Video Download and Drive Upload
|
||||
|
||||
### Overview
|
||||
Once Veo reports a final asset URI, this block downloads the generated video and uploads it into Google Drive.
|
||||
|
||||
### Nodes Involved
|
||||
- Get Dowload Video
|
||||
- Upload to Drive
|
||||
- Section – Download & upload
|
||||
|
||||
### Node Details
|
||||
|
||||
#### 1) Get Dowload Video
|
||||
- **Type and role:** `HTTP Request`; downloads the generated video from the URI returned by Veo.
|
||||
- **Configuration choices:**
|
||||
- URL expression: `{{ $json.response.generateVideoResponse.generatedSamples[0].video.uri }}`
|
||||
- Authentication: generic header auth
|
||||
- **Key expressions or variables used:**
|
||||
- `{{ $json.response.generateVideoResponse.generatedSamples[0].video.uri }}`
|
||||
- **Input and output connections:**
|
||||
- Input from **If** true branch
|
||||
- Output to **Upload to Drive**
|
||||
- **Version-specific requirements:** Type version `4.4`.
|
||||
- **Edge cases / failures:**
|
||||
- Node name contains a typo (“Dowload”)
|
||||
- If response format is not set to file in node UI, upload step may not receive expected binary data
|
||||
- Download URL may expire
|
||||
- Auth may be required depending on endpoint behavior
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 2) Upload to Drive
|
||||
- **Type and role:** `Google Drive`; uploads the generated video to a Drive folder.
|
||||
- **Configuration choices:**
|
||||
- Filename expression: `{{ 'ad_video_' + $now.toMillis() + '.mp4' }}`
|
||||
- Drive: `My Drive`
|
||||
- Destination folder explicitly selected
|
||||
- `onError`: continue regular output
|
||||
- **Key expressions or variables used:**
|
||||
- `{{ 'ad_video_' + $now.toMillis() + '.mp4' }}`
|
||||
- **Input and output connections:**
|
||||
- Input from **Get Dowload Video**
|
||||
- No downstream node
|
||||
- **Version-specific requirements:** Type version `3`.
|
||||
- **Edge cases / failures:**
|
||||
- OAuth permission issues
|
||||
- Wrong binary property if HTTP download node does not expose file as expected
|
||||
- Folder access may be lost
|
||||
- Because `onError` is set to continue, upload failures may not stop the workflow and can be missed unless execution logs are inspected
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
#### 3) Section – Download & upload
|
||||
- **Type and role:** `Sticky Note`; explains the finalization block.
|
||||
- **Configuration choices:** Comment: “Poll until ready, download MP4, upload to Drive.”
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:** None.
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:** None.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
---
|
||||
|
||||
## 2.6 Global Documentation Note
|
||||
|
||||
### Overview
|
||||
This is a top-level documentation note describing the workflow’s purpose, setup steps, and tunable parameters.
|
||||
|
||||
### Nodes Involved
|
||||
- Main overview
|
||||
|
||||
### Node Details
|
||||
|
||||
#### 1) Main overview
|
||||
- **Type and role:** `Sticky Note`; project overview and setup instructions.
|
||||
- **Configuration choices:** Contains functional summary and setup checklist:
|
||||
1. Connect Google Drive, Gemini, and Veo API credentials
|
||||
2. Set the input image in **Download ad image**
|
||||
3. Set output folder in **Upload to Drive**
|
||||
4. Optionally adjust `aspectRatio`, `resolution`, `durationSeconds` in **Generate Video**
|
||||
- **Key expressions or variables used:** None.
|
||||
- **Input and output connections:** None.
|
||||
- **Version-specific requirements:** Type version `1`.
|
||||
- **Edge cases / failures:** None.
|
||||
- **Sub-workflow reference:** None.
|
||||
|
||||
---
|
||||
|
||||
# 3. Summary Table
|
||||
|
||||
| Node Name | Node Type | Functional Role | Input Node(s) | Output Node(s) | Sticky Note |
|
||||
|---|---|---|---|---|---|
|
||||
| When clicking 'Test workflow' | Manual Trigger | Manual entry point for execution | | Download ad image | ## AI Product Advertising Video<br>### How it works<br>This workflow generates an 8-second product advertising video from a single input image. It downloads the image from Google Drive, converts it to base64 for the API request, analyzes it with Gemini (Creative Visualiser), then turns the description into a short video script/prompt. The prompt + image are sent to Veo to start a long-running video generation job. The workflow polls until a video URI is available, downloads the MP4, and uploads it back to Google Drive.<br>### Setup<br>1) Connect credentials used in this workflow: Google Drive + Google Gemini, and an API key for the Veo HTTP requests.<br>2) Set the input image file in **Download ad image**.<br>3) Set the output folder in **Upload to Drive**.<br>4) (Optional) Adjust `aspectRatio`, `resolution`, and `durationSeconds` in **Generate Video**, then execute the workflow. |
|
||||
| Download ad image | Google Drive | Download source image from Drive as binary | When clicking 'Test workflow' | Extract Model Image; Creative Visualiser | ## Input image<br>Download the input image from Drive and convert it to base64. |
|
||||
| Extract Model Image | Extract From File | Convert binary image to base64 property | Download ad image | Merge | ## Input image<br>Download the input image from Drive and convert it to base64. |
|
||||
| Creative Visualiser | Google Gemini | Analyze source image and produce visual brief | Download ad image | Product Video Prompt | ## Create video prompt<br>Gemini turns the image brief into an 8s Vietnamese script. |
|
||||
| Google Gemini Chat Model1 | Google Gemini Chat Model | LLM backend for ad prompt chain | | Product Video Prompt | ## Create video prompt<br>Gemini turns the image brief into an 8s Vietnamese script. |
|
||||
| Product Video Prompt | LangChain LLM Chain | Convert visual brief into structured video prompt | Creative Visualiser | Merge | ## Create video prompt<br>Gemini turns the image brief into an 8s Vietnamese script. |
|
||||
| Structured Output Parser | Structured Output Parser | Enforce schema with `video_prompt` string | | Product Video Prompt | ## Create video prompt<br>Gemini turns the image brief into an 8s Vietnamese script. |
|
||||
| Sample Promt | Set | Disabled sample prompt holder | | | |
|
||||
| Merge | Merge | Combine image base64 and AI prompt into one item | Extract Model Image; Product Video Prompt | Generate Video | |
|
||||
| Generate Video | HTTP Request | Start Veo long-running video generation job | Merge | Get URL Download | ## Generate video<br>Start the Veo job using the image + prompt. |
|
||||
| Get URL Download | HTTP Request | Poll Veo operation status endpoint | Generate Video; If | Wait | ## Download & upload<br>Poll until ready, download MP4, upload to Drive. |
|
||||
| Wait | Wait | Pause between polling attempts | Get URL Download | If | ## Download & upload<br>Poll until ready, download MP4, upload to Drive. |
|
||||
| If | If | Check if generated video URI is available | Wait | Get Dowload Video; Get URL Download | ## Download & upload<br>Poll until ready, download MP4, upload to Drive. |
|
||||
| Get Dowload Video | HTTP Request | Download generated MP4/video asset | If | Upload to Drive | ## Download & upload<br>Poll until ready, download MP4, upload to Drive. |
|
||||
| Upload to Drive | Google Drive | Upload generated video file back to Drive | Get Dowload Video | | ## Download & upload<br>Poll until ready, download MP4, upload to Drive. |
|
||||
| Main overview | Sticky Note | Global documentation and setup instructions | | | ## AI Product Advertising Video<br>### How it works<br>This workflow generates an 8-second product advertising video from a single input image. It downloads the image from Google Drive, converts it to base64 for the API request, analyzes it with Gemini (Creative Visualiser), then turns the description into a short video script/prompt. The prompt + image are sent to Veo to start a long-running video generation job. The workflow polls until a video URI is available, downloads the MP4, and uploads it back to Google Drive.<br>### Setup<br>1) Connect credentials used in this workflow: Google Drive + Google Gemini, and an API key for the Veo HTTP requests.<br>2) Set the input image file in **Download ad image**.<br>3) Set the output folder in **Upload to Drive**.<br>4) (Optional) Adjust `aspectRatio`, `resolution`, and `durationSeconds` in **Generate Video**, then execute the workflow. |
|
||||
| Section – Input image | Sticky Note | Visual documentation for input block | | | ## Input image<br>Download the input image from Drive and convert it to base64. |
|
||||
| Section – Create video prompt | Sticky Note | Visual documentation for prompt creation block | | | ## Create video prompt<br>Gemini turns the image brief into an 8s Vietnamese script. |
|
||||
| Section – Generate video | Sticky Note | Visual documentation for video generation block | | | ## Generate video<br>Start the Veo job using the image + prompt. |
|
||||
| Section – Download & upload | Sticky Note | Visual documentation for polling and upload block | | | ## Download & upload<br>Poll until ready, download MP4, upload to Drive. |
|
||||
|
||||
---
|
||||
|
||||
# 4. Reproducing the Workflow from Scratch
|
||||
|
||||
1. **Create a new workflow**
|
||||
- Name it something like: `Use AI to generate advertising Video (EN)`.
|
||||
- Keep execution order at default unless you specifically need `v1`.
|
||||
- If your n8n instance handles binary data separately, ensure binary mode is compatible with file download/upload flows.
|
||||
|
||||
2. **Add a Manual Trigger node**
|
||||
- Node type: **Manual Trigger**
|
||||
- Name it: `When clicking 'Test workflow'`
|
||||
- No additional configuration needed.
|
||||
|
||||
3. **Add a Google Drive node to download the source image**
|
||||
- Node type: **Google Drive**
|
||||
- Name it: `Download ad image`
|
||||
- Operation: **Download**
|
||||
- Authenticate with **Google Drive OAuth2**
|
||||
- Select the source file ID from Drive
|
||||
- In options, set binary property name to `ad_img`
|
||||
- Connect:
|
||||
- `When clicking 'Test workflow'` → `Download ad image`
|
||||
|
||||
4. **Add an Extract From File node to create a base64 field**
|
||||
- Node type: **Extract From File**
|
||||
- Name it: `Extract Model Image`
|
||||
- Operation: **Binary to Property**
|
||||
- Source binary property: `ad_img`
|
||||
- Destination key: `ad_img_base64`
|
||||
- Connect:
|
||||
- `Download ad image` → `Extract Model Image`
|
||||
|
||||
5. **Add an image analysis Gemini node**
|
||||
- Node type: **Google Gemini**
|
||||
- Name it: `Creative Visualiser`
|
||||
- Resource: **Image**
|
||||
- Operation: **Analyze**
|
||||
- Input type: **Binary**
|
||||
- Binary property name: `ad_img`
|
||||
- Model: `models/gemini-2.5-flash`
|
||||
- Authenticate with **Google Gemini / PaLM API credentials**
|
||||
- Paste the visual analysis instruction prompt that asks for:
|
||||
- ENTITY
|
||||
- VISUAL ATTRIBUTES
|
||||
- LIGHTING SETUP
|
||||
- BACKGROUND & VIBE
|
||||
- TECHNICAL SPEC
|
||||
- Connect:
|
||||
- `Download ad image` → `Creative Visualiser`
|
||||
|
||||
6. **Add a Gemini chat model node for the LangChain chain**
|
||||
- Node type: **Google Gemini Chat Model**
|
||||
- Name it: `Google Gemini Chat Model1`
|
||||
- Authenticate with the same or another valid Gemini credential
|
||||
- Default model options are acceptable unless you want stricter determinism.
|
||||
|
||||
7. **Add a Structured Output Parser node**
|
||||
- Node type: **Structured Output Parser**
|
||||
- Name it: `Structured Output Parser`
|
||||
- Schema type: **Manual**
|
||||
- Use a schema requiring:
|
||||
- object
|
||||
- property `video_prompt` of type string
|
||||
- mark `video_prompt` as required
|
||||
|
||||
8. **Add a LangChain chain node for prompt generation**
|
||||
- Node type: **Basic LLM Chain / Chain LLM**
|
||||
- Name it: `Product Video Prompt`
|
||||
- Prompt type: **Define**
|
||||
- Set the input text to:
|
||||
- `Image description: {{ $json.content.parts[0].values() }}`
|
||||
- Add the long instruction message that defines the model as a Creative Director/Copywriter and asks for ad-style output.
|
||||
- Important: because the parser expects only `video_prompt`, adjust the prompt if needed so the model returns a single field matching the parser. The current exported workflow may be logically inconsistent here.
|
||||
- Enable output parser support.
|
||||
- Connect:
|
||||
- Main: `Creative Visualiser` → `Product Video Prompt`
|
||||
- AI language model: `Google Gemini Chat Model1` → `Product Video Prompt`
|
||||
- AI output parser: `Structured Output Parser` → `Product Video Prompt`
|
||||
|
||||
9. **Add a Merge node**
|
||||
- Node type: **Merge**
|
||||
- Name it: `Merge`
|
||||
- Mode: **Combine**
|
||||
- Combine by: **Position**
|
||||
- Connect:
|
||||
- `Extract Model Image` → `Merge` input 1
|
||||
- `Product Video Prompt` → `Merge` input 2
|
||||
- This assumes both branches return one item in the same order.
|
||||
|
||||
10. **Add the Veo generation HTTP Request node**
|
||||
- Node type: **HTTP Request**
|
||||
- Name it: `Generate Video`
|
||||
- Method: **POST**
|
||||
- URL:
|
||||
- `https://generativelanguage.googleapis.com/v1beta/models/veo-3.1-fast-generate-preview:predictLongRunning`
|
||||
- Authentication: **Generic Credential Type**
|
||||
- Auth type: **HTTP Header Auth**
|
||||
- Credential should include your Gemini/Veo API key, typically as `x-goog-api-key` or whatever your saved credential defines.
|
||||
- Enable **Send Headers**
|
||||
- Add header:
|
||||
- `Content-Type: application/json`
|
||||
- Body type: **JSON**
|
||||
- JSON body should include:
|
||||
- `instances[0].image.bytesBase64Encoded` = `{{$json.ad_img_base64}}`
|
||||
- `instances[0].image.mimeType` = `image/png`
|
||||
- `instances[0].prompt` = `{{ JSON.stringify($json.output.video_prompt) }}`
|
||||
- `parameters.aspectRatio` = `16:9`
|
||||
- `parameters.resolution` = `720p`
|
||||
- `parameters.durationSeconds` = `8`
|
||||
- `parameters.sampleCount` = `1`
|
||||
- Connect:
|
||||
- `Merge` → `Generate Video`
|
||||
|
||||
11. **Add the polling HTTP Request node**
|
||||
- Node type: **HTTP Request**
|
||||
- Name it: `Get URL Download`
|
||||
- Method: **GET**
|
||||
- URL expression:
|
||||
- `https://generativelanguage.googleapis.com/v1beta/{{ $json.name }}`
|
||||
- Use the same HTTP header auth credential as the generation request
|
||||
- Optionally add `Content-Type: application/json`
|
||||
- Connect:
|
||||
- `Generate Video` → `Get URL Download`
|
||||
|
||||
12. **Add a Wait node**
|
||||
- Node type: **Wait**
|
||||
- Name it: `Wait`
|
||||
- Set amount to `30`
|
||||
- Use seconds unless your UI indicates another unit
|
||||
- Connect:
|
||||
- `Get URL Download` → `Wait`
|
||||
|
||||
13. **Add an If node for readiness testing**
|
||||
- Node type: **If**
|
||||
- Name it: `If`
|
||||
- Condition:
|
||||
- Value 1: `{{ $json.response.generateVideoResponse.generatedSamples[0].video.uri }}`
|
||||
- Operator: **is not empty**
|
||||
- Connect:
|
||||
- `Wait` → `If`
|
||||
|
||||
14. **Loop unfinished jobs back to polling**
|
||||
- Connect the **false** output of `If` back to `Get URL Download`
|
||||
- This creates the polling loop.
|
||||
|
||||
15. **Add the final video download HTTP Request**
|
||||
- Node type: **HTTP Request**
|
||||
- Name it: `Get Dowload Video`
|
||||
- Method: **GET**
|
||||
- URL expression:
|
||||
- `{{ $json.response.generateVideoResponse.generatedSamples[0].video.uri }}`
|
||||
- Use the same header auth credential if required
|
||||
- Prefer configuring the response format as **File** or equivalent in your n8n version so the next Google Drive upload receives binary data properly
|
||||
- Connect:
|
||||
- **true** output of `If` → `Get Dowload Video`
|
||||
|
||||
16. **Add a Google Drive upload node**
|
||||
- Node type: **Google Drive**
|
||||
- Name it: `Upload to Drive`
|
||||
- Operation: **Upload**
|
||||
- Authenticate with **Google Drive OAuth2**
|
||||
- Choose destination Drive: `My Drive`
|
||||
- Select the destination folder
|
||||
- File name expression:
|
||||
- `{{ 'ad_video_' + $now.toMillis() + '.mp4' }}`
|
||||
- Ensure the node uses the binary property produced by `Get Dowload Video`
|
||||
- Set error handling to continue if you want behavior matching the export (`onError: continueRegularOutput`)
|
||||
- Connect:
|
||||
- `Get Dowload Video` → `Upload to Drive`
|
||||
|
||||
17. **Add documentation sticky notes**
|
||||
- Add a sticky note named `Main overview` with setup and behavior summary
|
||||
- Add `Section – Input image`
|
||||
- Add `Section – Create video prompt`
|
||||
- Add `Section – Generate video`
|
||||
- Add `Section – Download & upload`
|
||||
|
||||
18. **Optionally add the disabled sample prompt node**
|
||||
- Node type: **Set**
|
||||
- Name it: `Sample Promt`
|
||||
- Leave it disabled
|
||||
- Use it only as a local prompt bank if desired
|
||||
|
||||
19. **Configure credentials**
|
||||
- **Google Drive OAuth2**
|
||||
- Must have permission to read the source file and write into the destination folder
|
||||
- **Google Gemini / PaLM API**
|
||||
- Required for `Creative Visualiser` and `Google Gemini Chat Model1`
|
||||
- **HTTP Header Auth**
|
||||
- Required for Veo and Gemini REST calls
|
||||
- Typically stores the API key in a header expected by Google’s generative language endpoints
|
||||
|
||||
20. **Test the workflow**
|
||||
- Run manually
|
||||
- Verify:
|
||||
- source image downloads correctly
|
||||
- Gemini image analysis returns valid content
|
||||
- `Product Video Prompt` returns structured data matching parser schema
|
||||
- Veo request returns an operation object with `name`
|
||||
- polling eventually returns `generatedSamples[0].video.uri`
|
||||
- final download produces binary video data
|
||||
- upload succeeds to the target Drive folder
|
||||
|
||||
21. **Recommended hardening before production**
|
||||
- Detect the source image MIME type dynamically instead of hardcoding `image/png`
|
||||
- Add a max retry count or timeout in the polling loop
|
||||
- Align the LLM prompt with the parser schema so output is guaranteed
|
||||
- Add explicit error branches or notifications for upload and API failures
|
||||
- Log the Veo operation name and final URI for troubleshooting
|
||||
|
||||
---
|
||||
|
||||
# 5. General Notes & Resources
|
||||
|
||||
| Note Content | Context or Link |
|
||||
|---|---|
|
||||
| The workflow description in the sticky note says the script is generated from a single image, analyzed by Gemini, then turned into a short video prompt for Veo. | Main overview |
|
||||
| Setup instructions embedded in the workflow: connect Google Drive, Google Gemini, and an API key for Veo HTTP requests. | Main overview |
|
||||
| Input file must be set directly in `Download ad image`. | Main overview |
|
||||
| Output folder must be set directly in `Upload to Drive`. | Main overview |
|
||||
| Optional generation parameters to tune: `aspectRatio`, `resolution`, `durationSeconds`. | Main overview |
|
||||
| There is a wording inconsistency: one sticky note says Gemini creates an “8s Vietnamese script,” while the active chain prompt explicitly requests “natural English.” | Prompt generation block |
|
||||
| The disabled `Sample Promt` node contains pinned example prompt text for product photography with human models, but it is not connected to execution. | Helper content only |
|
||||
|
||||
## Additional implementation cautions
|
||||
- The workflow contains no sub-workflow nodes and no additional entry points beyond the manual trigger.
|
||||
- The polling loop has no explicit stop condition other than successful video readiness.
|
||||
- The final download/upload path may require response-format tuning in the HTTP node depending on your n8n version.
|
||||
- The expression `{{$json.content.parts[0].values()}}` is fragile and should be validated against actual Gemini output in your environment.
|
||||
Reference in New Issue
Block a user