From 78ba8f0f5f790ab08b28f9b22c3257f2bf276ed1 Mon Sep 17 00:00:00 2001
From: nusquama
Date: Wed, 11 Mar 2026 12:00:44 +0800
Subject: [PATCH] creation

---
 .../readme-13506.md | 2006 +++++++++++++++++
 1 file changed, 2006 insertions(+)
 create mode 100644 workflows/Build a WhatsApp AI shopping bot with virtual try-on using Gemini and GPT-13506/readme-13506.md

diff --git a/workflows/Build a WhatsApp AI shopping bot with virtual try-on using Gemini and GPT-13506/readme-13506.md b/workflows/Build a WhatsApp AI shopping bot with virtual try-on using Gemini and GPT-13506/readme-13506.md
new file mode 100644
index 000000000..048ee8641
--- /dev/null
+++ b/workflows/Build a WhatsApp AI shopping bot with virtual try-on using Gemini and GPT-13506/readme-13506.md
@@ -0,0 +1,2006 @@
+Build a WhatsApp AI shopping bot with virtual try-on using Gemini and GPT
+
+https://n8nworkflows.xyz/workflows/build-a-whatsapp-ai-shopping-bot-with-virtual-try-on-using-gemini-and-gpt-13506
+
+
+# Build a WhatsApp AI shopping bot with virtual try-on using Gemini and GPT
+
+# 1. Workflow Overview
+
+This workflow implements a **WhatsApp AI shopping bot for Bytez**. It receives inbound WhatsApp messages, detects whether the user is sending plain text, tapping an interactive button, or uploading an image, and then routes the request into one of three major business flows:
+
+- **Shopping/search flow**: classify user intent with OpenAI, search products via Redis cache and MongoDB Atlas vector search, then send interactive product cards back to WhatsApp.
+- **Order flow**: when the user taps **Order Now**, an AI agent orchestrates product lookup, order creation in MongoDB, logging to Google Sheets, and confirmation back to the user.
+- **Virtual try-on (VTO) flow**: when the user taps **Virtual Try-On**, the workflow stores product context in Redis, asks for a selfie, validates that exactly one real person is present using Gemini, generates a try-on image with Gemini image generation, and sends the result to WhatsApp.
+
+The workflow has a single external entry point, the **WhatsApp Trigger**, but internally it behaves like a multi-branch system with three practical entry paths:
+1. **Incoming text message**
+2. **Incoming interactive button tap**
+3. **Incoming image message**
+
+## 1.1 Entry Reception and Message Routing
+
+The workflow starts from a WhatsApp trigger and immediately separates **interactive/image traffic** from **text traffic**. Interactive button taps and uploaded images are routed into the order/VTO handling branch, while text messages proceed into validation and AI classification.
+
+## 1.2 Text Validation and Session Handling
+
+Plain text is validated for emptiness, malformed content, spam patterns, suspicious script/XSS-like input, unsupported media types, and overlong messages. Valid text then loads or creates a Redis-backed user session and appends the user message into message history.
+
+## 1.3 AI Classification and Conversational Routing
+
+Validated text is sent to a LangChain AI agent backed by **GPT-5-nano** to determine whether the user wants a **product search**, **recommendation**, or just a normal conversational answer. The output is parsed into either structured JSON intent or plain text.
+
+## 1.4 Product Search, Cache, and Vector Retrieval
+
+If the intent is product-related, the workflow generates a Redis cache key from the interpreted search details, checks Redis first, and if no cached result exists, runs a **MongoDB Atlas vector search** using **OpenAI embeddings**. Matching products are enriched with base64-encoded product images downloaded from Google Drive and then cached.
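
The cache-first lookup described in section 1.4 can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual Code node: the function names `make_cache_key` and `cached_search`, the `search:` key prefix, and the intent field names are all hypothetical, and a plain dictionary stands in for Redis so the example runs on its own.

```python
import hashlib
import json
from typing import Callable, MutableMapping


def make_cache_key(intent: dict, prefix: str = "search") -> str:
    """Build a deterministic cache key from interpreted search details.

    Hypothetical helper: the field names and the "search" prefix are
    assumptions, not values taken from the workflow.
    """
    # Normalize strings and drop empty values so equivalent searches share a key.
    normalized = {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in intent.items()
        if value not in (None, "", [])
    }
    # sort_keys makes the serialized form independent of field order.
    payload = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}:{digest}"


def cached_search(
    intent: dict,
    cache: MutableMapping[str, str],
    vector_search: Callable[[dict], list],
) -> tuple[list, bool]:
    """Return (products, cache_hit), consulting the cache before searching.

    `cache` is a dict standing in for Redis; `vector_search` is a
    placeholder for the MongoDB Atlas vector search step.
    """
    key = make_cache_key(intent)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit), True
    products = vector_search(intent)
    # Store the serialized result so the next identical search skips the database.
    cache[key] = json.dumps(products)
    return products, False
```

Hashing a normalized, key-sorted serialization means that searches differing only in letter case, surrounding whitespace, or field order resolve to the same cache entry. With a real Redis client the write would normally also carry an expiry; the source does not state one for search results, so none is assumed here.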
+
+## 1.5 Product Message Delivery
+
+Each matched product is looped through one by one. The product image is converted to binary, uploaded to the WhatsApp media API, and then sent as an interactive WhatsApp message containing two buttons:
+- **🛒 Order Now**
+- **👗 Virtual Try-On**
+
+## 1.6 Order Button Flow
+
+If the user taps the order button, the workflow parses the button payload, extracts the product ID and user identity, and passes them to an AI order orchestration agent backed by **GPT-4o**. That agent uses MongoDB and Google Sheets tools to complete the order lifecycle and then returns a confirmation message to WhatsApp.
+
+## 1.7 Virtual Try-On Flow
+
+If the user taps the VTO button, the workflow stores the selected product ID in Redis for 10 minutes and prompts the user to upload a selfie. When an image arrives later, the workflow checks whether VTO context exists, downloads both the user selfie and product image, validates the selfie with Gemini, generates the try-on result with Gemini image generation, uploads the result to WhatsApp, sends it to the user, and clears Redis context.
+
+---
+
+# 2. Block-by-Block Analysis
+
+## 2.1 Block: Entry Reception and Routing
+
+### Overview
+This block receives all inbound WhatsApp messages and determines the first-level execution path. It splits traffic into either the **interactive/image branch** or the **text validation branch**, then further separates button taps from user-uploaded images.
+
+### Nodes Involved
+- `WhatsApp message trigger`
+- `Check if message is button or image`
+- `Route interactive vs image message`
+
+### Node Details
+
+#### WhatsApp message trigger
+- **Type and role**: `n8n-nodes-base.whatsAppTrigger`; entry point for inbound WhatsApp webhook events.
+- **Configuration**:
+  - Listens to `messages` updates.
+- **Key data produced**:
+  - `messages[0].type`
+  - `contacts[0].wa_id`
+  - `contacts[0].profile.name`
+  - message payloads for text, image, or interactive button data.
+- **Connections**:
+  - Output → `Check if message is button or image`
+- **Version-specific notes**:
+  - Type version `1`.
+  - Requires a configured WhatsApp Business integration in n8n.
+- **Failure/edge cases**:
+  - Webhook misconfiguration
+  - Invalid WhatsApp credentials
+  - Payload shape differences if Meta changes schema
+  - Non-message webhook events are not handled here because only `messages` updates are selected
+
+#### Check if message is button or image
+- **Type and role**: `IF`; first-level branch router.
+- **Configuration**:
+  - True if `{{$json.messages[0].type}}` equals `interactive` OR `image`
+  - False otherwise, which effectively means text or unsupported types go to validation
+- **Connections**:
+  - True → `Route interactive vs image message`
+  - False → `Validate incoming message`
+- **Version-specific notes**:
+  - Type version `2.2`
+- **Failure/edge cases**:
+  - Expression failure if `messages[0]` is missing
+  - Unsupported types still go to validation, where they are rejected more explicitly
+
+#### Route interactive vs image message
+- **Type and role**: `Switch`; separates interactive button taps from uploaded images.
+- **Configuration**:
+  - Output `Button click` if message type is `interactive`
+  - Output `Image received` if message type is `image`
+- **Connections**:
+  - Button click → `Parse button and user data`
+  - Image received → `Get VTO context from Redis`
+- **Version-specific notes**:
+  - Type version `3.2`
+- **Failure/edge cases**:
+  - Any interactive subtype outside the expected button payload may later fail in parsing
+  - Image uploads without prior VTO context go to the context-check fallback message
+
+---
+
+## 2.2 Block: Text Message Validation
+
+### Overview
+This block performs strict sanitization and validation on inbound text and interactive payloads, but in practice it is reached only by non-interactive, non-image traffic. It blocks empty, meaningless, suspicious, spam-like, or unsupported messages before they reach the AI and session pipeline.
+
+### Nodes Involved
+- `Validate incoming message`
+- `Pass valid messages, block invalid`
+- `Send validation error to user`
+
+### Node Details
+
+#### Validate incoming message
+- **Type and role**: `Code`; custom validation and sanitization engine.
+- **Configuration choices**:
+  - Detects missing payloads
+  - Handles `text`, `interactive`, list and button replies, and many unsupported media types
+  - Sanitizes whitespace, control chars, repeated punctuation, and bracket wrappers
+  - Flags suspicious patterns such as `