n8nworkflows.xyz/workflows/Create Data-Driven SEO Content Briefs with AI Analysis of SERP Data using Bright Data-8053/readme-8053.md at 35aff6d2ee9d3d978ec51e3063fb4aec1e0d9456

klbr/n8nworkflows.xyz

Fork 0

mirror of https://github.com/khoaliber/n8nworkflows.xyz.git synced 2026-04-28 00:29:22 +00:00

Files

T

nusquama 24a825eb19 creation

2025-11-12 16:17:05 +01:00

28 KiB

Raw Blame History

Create Data-Driven SEO Content Briefs with AI Analysis of SERP Data using Bright Data

https://n8nworkflows.xyz/workflows/create-data-driven-seo-content-briefs-with-ai-analysis-of-serp-data-using-bright-data-8053

1. Workflow Overview

This workflow automates the generation of data-driven SEO content briefs by analyzing the top 10 Google search results for a user-provided keyword. It combines real-time SERP scraping using Bright Data with AI-powered content analysis and strategic planning via OpenAI models accessed through OpenRouter. The workflow is structured into the following logical blocks:

1.1 Input Reception and Initial Interaction: Receives user keyword input through a chat interface and provides initial status responses.
1.2 SERP Data Retrieval: Queries Google Search via Bright Data to obtain the top 10 organic search results for the keyword.
1.3 URL Extraction and Processing Loop: Extracts individual URLs from the SERP data, iterates through each URL to scrape page contents.
1.4 Content Cleaning and Extraction: Cleans raw HTML content and extracts structured information such as titles, meta descriptions, headings, and word counts.
1.5 Content Analysis and Aggregation: Aggregates extracted data, analyzes the collective insights for search intent, core topics, and differentiation opportunities using AI language models.
1.6 Content Brief Generation and Formatting: Produces a structured strategic content plan and formats it into a human-readable Markdown summary.
1.7 User Feedback and Memory Management: Provides ongoing chat responses to the user and maintains conversational context via memory buffering.

2. Block-by-Block Analysis

1.1 Input Reception and Initial Interaction

Overview: Captures the user's target keyword through a chat interface and signals workflow progression with messages.
Nodes Involved: When chat message received, Respond to Chat1, Respond to Chat2, Respond to Chat
Node Details:
- When chat message received
  - Type: LangChain Chat Trigger
  - Role: Entry point; triggers workflow on chat input
  - Config: Public webhook; title "SEO Content Strategist"; input placeholder prompts for a target keyword; memory enabled to load prior sessions.
  - Expressions: Uses $json.chatInput to capture keyword.
  - Outputs: Connected to Respond to Chat1.
  - Edge cases: Missing or malformed input; webhook availability.
- Respond to Chat1
  - Type: LangChain Chat
  - Role: Provides immediate feedback that keyword analysis has started
  - Config: Message dynamically interpolates the user keyword for clarity.
  - Inputs: From "When chat message received"
  - Outputs: Triggers Google SERP node indirectly.
  - Edge cases: Expression failures if keyword missing.
- Respond to Chat2
  - Type: LangChain Chat
  - Role: Confirms processing of Google results.
  - Config: Outputs a status message with the keyword.
  - Inputs: From Merge1 node (later merge of analysis data)
  - Outputs: Leads to "Loop Over Items" for detailed processing.
- Respond to Chat
  - Type: LangChain Chat
  - Role: Notifies user that content extraction from pages is starting.
  - Inputs: From extract url code node.
  - Outputs: Leads to Limit node (currently disabled).

1.2 SERP Data Retrieval

Overview: Performs a Google search on the user keyword using Bright Data's Google SERP API, retrieving top 10 organic results as JSON.
Nodes Involved: Google SERP
Node Details:
- Google SERP
  - Type: Bright Data node
  - Role: Fetches search results JSON from Google via Bright Data proxy
  - Config: URL dynamically constructed with encoded user keyword; results limited to 10; uses Bright Data's "serp_api1" zone and US country setting; credentials are Bright Data API keys.
  - Input: Triggered by Respond to Chat1 node.
  - Output: JSON containing SERP data with an "organic" array.
  - Edge cases: Proxy failures, API rate limits, network errors, invalid keyword.

1.3 URL Extraction and Processing Loop

Overview: Extracts each organic result URL from the SERP JSON and processes each URL individually in batches.
Nodes Involved: extract url (Code), Respond to Chat, Limit (disabled), Loop Over Items, Aggregate, Access and extract data from a specific URL, url (Set), Merge1
Node Details:
- extract url
  - Type: Code node (JavaScript)
  - Role: Parses the "organic" array from SERP JSON to emit individual items per result
  - Key code: Uses optional chaining and array mapping to flatten and isolate each URL result.
  - Input: From Google SERP output
  - Output: One item per SERP result for downstream processing
  - Edge cases: Missing or malformed "organic" property, empty results.
- Respond to Chat
  - Role: Notifies user that content extraction starts
  - Input: From extract url
  - Output: Triggers Limit node (currently disabled)
- Limit
  - Disabled node; would limit batch size if enabled.
- Loop Over Items
  - Type: splitInBatches
  - Role: Iterates over each extracted URL result with controlled batch processing
  - Input: From Respond to Chat2
  - Outputs: Two branches: one aggregates all data, the other scrapes each URL page content and sets URL context.
- Aggregate
  - Type: Aggregate
  - Role: Collects all batch iteration outputs into a single aggregated data set
  - Input: From Loop Over Items
  - Output: Feeds into Respond to Chat3
- Access and extract data from a specific URL
  - Type: Bright Data node
  - Role: Scrapes the actual page content of each URL using Bright Data's web unlocker proxy
  - Config: URL dynamically set from current item’s "link" field; uses "web_unlocker1" zone and US country; retries on failure enabled
  - Input: From Loop Over Items batch branch
  - Output: Raw HTML content for cleaning
  - Edge cases: Page load failures, anti-bot measures, malformed URLs
- url
  - Type: Set node
  - Role: Sets a simple JSON property "url" with the current processed link for tracking
  - Input: From Access and extract data from a specific URL
  - Output: Merges into Merge1
- Merge1
  - Type: Merge
  - Role: Combines data streams from content analysis and URL setting nodes
  - Inputs: Three inputs – from analyse site, extract html1, and url nodes
  - Output: Flows into Respond to Chat2 and further downstream

1.4 Content Cleaning and Extraction

Overview: Cleans HTML content to remove noise and extracts structured elements including titles, meta descriptions, headings, and word counts.
Nodes Involved: clean html (Code), analyse site (Chain LLM), extract html1 (Code), Structured Output Parser
Node Details:
- clean html
  - Type: Code node
  - Role: Applies regex cleaning to raw HTML removing scripts, styles, SVGs, navs, ul/li tags, and extraneous attributes
  - Input: From Access and extract data from a specific URL
  - Output: Produces "cleanedHtml" property for further processing
  - Edge cases: Missing or empty HTML content
- analyse site
  - Type: LangChain Chain LLM node
  - Role: Uses AI to summarize the cleaned HTML content from each page into JSON summary
  - Config: Prompt instructs SEO expert role to provide a JSON summary without comments
  - Input: From clean html output (cleanedHtml)
  - Output: Parsed JSON summary of page content
  - Edge cases: Model timeouts, API errors, malformed input
- extract html1
  - Type: Code node
  - Role: Extracts page title, meta description, all headings (H1-H6), and counts words from cleaned HTML
  - Input: From clean html
  - Output: Structured JSON with these fields for analysis
  - Edge cases: Missing tags, malformed HTML, empty content
- Structured Output Parser
  - Type: LangChain Output Parser
  - Role: Parses AI output from analyse site into structured JSON format as defined by schema
  - Input: From analyse site
  - Output: Validated JSON data for merging

1.5 Content Analysis and Aggregation

Overview: Aggregates all extracted data and synthesizes a strategic SEO content brief by analyzing search intent, core topics, differentiation points, and proposing an article outline.
Nodes Involved: Aggregate, Respond to Chat3, Analysis (Chain LLM), Structured Output Parser1, OpenRouter Chat Model2, Respond to Chat5
Node Details:
- Aggregate
  - Collects all page summaries and extracted data into one set to provide comprehensive input for analysis
- Respond to Chat3
  - Alerts the user that data collection completed and synthesis is ongoing
- Analysis
  - Type: LangChain Chain LLM node
  - Role: Performs deep AI analysis on aggregated SERP and page data, guided by a detailed prompt to generate a JSON SEO content brief with fields for search intent, global intent, must-cover topics, differentiation suggestions, and suggested H2 outline
  - Config: Uses OpenRouter with GPT-4o model; error continuation enabled to avoid workflow stops
  - Input: Aggregated data and user keyword
  - Output: Structured JSON content plan
  - Edge cases: AI model errors, malformed data, timeout
- Structured Output Parser1
  - Parses the JSON output of the Analysis node to ensure correct schema adherence
- Respond to Chat5
  - Sends the raw JSON analysis back to the user for transparency or further use

1.6 Content Brief Generation and Formatting

Overview: Formats the AI-generated JSON content brief into a Markdown summary for easier human consumption.
Nodes Involved: Format Output, Respond to Chat4, OpenRouter Chat Model1
Node Details:
- Format Output
  - Type: LangChain Chain LLM node
  - Role: Converts the JSON content brief into a structured Markdown report with headings and bullet points for search intent, must-cover topics, differentiation, and H2 outline
  - Config: Uses GPT-5-nano model on OpenRouter
  - Input: JSON from Analysis node
  - Output: Markdown formatted content plan
- Respond to Chat4
  - Delivers the formatted Markdown content back to the chat user as the final content brief
- OpenRouter Chat Model1
  - Used as the LLM engine powering the formatting step

1.7 User Feedback and Memory Management

Overview: Maintains conversational memory and provides ongoing interactive user feedback throughout the workflow.
Nodes Involved: Simple Memory, When chat message received (memory connection), various Respond to Chat nodes
Node Details:
- Simple Memory
  - Type: LangChain Memory Buffer Window
  - Role: Maintains conversation context across chat messages to enable session continuity
  - Input/Output: Connected bi-directionally to chat trigger node
  - Edge cases: Memory overflow, context mismatch

3. Summary Table

Node Name	Node Type	Functional Role	Input Node(s)	Output Node(s)	Sticky Note
When chat message received	LangChain Chat Trigger	Entry point, receives user keyword	Simple Memory	Respond to Chat1	# SEO content Planner \n\n Google SERP Analysis \nThe n8n workflow retrieves the top 10 Google search results for a given keyword using Bright Data. \n👉 Get your free Bright Data API key here: https://get.brightdata.com/scrap
Respond to Chat1	LangChain Chat	Notifies user of keyword analysis start	When chat message received	Google SERP	See above
Google SERP	Bright Data	Retrieves top 10 Google organic results	Respond to Chat1	extract url	See above
extract url	Code (JS)	Extracts individual URLs from SERP JSON	Google SERP	Respond to Chat	See above
Respond to Chat	LangChain Chat	Notifies user extraction starts	extract url	Limit (disabled)	See above
Limit	Limit (disabled)	(Disabled) limit item count	Respond to Chat	Loop Over Items	See above
Loop Over Items	splitInBatches	Iterates over URLs batch-wise	Respond to Chat2	Aggregate, Access and extract data from a specific URL, url	See above
Aggregate	Aggregate	Aggregates batch results for analysis	Loop Over Items	Respond to Chat3	See above
Access and extract data from a specific URL	Bright Data	Scrapes page content for each URL	Loop Over Items	clean html	See above
url	Set	Sets current URL for tracking	Access and extract data from a specific URL	Merge1	See above
clean html	Code (JS)	Cleans raw HTML content	Access and extract data from a specific URL	analyse site, extract html1	See above
analyse site	LangChain Chain LLM	Summarizes page content via AI	clean html	Merge1	See above
extract html1	Code (JS)	Extracts title, description, headings, word count	clean html	Merge1	See above
Merge1	Merge	Merges analysis and extraction data	analyse site, extract html1, url	Respond to Chat2	See above
Respond to Chat2	LangChain Chat	Shows processing status of URLs	Merge1	Loop Over Items	See above
Respond to Chat3	LangChain Chat	Notifies user that data collection is complete	Aggregate	Analysis	See above
Analysis	LangChain Chain LLM	Synthesizes strategic SEO content brief	Respond to Chat3	Format Output, Respond to Chat5	See above
Structured Output Parser1	LangChain Output Parser	Parses AI JSON analysis output	Analysis	Analysis (chain continuation)	See above
Respond to Chat5	LangChain Chat	Sends raw JSON analysis to user	Analysis	-	See above
Format Output	LangChain Chain LLM	Formats JSON brief into Markdown summary	Analysis	Respond to Chat4	See above
Respond to Chat4	LangChain Chat	Sends formatted content brief to user	Format Output	-	See above
OpenRouter Chat Model	LangChain LLM via OpenRouter	Powers content cleaning analysis	clean html	analyse site	See above
OpenRouter Chat Model1	LangChain LLM via OpenRouter	Powers output formatting	Format Output	Respond to Chat4	See above
OpenRouter Chat Model2	LangChain LLM via OpenRouter	Powers strategic brief analysis	Analysis	Structured Output Parser1	See above
Structured Output Parser	LangChain Output Parser	Parses page summary output	analyse site	Merge1	See above
Simple Memory	LangChain Memory Buffer Window	Maintains chat conversation context	-	When chat message received	See above

4. Reproducing the Workflow from Scratch

Create a LangChain Chat Trigger node
- Name: When chat message received
- Configure as a public webhook with title "SEO Content Strategist" and subtitle "Generate a strategic content brief based on SERP analysis."
- Set input placeholder as "Enter your target keyword..."
- Enable memory loading (connect to a Memory Buffer node)
- Save webhook ID for external chat integration.
Add a LangChain Chat node
- Name: Respond to Chat1
- Message: =Processing: Analyzing top 10 Google results for "{{ $json.chatInput }}".
- Connect input from "When chat message received".
Add a Bright Data node for Google SERP API
- Name: Google SERP
- URL: =https://www.google.com/search?q={{ encodeURIComponent($json.chatInput) }}&num=10&brd_json=1
- Zone: Select or create "serp_api1" zone in Bright Data credentials
- Country: US
- Connect input from Respond to Chat1.
Add a Code node to extract URLs from SERP data
- Name: extract url
- Use provided JavaScript code to flatten the "organic" array into individual items
- Connect input from Google SERP output.
Add a LangChain Chat node
- Name: Respond to Chat
- Message: "Starting content extraction from top-ranking pages."
- Connect input from extract url.
(Optional) Add a Limit node
- Name: Limit
- Initially disable this node (for possible future batch size limiting)
- Connect input from Respond to Chat.
Add a splitInBatches node
- Name: Loop Over Items
- Connect input from Respond to Chat2 (which will be created later)
- Configure default batch size or leave default.
Add a Bright Data node for page scraping
- Name: Access and extract data from a specific URL
- URL: ={{ $json.link }} (from current batch item)
- Zone: "web_unlocker1"
- Country: US
- Enable retry on failure
- Connect input from Loop Over Items.
Add a Code node to clean HTML
- Name: clean html
- Insert given regex cleaning code to remove scripts, styles, SVG, nav, ul/li, and attributes from HTML content
- Connect input from Access and extract data from a specific URL.
Add a LangChain Chain LLM node
- Name: analyse site
- Prompt: "You are an SEO expert:|Output in JSON without any comments** summary: summarize this page."
- Input text: ={{ $json.cleanedHtml }}
- Enable output parser and link to Structured Output Parser node
- Connect input from clean html.
Add a Code node to extract structured HTML info
- Name: extract html1
- Use provided code to extract title, description, headings, and word count from cleaned HTML
- Connect input from clean html.
Add a Set node
- Name: url
- Assign a string field "url" with value ={{ $json.link }} for tracking
- Connect input from Access and extract data from a specific URL.
Add a Merge node
- Name: Merge1
- Set number of inputs to 3
- Connect inputs from analyse site, extract html1, and url nodes respectively.
Add a LangChain Chat node
- Name: Respond to Chat2
- Message: =Processing: Analyzing top 10 Google results for "{{ $json.chatInput }}".
- Connect input from Merge1.
Connect Respond to Chat2 output back to Loop Over Items input
- This closes the loop for each batch item processed.
Add an Aggregate node
- Name: Aggregate
- Aggregate all batch outputs into a single dataset
- Connect input from Loop Over Items.
Add a LangChain Chat node
- Name: Respond to Chat3
- Message: "All data collected. Synthesizing insights and generating your strategic content plan."
- Connect input from Aggregate.
Add a LangChain Chain LLM node for analysis
- Name: Analysis
- Use provided prompt to analyze aggregated SERP and page data for search intent and content plan
- Enable structured output parser (Structured Output Parser1)
- Connect input from Respond to Chat3.
Add a LangChain Output Parser node
- Name: Structured Output Parser1
- Schema: Use provided JSON schema defining search intent, global intent, must-cover topics, differentiation suggestions, and suggested H2 outline
- Connect input from Analysis node.
Add a LangChain Chain LLM node for formatting
- Name: Format Output
- Prompt: Convert JSON analysis to Markdown report as specified
- Connect input from Analysis node.
Add LangChain Chat node
- Name: Respond to Chat4
- Message: Use ={{ $json.text }} to output formatted Markdown
- Connect input from Format Output.
Add LangChain Chat node
- Name: Respond to Chat5
- Message: Use ={{ $json.text }} to output raw JSON analysis
- Connect input from Analysis node.
Add LangChain Memory Buffer Window node
- Name: Simple Memory
- Connect AI memory input/output to "When chat message received" node to maintain session memory.
Credential Setup
- Bright Data API: Configure with valid API keys for zones "serp_api1" and "web_unlocker1"
- OpenRouter API: Configure with valid credentials to access GPT-4o and GPT-5-nano models

5. General Notes & Resources

Note Content	Context or Link
The workflow retrieves the top 10 Google search results and scrapes their content using Bright Data proxies.	Bright Data: https://get.brightdata.com/scrap
The AI-driven analysis uses OpenRouter-hosted GPT models (GPT-4o, GPT-5-nano) for summarization, SEO analysis, and content plan generation.	OpenRouter API credentials required
The final content brief includes search intent classification, must-cover topics, differentiation suggestions, and a suggested H2 outline formatted in Markdown.	SEO Content Planning best practices
The workflow uses memory buffering to maintain chat context and enhance user interaction continuity.	LangChain memory buffers
The cleaning step removes common HTML noise elements such as scripts, styles, SVGs, navigation bars, and list tags to focus on meaningful text content for analysis.	Custom regex cleaning implemented in code node

Disclaimer: The text above is exclusively derived from an automated workflow constructed with n8n, adhering strictly to content policies and containing no illegal or offensive elements. All processed data is legal and publicly available.

28 KiB Raw Blame History Unescape Escape