From 0cb4fc515998799c04ce524e92b3e7413aada631 Mon Sep 17 00:00:00 2001 From: nusquama Date: Sun, 15 Mar 2026 12:02:03 +0800 Subject: [PATCH] creation --- .../readme-13991.md | 662 ++++++++++++++++++ 1 file changed, 662 insertions(+) create mode 100644 workflows/Track Redfin real estate listings with ScrapeOps, Google Sheets, and Slack-13991/readme-13991.md diff --git a/workflows/Track Redfin real estate listings with ScrapeOps, Google Sheets, and Slack-13991/readme-13991.md b/workflows/Track Redfin real estate listings with ScrapeOps, Google Sheets, and Slack-13991/readme-13991.md new file mode 100644 index 000000000..15589ecba --- /dev/null +++ b/workflows/Track Redfin real estate listings with ScrapeOps, Google Sheets, and Slack-13991/readme-13991.md @@ -0,0 +1,662 @@ +Track Redfin real estate listings with ScrapeOps, Google Sheets, and Slack + +https://n8nworkflows.xyz/workflows/track-redfin-real-estate-listings-with-scrapeops--google-sheets--and-slack-13991 + + +# Track Redfin real estate listings with ScrapeOps, Google Sheets, and Slack + +# 1. Workflow Overview + +This workflow monitors Redfin real estate listings on a recurring schedule, using ScrapeOps to fetch and parse listing pages, Google Sheets to store property rows, and Slack to send a completion message. + +Its main use case is automated property tracking for a specific Redfin search URL such as a city, ZIP code, or filtered results page. It is designed for users who want periodic snapshots of listings without manually browsing Redfin. + +The workflow is organized into four functional blocks: + +## 1.1 Trigger & Configuration +The workflow starts on a 6-hour schedule and defines the Redfin search page to scrape. + +## 1.2 Fetch & Parse Listings +It loads the Redfin page through ScrapeOps Proxy with JavaScript rendering enabled, then sends the resulting HTML to the ScrapeOps Parser API configured for the Redfin domain. 
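To make the fetch-and-parse handoff concrete, here is a minimal sketch of the payload shape this block is expected to hand to the transform stage, and how that stage consumes it. The example values are invented, and the field names (`data.search_information`, `data.search_results`) are taken from the node expressions documented later in this analysis; the real ScrapeOps Redfin parser schema may differ.

```javascript
// Illustrative only: an approximation of the Parser API payload the
// downstream blocks expect. Values are invented; field names mirror the
// workflow's own expressions, not a confirmed ScrapeOps schema.
const exampleParsed = {
  url: "https://www.redfin.com/city/21853/MD/California",
  data: {
    search_information: {
      search_title: "California, MD Homes for Sale",
      total_count: 2,
      region: { name: "California", state: "MD" },
    },
    search_results: [
      { address: "123 Main St", price: 250000 },
      { address: "456 Oak Ave", price: 480000 },
    ],
  },
};

// Mirrors the later split step: one n8n item per property, tolerating a
// stringified array the same way the workflow's Code node does.
function splitListings(parsed) {
  const raw = parsed.data.search_results;
  const listings = typeof raw === "string" ? JSON.parse(raw) : raw;
  if (!Array.isArray(listings)) {
    throw new Error("search_results is not an array");
  }
  return listings.map((property) => ({ json: property }));
}

console.log(splitListings(exampleParsed).length); // 2
```

If `data.search_results` is missing or is not an array, the split step fails fast, which matches the error handling described in the Transform & Filter block.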
+ +## 1.3 Transform & Filter +The parsed response is enriched with search metadata, split into one item per property, normalized into sheet-friendly fields, and filtered to keep only valid listings. + +## 1.4 Save & Notify +Valid listings are appended to Google Sheets, and a Slack message is sent as a run summary. + +--- + +# 2. Block-by-Block Analysis + +## 2.1 Trigger & Configuration + +### Overview +This block defines when the workflow runs and which Redfin search page will be scraped. It acts as the entry point and runtime configuration layer. + +### Nodes Involved +- ` Schedule Trigger` +- `Set Search Parameters` + +### Node Details + +#### ` Schedule Trigger` +- **Type and technical role:** `n8n-nodes-base.scheduleTrigger` + Entry node that launches the workflow automatically at a fixed interval. +- **Configuration choices:** + - Runs every 6 hours. +- **Key expressions or variables used:** + - No custom expression in the node itself. + - Produces standard schedule metadata such as timestamp and readable date/time. +- **Input and output connections:** + - No input, as it is a trigger. + - Outputs to `Set Search Parameters`. +- **Version-specific requirements:** + - Uses type version `1.1`. +- **Edge cases or potential failure types:** + - Workflow must be activated for the schedule to run. + - Server timezone may affect perceived execution timing. + - Missed executions can happen if the n8n instance is offline. +- **Sub-workflow reference:** + - None. + +#### `Set Search Parameters` +- **Type and technical role:** `n8n-nodes-base.set` + Defines the Redfin URL that downstream nodes use for scraping and parsing. +- **Configuration choices:** + - Creates a single field: + - `redfin_url = https://www.redfin.com/city/21853/MD/California` +- **Key expressions or variables used:** + - Static string value for `redfin_url`. 
+- **Input and output connections:** + - Input from ` Schedule Trigger` + - Output to `ScrapeOps: Fetch Redfin Page (Proxy)` +- **Version-specific requirements:** + - Uses type version `3.2`. +- **Edge cases or potential failure types:** + - If the Redfin URL is invalid, changed, blocked, or points to an empty page, downstream scraping/parsing may fail or return no results. + - If Redfin page structure changes significantly, parser output may degrade. +- **Sub-workflow reference:** + - None. + +--- + +## 2.2 Fetch & Parse Listings + +### Overview +This block retrieves the target Redfin page with browser-like rendering and then extracts structured listing data using ScrapeOps parsing specialized for Redfin. + +### Nodes Involved +- `ScrapeOps: Fetch Redfin Page (Proxy)` +- `ScrapeOps: Parse Redfin Listings` + +### Node Details + +#### `ScrapeOps: Fetch Redfin Page (Proxy)` +- **Type and technical role:** `@scrapeops/n8n-nodes-scrapeops.ScrapeOps` + Uses ScrapeOps Proxy API to request the Redfin page with anti-bot and rendering options. +- **Configuration choices:** + - URL comes from `{{$json.redfin_url}}` + - Advanced options: + - `wait: 5000` + - `scroll: 2000` + - `country: us` + - `render_js: true` + - `device_type: desktop` + - `residential_proxy: true` +- **Key expressions or variables used:** + - `={{ $json.redfin_url }}` +- **Input and output connections:** + - Input from `Set Search Parameters` + - Output to `ScrapeOps: Parse Redfin Listings` +- **Version-specific requirements:** + - Uses type version `1`. + - Requires the ScrapeOps n8n node package to be installed in the environment. + - Requires ScrapeOps credentials configured in n8n. +- **Edge cases or potential failure types:** + - ScrapeOps authentication failure or missing API key. + - Redfin anti-bot response despite proxy usage. + - Rendering timeout or incomplete page load. + - Empty results if wait/scroll settings are insufficient. + - Network errors, quota limits, or provider-side outage. 
+- **Sub-workflow reference:** + - None. + +#### `ScrapeOps: Parse Redfin Listings` +- **Type and technical role:** `@scrapeops/n8n-nodes-scrapeops.ScrapeOps` + Sends fetched page HTML to the ScrapeOps Parser API and requests structured extraction for the Redfin domain. +- **Configuration choices:** + - `apiType = parserApi` + - `parserUrl` uses the original search URL from `Set Search Parameters` + - `parserHtml` uses the upstream fetched content + - `parserDomain = redfin` +- **Key expressions or variables used:** + - `={{ $('Set Search Parameters').item.json.redfin_url }}` + - `={{ $json }}` +- **Input and output connections:** + - Input from `ScrapeOps: Fetch Redfin Page (Proxy)` + - Output to `Add Search Metadata` +- **Version-specific requirements:** + - Uses type version `1`. + - Also depends on the ScrapeOps node package and credentials. +- **Edge cases or potential failure types:** + - Parser may fail if the upstream node does not return expected HTML/body format. + - Parser output structure may change if ScrapeOps updates Redfin parsing schema. + - Invalid parser domain or malformed HTML can produce empty `data.search_results`. +- **Sub-workflow reference:** + - None. + +--- + +## 2.3 Transform & Filter + +### Overview +This block converts parser output into operational records. It extracts search-level metadata, splits the listing array into separate items, maps listing fields into normalized columns, and filters out unusable properties. + +### Nodes Involved +- `Add Search Metadata` +- `Split Properties Into Items` +- `Format Property Fields` +- `Filter Valid Properties` + +### Node Details + +#### `Add Search Metadata` +- **Type and technical role:** `n8n-nodes-base.set` + Adds summary data from the parsed Redfin response, including timestamps and search context. 
+- **Configuration choices:** + - Creates fields intended to include: + - `timestamp` + - `search_title` + - `search_type` + - `total_properties` + - region information + - price range + - average price + - source URL + - `search_results` +- **Key expressions or variables used:** + - `={{ new Date().toISOString() }}` + - `={{ $json.data.search_information.search_title || 'N/A' }}` + - `={{ $json.data.search_information.search_type || 'N/A' }}` + - `={{ $json.data.search_information.total_count || 0 }}` + - `={{ $json.data.search_information.region.name + ', ' + $json.data.search_information.region.state || 'N/A' }}` + - `={{ ($json.data.search_information.min_price || 0) + ' - ' + ($json.data.search_information.max_price || 0) }}` + - `={{ $json.data.search_information.average_price || 0 }}` + - `={{ $json.url || 'N/A' }}` + - `={{ $json.data.search_results }}` +- **Input and output connections:** + - Input from `ScrapeOps: Parse Redfin Listings` + - Output to `Split Properties Into Items` +- **Version-specific requirements:** + - Uses type version `3.2`. +- **Edge cases or potential failure types:** + - Several field names appear malformed in the configuration, with names starting with `=` or containing full expressions. This may create unexpected output keys in n8n. + - The expression for region concatenation can error if `region` is missing, because `.name` and `.state` are accessed before fallback handling. + - If `search_results` is missing or not valid JSON/array, the next node fails. +- **Sub-workflow reference:** + - None. + +#### `Split Properties Into Items` +- **Type and technical role:** `n8n-nodes-base.code` + Converts the search results payload into one n8n item per property. +- **Configuration choices:** + - JavaScript logic: + - Reads `search_results` from the first input item. + - If it is a string, parses it with `JSON.parse`. + - Verifies the result is an array. + - Returns each property as its own item. 
+- **Key expressions or variables used:** + - `const searchResults = $input.first().json.search_results;` + - `JSON.parse(searchResults)` + - Throws `new Error('search_results is not an array')` if validation fails. +- **Input and output connections:** + - Input from `Add Search Metadata` + - Output to `Format Property Fields` +- **Version-specific requirements:** + - Uses type version `2`. +- **Edge cases or potential failure types:** + - Invalid JSON string in `search_results`. + - `search_results` missing or null. + - Non-array payload causes explicit node failure. + - Large result sets may increase execution time and memory usage. +- **Sub-workflow reference:** + - None. + +#### `Format Property Fields` +- **Type and technical role:** `n8n-nodes-base.set` + Normalizes raw property objects into consistent fields for later filtering and storage. +- **Configuration choices:** + - Produces fields such as: + - `property_address` + - `property_price` + - `property_type` + - `bedrooms` + - `bathrooms` + - `bathrooms_full` + - `bathrooms_half` + - `square_footage` + - `lot_area` + - `year_built` + - `listing_status` + - `days_on_market` + - `mls_id` + - `property_url` + - `photo_url` + - `location` + - `description` + - `hoa_fee` + - `price_per_sqft` + - `sold_date` + - `time_zone` + - `country` + - `badges` + - `latitude` + - `longitude` +- **Key expressions or variables used:** + - `={{ $json.address || $json.title || 'N/A' }}` + - `={{ $json.price || 0 }}` + - `={{ $json.property_type || 'N/A' }}` + - `={{ ($json.description || '').substring(0, 200) + '...' }}` + - `={{ ($json.badges || []).join(', ') || 'None' }}` + - `={{ $json.lat_long?.latitude || 0 }}` + - `={{ $json.lat_long?.longitude || 0 }}` +- **Input and output connections:** + - Input from `Split Properties Into Items` + - Output to `Filter Valid Properties` +- **Version-specific requirements:** + - Uses type version `3.2`. 
+- **Edge cases or potential failure types:** + - Several configured field names again start with `=` or appear malformed. + - There is an empty object in the field list, which may indicate an accidental blank field entry. + - `price_per_sqft` uses `price_per_sqrf`, which may be a typo and produce zeros/undefined values. + - This node creates `property_price`, but the filter node checks `price`, not `property_price`. + - Original fields may or may not still be present depending on Set-node behavior and n8n settings; this affects downstream consistency. +- **Sub-workflow reference:** + - None. + +#### `Filter Valid Properties` +- **Type and technical role:** `n8n-nodes-base.if` + Filters items to retain only listings with a non-placeholder address and a non-zero price. +- **Configuration choices:** + - Conditions are combined with `AND`. + - Condition 1: `property_address != 'N/A'` + - Condition 2: numeric `price != 0` +- **Key expressions or variables used:** + - `={{ $json.property_address }}` + - `={{ $json.price }}` +- **Input and output connections:** + - Input from `Format Property Fields` + - True output goes to: + - `Save Listings to Google Sheets` + - `Send Slack Summary` + - False branch is unused. +- **Version-specific requirements:** + - Uses type version `2`. +- **Edge cases or potential failure types:** + - Likely logic inconsistency: upstream formatting creates `property_price`, but this node validates `price`. + - Because of that mismatch, valid properties could be rejected if `price` no longer exists. + - Strict type validation may fail if values are strings instead of numbers. +- **Sub-workflow reference:** + - None. + +--- + +## 2.4 Save & Notify + +### Overview +This block persists filtered properties to Google Sheets and sends a Slack message after or during processing. It is the output layer of the workflow. 
+ +### Nodes Involved +- `Save Listings to Google Sheets` +- `Send Slack Summary` + +### Node Details + +#### `Save Listings to Google Sheets` +- **Type and technical role:** `n8n-nodes-base.googleSheets` + Appends property data as rows into a specific spreadsheet tab. +- **Configuration choices:** + - Operation: `append` + - Spreadsheet ID: `1FYbt_m8nUdlkSmCzwZeBjgp4js6sdIKpiyyTIPBsigQ` + - Sheet: `gid=0` / cached as `Sheet1` + - Mapping mode: define fields manually + - Type conversion disabled + - Uses a fixed schema including columns such as Address, Price, Bedrooms, Bathrooms, Photo_URL, Property_URL, Coordinates, etc. +- **Key expressions or variables used:** + - `Price = {{ $('Filter Valid Properties').item.json.price }}` + - `Badges = {{ $json.badges }}` + - `MLS_ID = {{ $json.mls_id }}` + - `Status = {{ $json.listing_status }}` + - `Address = {{ $json.address }}` + - `HOA_Fee = {{ $json.hoa_fee }}` + - `Has_HOA = {{ $json.hoa }}` + - `Bedrooms = {{ $json.bedrooms }}` + - `Location = {{ $json.property_address }}` + - `Bathrooms = {{ $json.bathrooms }}` + - `Lot_Acres = {{ $json.lot_area }}` + - `Photo_URL = {{ $json.photo }}` + - `Scraped_At = {{ new Date().toISOString() }}` + - `Year_Built = {{ $json.year_built }}` + - `Coordinates = {{ $json.lat_long.latitude }}, {{ $json.lat_long.longitude }}` + - `Description = {{ $json.description }}` + - `Square_Feet = {{ $json.square_footage }}` + - `Property_URL = {{ $json.property_url }}` + - `Property_Type = {{ $('Filter Valid Properties').item.json.property_type }}` + - `Price_Per_SqFt = {{ $json.price_per_sqrf }}` +- **Input and output connections:** + - Input from `Filter Valid Properties` + - Output to `Send Slack Summary` +- **Version-specific requirements:** + - Uses type version `4.4`. + - Requires Google Sheets OAuth2 credentials. +- **Edge cases or potential failure types:** + - There are multiple field mismatches: + - `Address` maps from `$json.address`, while normalized field is `property_address`. 
+ - `Photo_URL` maps from `$json.photo`, while normalized field appears to be `photo_url`. + - `Coordinates` expects `$json.lat_long.*`, while normalized fields are `latitude` and `longitude`. + - `Price_Per_SqFt` again references `price_per_sqrf`. + - If the Google Sheet columns do not match exactly, append may fail or create incomplete rows. + - Spreadsheet permissions, expired OAuth token, or wrong document ID can break execution. + - Concurrent high-volume appends may hit API quotas. +- **Sub-workflow reference:** + - None. + +#### `Send Slack Summary` +- **Type and technical role:** `n8n-nodes-base.slack` + Posts a message to a Slack channel indicating scrape completion. +- **Configuration choices:** + - Sends a fixed text message: + - `🏠 Redfin Scrape Complete! | Sheet: https://docs.google.com/spreadsheets/d/1FYbt_m8nUdlkSmCzwZeBjgp4js6sdIKpiyyTIPBsigQ` + - Target is a selected Slack channel. + - `executeOnce = true`, so the message should be sent once per execution rather than once per item. +- **Key expressions or variables used:** + - No dynamic count expression is included, despite the sticky note mentioning listing count. +- **Input and output connections:** + - Inputs from: + - `Filter Valid Properties` + - `Save Listings to Google Sheets` + - No downstream output. +- **Version-specific requirements:** + - Uses type version `2.1`. + - Requires Slack API credentials. +- **Edge cases or potential failure types:** + - Slack auth/permission issues. + - Channel not found or bot not invited. + - Because the node has two incoming branches, runtime behavior depends on item flow and execution semantics; `executeOnce` reduces duplicate posting but does not make this a true merge node. + - Message is static, so it does not confirm actual row count or success count. +- **Sub-workflow reference:** + - None. + +--- + +# 3. 
Summary Table + +| Node Name | Node Type | Functional Role | Input Node(s) | Output Node(s) | Sticky Note | +|---|---|---|---|---|---| +| ` Schedule Trigger` | Schedule Trigger | Launches the workflow every 6 hours | | `Set Search Parameters` | ## 1. Trigger & Configuration
Fires on a schedule and sets the Redfin search URL to scrape. | +| `Set Search Parameters` | Set | Defines the Redfin search URL | ` Schedule Trigger` | `ScrapeOps: Fetch Redfin Page (Proxy)` | ## 1. Trigger & Configuration
Fires on a schedule and sets the Redfin search URL to scrape. | +| `ScrapeOps: Fetch Redfin Page (Proxy)` | ScrapeOps | Fetches the Redfin page with JS rendering and proxy options | `Set Search Parameters` | `ScrapeOps: Parse Redfin Listings` | ## 2. Fetch & Parse Listings
Load the Redfin page via [ScrapeOps Proxy](https://scrapeops.io/docs/n8n/proxy-api/) with JS rendering, then extract structured listing data using the [ScrapeOps Parser API](https://scrapeops.io/docs/n8n/parser-api/). | +| `ScrapeOps: Parse Redfin Listings` | ScrapeOps | Parses Redfin HTML into structured listing JSON | `ScrapeOps: Fetch Redfin Page (Proxy)` | `Add Search Metadata` | ## 2. Fetch & Parse Listings
Load the Redfin page via [ScrapeOps Proxy](https://scrapeops.io/docs/n8n/proxy-api/) with JS rendering, then extract structured listing data using the [ScrapeOps Parser API](https://scrapeops.io/docs/n8n/parser-api/). | +| `Add Search Metadata` | Set | Adds scrape timestamp and search-level metadata | `ScrapeOps: Parse Redfin Listings` | `Split Properties Into Items` | ## 3. Transform & Filter
Lift search summary fields, split results into individual items, normalize property fields, and drop listings missing an address or valid price. | +| `Split Properties Into Items` | Code | Converts listing array into one item per property | `Add Search Metadata` | `Format Property Fields` | ## 3. Transform & Filter
Lift search summary fields, split results into individual items, normalize property fields, and drop listings missing an address or valid price. | +| `Format Property Fields` | Set | Normalizes property fields for filtering and storage | `Split Properties Into Items` | `Filter Valid Properties` | ## 3. Transform & Filter
Lift search summary fields, split results into individual items, normalize property fields, and drop listings missing an address or valid price. | +| `Filter Valid Properties` | If | Keeps only listings with address and non-zero price | `Format Property Fields` | `Save Listings to Google Sheets`, `Send Slack Summary` | ## 3. Transform & Filter
Lift search summary fields, split results into individual items, normalize property fields, and drop listings missing an address or valid price. | +| `Save Listings to Google Sheets` | Google Sheets | Appends valid listings into the spreadsheet | `Filter Valid Properties` | `Send Slack Summary` | ## 4. Save & Notify
Append valid listings to Google Sheets and optionally post a Slack summary with listing count and a link to the sheet. | +| `Send Slack Summary` | Slack | Sends a completion message to Slack | `Filter Valid Properties`, `Save Listings to Google Sheets` | | ## 4. Save & Notify
Append valid listings to Google Sheets and optionally post a Slack summary with listing count and a link to the sheet. |
| `Sticky Note` | Sticky Note | Workspace documentation | | | # 🏡 Redfin Property Scraper → Google Sheets + Slack

This workflow automatically scrapes Redfin property listings on a schedule. It fetches search results via **ScrapeOps Proxy API**, parses them into clean structured data using the **ScrapeOps Redfin Parser API**, filters valid listings, saves them to Google Sheets, and optionally sends a Slack summary.

### How it works
1. ⏰ **Schedule Trigger** fires every 6 hours automatically.
2. βš™οΈ **Set Search Parameters** defines the Redfin search URL to scrape.
3. 🌐 **ScrapeOps: Fetch Redfin Page** loads the listing page with JS rendering and residential proxy via [ScrapeOps Proxy API](https://scrapeops.io/docs/n8n/proxy-api/).
4. πŸ” **ScrapeOps: Parse Redfin Listings** extracts clean structured JSON using the [ScrapeOps Parser API](https://scrapeops.io/docs/n8n/parser-api/).
5. πŸ—‚οΈ **Add Search Metadata** lifts summary fields like total listings, region, and price range.
6. 📦 **Split Properties Into Items** turns the results array into one item per property.
7. πŸ—ΊοΈ **Format Property Fields** normalizes address, price, beds, baths, sqft, and more.
8. ✅ **Filter Valid Properties** drops items missing an address or with price = 0.
9. 💾 **Save Listings to Google Sheets** appends valid rows to your sheet.
10. 📣 **Send Slack Summary** posts an optional summary with listing count and sheet link.

### Setup steps
- Register for a free ScrapeOps API key: https://scrapeops.io/app/register/n8n
- Add ScrapeOps credentials to both ScrapeOps nodes. Docs: https://scrapeops.io/docs/n8n/overview/
- Duplicate the [Google Sheet template](https://docs.google.com/spreadsheets/d/1FYbt_m8nUdlkSmCzwZeBjgp4js6sdIKpiyyTIPBsigQ/edit?gid=0#gid=0) and paste your Sheet ID into **Save Listings to Google Sheets**.
- Set your target city URL in **Set Search Parameters**.
- Optional: configure the Slack node with your channel and credentials.

### Customization
- Change `redfin_url` to any Redfin city, ZIP, or filtered search page.
- Adjust `wait` and `scroll` settings in the Proxy node if results are empty.
- Change the schedule interval to daily or hourly as needed. | +| `Sticky Note1` | Sticky Note | Workspace documentation | | | ## 1. Trigger & Configuration
Fires on a schedule and sets the Redfin search URL to scrape. | +| `Sticky Note2` | Sticky Note | Workspace documentation | | | ## 2. Fetch & Parse Listings
Load the Redfin page via [ScrapeOps Proxy](https://scrapeops.io/docs/n8n/proxy-api/) with JS rendering, then extract structured listing data using the [ScrapeOps Parser API](https://scrapeops.io/docs/n8n/parser-api/). | +| `Sticky Note3` | Sticky Note | Workspace documentation | | | ## 3. Transform & Filter
Lift search summary fields, split results into individual items, normalize property fields, and drop listings missing an address or valid price. | +| `Sticky Note4` | Sticky Note | Workspace documentation | | | ## 4. Save & Notify
Append valid listings to Google Sheets and optionally post a Slack summary with listing count and a link to the sheet. |

---

# 4. Reproducing the Workflow from Scratch

Below is a practical rebuild sequence in n8n. It mirrors the current workflow, while also noting places where the original configuration contains inconsistencies that you may want to correct during implementation.

## Prerequisites
1. Install or enable n8n.
2. Install the ScrapeOps n8n node package if it is not already available in your instance.
3. Prepare credentials:
   - **ScrapeOps API credentials**
   - **Google Sheets OAuth2 credentials**
   - **Slack API credentials**
4. Duplicate or create a Google Sheet with columns matching the intended schema.

## Step-by-step build

1. **Create a Schedule Trigger node**
   - Type: `Schedule Trigger`
   - Set interval to every `6 hours`.
   - This is the workflow entry point.

2. **Create a Set node named `Set Search Parameters`**
   - Type: `Set`
   - Add one field:
     - `redfin_url` as string
     - Example value: `https://www.redfin.com/city/21853/MD/California`
   - Connect `Schedule Trigger` → `Set Search Parameters`.

3. **Create a ScrapeOps node named `ScrapeOps: Fetch Redfin Page (Proxy)`**
   - Type: `ScrapeOps`
   - Use ScrapeOps credentials.
   - Configure:
     - URL: `{{$json.redfin_url}}`
     - Wait: `5000`
     - Scroll: `2000`
     - Country: `us`
     - Render JS: enabled
     - Device type: `desktop`
     - Residential proxy: enabled
   - Connect `Set Search Parameters` → `ScrapeOps: Fetch Redfin Page (Proxy)`.

4. **Create another ScrapeOps node named `ScrapeOps: Parse Redfin Listings`**
   - Type: `ScrapeOps`
   - Use the same ScrapeOps credentials.
   - Configure:
     - API Type: `Parser API`
     - Parser Domain: `redfin`
     - Parser URL: `{{ $('Set Search Parameters').item.json.redfin_url }}`
     - Parser HTML: `{{ $json }}`
   - Connect `ScrapeOps: Fetch Redfin Page (Proxy)` → `ScrapeOps: Parse Redfin Listings`.

5. 
**Create a Set node named `Add Search Metadata`**
   - Type: `Set`
   - Add fields for search metadata. To reproduce the workflow cleanly, use these recommended field names:
     - `timestamp` → `{{ new Date().toISOString() }}`
     - `search_title` → `{{ $json.data.search_information.search_title || 'N/A' }}`
     - `search_type` → `{{ $json.data.search_information.search_type || 'N/A' }}`
     - `total_properties` → `{{ $json.data.search_information.total_count || 0 }}`
     - `region_info` → `{{ $json.data.search_information.region ? $json.data.search_information.region.name + ', ' + $json.data.search_information.region.state : 'N/A' }}`
     - `price_range` → `{{ ($json.data.search_information.min_price || 0) + ' - ' + ($json.data.search_information.max_price || 0) }}`
     - `average_price` → `{{ $json.data.search_information.average_price || 0 }}`
     - `url_scraped` → `{{ $json.url || 'N/A' }}`
     - `search_results` → `{{ $json.data.search_results }}`
   - Connect `ScrapeOps: Parse Redfin Listings` → `Add Search Metadata`.

6. **Create a Code node named `Split Properties Into Items`**
   - Type: `Code`
   - JavaScript:
     ```javascript
     const searchResults = $input.first().json.search_results;

     const propertiesArray = typeof searchResults === 'string'
       ? JSON.parse(searchResults)
       : searchResults;

     if (!Array.isArray(propertiesArray)) {
       throw new Error('search_results is not an array');
     }

     return propertiesArray.map((property) => ({
       json: property,
       pairedItem: { item: 0 }
     }));
     ```
   - Connect `Add Search Metadata` → `Split Properties Into Items`.

7. 
**Create a Set node named `Format Property Fields`**
   - Type: `Set`
   - Add normalized fields:
     - `property_address` → `{{ $json.address || $json.title || 'N/A' }}`
     - `property_price` → `{{ $json.price || 0 }}`
     - `property_type` → `{{ $json.property_type || 'N/A' }}`
     - `bedrooms` → `{{ $json.bedrooms || 0 }}`
     - `bathrooms` → `{{ $json.bathrooms || 0 }}`
     - `bathrooms_full` → `{{ $json.bathrooms_full || 0 }}`
     - `bathrooms_half` → `{{ $json.bathrooms_half || 0 }}`
     - `square_footage` → `{{ $json.area || 0 }}`
     - `lot_area` → `{{ $json.lot_area || 0 }}`
     - `year_built` → `{{ $json.year_built || 'N/A' }}`
     - `listing_status` → `{{ $json.listing_status || $json.status || 'N/A' }}`
     - `days_on_market` → `{{ $json.days_on_market || $json.dom || 'N/A' }}`
     - `mls_id` → `{{ $json.mls_id || 'N/A' }}`
     - `property_url` → `{{ $json.url || 'N/A' }}`
     - `photo_url` → `{{ $json.photo || 'N/A' }}`
     - `location` → `{{ $json.location || 'N/A' }}`
     - `description` → `{{ (($json.description || '').substring(0, 200)) + '...' }}`
     - `hoa_fee` → `{{ $json.hoa || 0 }}`
     - `price_per_sqft` → `{{ $json.price_per_sqft || 0 }}`
     - `sold_date` → `{{ $json.sold_date || 'N/A' }}`
     - `time_zone` → `{{ $json.time_zone || 'N/A' }}`
     - `country` → `{{ $json.country || 'N/A' }}`
     - `badges` → `{{ ($json.badges || []).join(', ') || 'None' }}`
     - `latitude` → `{{ $json.lat_long?.latitude || 0 }}`
     - `longitude` → `{{ $json.lat_long?.longitude || 0 }}`
   - Prefer clean field names without `=` prefixes.
   - Connect `Split Properties Into Items` → `Format Property Fields`.

8. **Create an If node named `Filter Valid Properties`**
   - Type: `If`
   - Set `AND` logic.
   - Conditions:
     - `{{$json.property_address}}` is not equal to `N/A`
     - `{{$json.property_price}}` is not equal to `0`
   - Important: use `property_price`, not `price`, if you want this block to align with the formatting node. 
- Connect `Format Property Fields` → `Filter Valid Properties`.

9. **Create a Google Sheets node named `Save Listings to Google Sheets`**
   - Type: `Google Sheets`
   - Credentials: Google Sheets OAuth2.
   - Operation: `Append`
   - Select your spreadsheet.
   - Select target sheet/tab.
   - Enable manual field mapping.
   - Map columns such as:
     - `Address` → `{{ $json.property_address }}`
     - `Price` → `{{ $json.property_price }}`
     - `Property_Type` → `{{ $json.property_type }}`
     - `Bedrooms` → `{{ $json.bedrooms }}`
     - `Bathrooms` → `{{ $json.bathrooms }}`
     - `Square_Feet` → `{{ $json.square_footage }}`
     - `Lot_Acres` → `{{ $json.lot_area }}`
     - `Year_Built` → `{{ $json.year_built }}`
     - `Status` → `{{ $json.listing_status }}`
     - `MLS_ID` → `{{ $json.mls_id }}`
     - `Price_Per_SqFt` → `{{ $json.price_per_sqft }}`
     - `HOA_Fee` → `{{ $json.hoa_fee }}`
     - `Location` → `{{ $json.location }}`
     - `Description` → `{{ $json.description }}`
     - `Photo_URL` → `{{ $json.photo_url }}`
     - `Property_URL` → `{{ $json.property_url }}`
     - `Coordinates` → `{{ $json.latitude }}, {{ $json.longitude }}`
     - `Scraped_At` → `{{ new Date().toISOString() }}`
     - `Badges` → `{{ $json.badges }}`
   - Connect the **true** output of `Filter Valid Properties` → `Save Listings to Google Sheets`.

10. **Prepare the spreadsheet columns**
    - Ensure your Google Sheet contains the same column names used in the node mapping.
    - The original workflow references:
      - Address
      - Price
      - Property_Type
      - Bedrooms
      - Bathrooms
      - Square_Feet
      - Lot_Acres
      - Year_Built
      - Status
      - MLS_ID
      - Price_Per_SqFt
      - HOA_Fee
      - Has_HOA
      - Location
      - Description
      - Photo_URL
      - Property_URL
      - Coordinates
      - Scraped_At
      - Badges

11. **Create a Slack node named `Send Slack Summary`**
    - Type: `Slack`
    - Credentials: Slack API.
    - Action: send message to channel.
    - Select the target channel. 
- Message example:
      - `🏠 Redfin Scrape Complete! | Sheet: https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID`
    - Enable `Execute Once`.
    - To match the original workflow, connect:
      - `Filter Valid Properties` → `Send Slack Summary`
      - `Save Listings to Google Sheets` → `Send Slack Summary`
    - For a cleaner design, many builders would instead use a Merge or summary node before Slack, but that is not how the provided workflow is wired.

12. **Optionally add sticky notes**
    - Add documentation notes for each major block:
      - Trigger & Configuration
      - Fetch & Parse Listings
      - Transform & Filter
      - Save & Notify

13. **Activate and test**
    - Run manually first.
    - Check:
      - ScrapeOps returns rendered HTML
      - Parser returns `data.search_results`
      - Code node splits items correctly
      - Google Sheets receives rows
      - Slack receives one message

## Credential setup details

### ScrapeOps
- Create a ScrapeOps account: https://scrapeops.io/app/register/n8n
- Add ScrapeOps credentials in n8n.
- Assign the same credential to both ScrapeOps nodes.

### Google Sheets
- Create OAuth2 credentials in n8n for Google Sheets.
- Share the target spreadsheet with the authenticated Google account if needed.
- Use the correct spreadsheet ID and sheet tab.

### Slack
- Create/connect a Slack app or bot credential in n8n.
- Ensure the bot can post to the target channel.
- Invite the bot to the channel if necessary.

## Important implementation cautions
To reproduce the workflow exactly, you could keep the original field mismatches. To reproduce it reliably, correct the following:
1. Use `property_price` consistently instead of switching between `price` and `property_price`.
2. Use `property_address` consistently instead of switching between `address` and `property_address`.
3. Use `photo_url` consistently instead of `photo`.
4. Use `latitude`/`longitude` consistently instead of later referencing `lat_long`.
5. 
Replace `price_per_sqrf` with `price_per_sqft` if that is the intended source field. +6. Remove malformed Set field names that start with `=` unless intentionally required. +7. Remove the blank field entry in `Format Property Fields`. + +--- + +# 5. General Notes & Resources + +| Note Content | Context or Link | +|---|---| +| ScrapeOps registration page for API access | https://scrapeops.io/app/register/n8n | +| ScrapeOps n8n overview documentation | https://scrapeops.io/docs/n8n/overview/ | +| ScrapeOps Proxy API documentation | https://scrapeops.io/docs/n8n/proxy-api/ | +| ScrapeOps Parser API documentation | https://scrapeops.io/docs/n8n/parser-api/ | +| Google Sheet template referenced in the workflow | https://docs.google.com/spreadsheets/d/1FYbt_m8nUdlkSmCzwZeBjgp4js6sdIKpiyyTIPBsigQ/edit?gid=0#gid=0 | +| Workflow note: change `redfin_url` to any Redfin city, ZIP, or filtered search page | Configuration guidance from workspace notes | +| Workflow note: adjust `wait` and `scroll` settings if results are empty | Scrape/render tuning guidance from workspace notes | +| Workflow note: change the schedule interval to daily or hourly as needed | Scheduling customization guidance from workspace notes | \ No newline at end of file