Skip trace people with ScraperCity and save contact details to Google Sheets
1. Workflow Overview
This workflow performs a skip trace / people lookup using the ScraperCity People Finder API, waits for the asynchronous job to complete, downloads the resulting CSV, parses and deduplicates the contact data, and then saves the output into Google Sheets.
Typical use cases include:
- Looking up contact details for one or more people from names, phones, or emails
- Enriching lead or prospect lists
- Creating a repeatable pipeline to store skip-trace results in a spreadsheet
The workflow is organized into the following logical blocks:
1.1 Input Reception and Search Configuration
The workflow starts manually. A Set node captures comma-separated search inputs such as names, phones, and emails, then a Code node converts these values into the request structure expected by ScraperCity.
1.2 Job Submission
The prepared payload is sent to the ScraperCity People Finder endpoint. The returned `runId` is extracted and stored for later status checks and file download.
1.3 Asynchronous Polling Loop
Because ScraperCity jobs are not immediate, the workflow enters a loop: wait 60 seconds, check job status, and decide whether to continue waiting or proceed.
1.4 Results Download, Parsing, and Storage
Once the job succeeds, the workflow downloads the CSV output, parses it into records, removes duplicates, and appends or updates rows in Google Sheets.
2. Block-by-Block Analysis
2.1 Input Reception and Search Configuration
Overview
This block initializes the workflow and prepares the search criteria. It converts user-friendly comma-separated text fields into arrays required by the ScraperCity API.
Nodes Involved
- When clicking 'Execute workflow'
- Configure Search Inputs
- Build Request Body
Node Details
1) When clicking 'Execute workflow'
- Type and role: `n8n-nodes-base.manualTrigger`. Manual entry point used for testing or ad hoc execution.
- Configuration choices: No custom parameters.
- Key expressions or variables used: None.
- Input and output connections:
- Input: none
- Output: Configure Search Inputs
- Version-specific requirements: `typeVersion: 1`
- Edge cases / failures:
- No runtime failure expected unless workflow execution is interrupted manually.
- Sub-workflow reference: None.
2) Configure Search Inputs
- Type and role: `n8n-nodes-base.set`. Defines the initial search values.
- Configuration choices: Creates four fields:
  - `names = "John Smith,Jane Doe"`
  - `phones = ""`
  - `emails = ""`
  - `max_results = 1`
- Key expressions or variables used: Static values only.
- Input and output connections:
- Input: When clicking 'Execute workflow'
- Output: Build Request Body
- Version-specific requirements: `typeVersion: 3.4`
- Edge cases / failures:
  - Empty inputs are allowed, but sending no usable identifiers may produce poor or empty results from the API.
  - `max_results` should remain a positive number.
- Sub-workflow reference: None.
3) Build Request Body
- Type and role: `n8n-nodes-base.code`. Transforms comma-separated text into arrays and builds the API request object.
- Configuration choices: The script:
  - Splits `names`, `emails`, and `phones` on commas
  - Trims whitespace
  - Removes empty values
  - Maps the resulting arrays to `name`, `email`, and `phone_number`, sends `street_city`, `state`, and `zip` as empty values, and includes `max_results`
- Key expressions or variables used: `$input.first().json`, `cfg.names`, `cfg.emails`, `cfg.phones`, `cfg.max_results`
- Input and output connections:
- Input: Configure Search Inputs
- Output: Submit Skip Trace Job
- Version-specific requirements: `typeVersion: 2`
- Edge cases / failures:
  - If upstream fields are missing, the code handles them gracefully by converting falsy values into empty arrays.
  - If `max_results` is missing, it defaults to `1`.
  - API-side validation may fail if all search arrays are empty.
- Sub-workflow reference: None.
2.2 Job Submission
Overview
This block sends the skip trace request to ScraperCity and stores the returned job identifier. That `runId` is the central reference used by all later polling and download steps.
Nodes Involved
- Submit Skip Trace Job
- Store Run ID
Node Details
4) Submit Skip Trace Job
- Type and role: `n8n-nodes-base.httpRequest`. Sends a POST request to create a ScraperCity People Finder job.
- Configuration choices:
  - Method: `POST`
  - URL: `https://app.scrapercity.com/api/v1/scrape/people-finder`
  - Body type: JSON
  - Body content: serialized current item JSON
  - Authentication: generic credential type using header auth
- Key expressions or variables used:
={{ JSON.stringify($json) }}
- Input and output connections:
- Input: Build Request Body
- Output: Store Run ID
- Version-specific requirements: `typeVersion: 4.2`
- Edge cases / failures:
- Invalid or missing API key
- Incorrect header auth setup
- API rejection due to malformed body or unsupported search criteria
- Network timeout or remote service failure
- Sub-workflow reference: None.
- Credential requirements: Requires a Header Auth credential named like ScraperCity API Key, typically:
  - Header name: `Authorization`
  - Header value: `Bearer YOUR_KEY`
5) Store Run ID
- Type and role: `n8n-nodes-base.set`. Extracts `runId` from the API response and stores it in a predictable field.
- Configuration choices: Sets `runId = {{ $json.runId }}`
- Key expressions or variables used:
={{ $json.runId }}
- Input and output connections:
- Input: Submit Skip Trace Job
- Output: Polling Loop
- Version-specific requirements: `typeVersion: 3.4`
- Edge cases / failures:
  - If the API response does not include `runId`, all downstream status/download expressions will fail.
  - A partial or error response from ScraperCity may cause this node to output an empty string.
- Sub-workflow reference: None.
2.3 Asynchronous Polling Loop
Overview
This block repeatedly checks whether the submitted job has completed. It pauses for 60 seconds between checks and loops until the returned status equals `SUCCEEDED`.
Nodes Involved
- Polling Loop
- Wait 60 Seconds
- Check Scrape Status
- Is Scrape Complete?
Node Details
6) Polling Loop
- Type and role: `n8n-nodes-base.splitInBatches`. Used here as a loop-control mechanism rather than for real batching.
- Configuration choices:
  - Batch size: `1`
  - Reset: `false`
- Key expressions or variables used: None directly.
- Input and output connections:
- Input: Store Run ID, Is Scrape Complete? (false branch loopback)
- Output 0: Wait 60 Seconds
- Output 1: unused
- Version-specific requirements: `typeVersion: 3`
- Edge cases / failures:
- This pattern works as a loop, but if workflow logic changes incorrectly it may create an unintended infinite polling cycle.
- Since no max-attempt guard exists, jobs stuck in non-terminal states could cause very long-running executions.
- Sub-workflow reference: None.
7) Wait 60 Seconds
- Type and role: `n8n-nodes-base.wait`. Introduces a 60-second delay before each status check.
- Configuration choices: Wait amount is `60` seconds.
- Key expressions or variables used: None.
- Input and output connections:
- Input: Polling Loop
- Output: Check Scrape Status
- Version-specific requirements: `typeVersion: 1.1`
- Edge cases / failures:
- Long waits increase total execution duration.
- Depending on n8n hosting mode, wait behavior may require proper persistence/execution resumption support.
- Sub-workflow reference: None.
8) Check Scrape Status
- Type and role: `n8n-nodes-base.httpRequest`. Queries the ScraperCity status endpoint for the submitted run.
- Configuration choices:
  - Method: `GET`
  - URL uses the stored run ID
  - Authentication: generic header auth
- Key expressions or variables used:
=https://app.scrapercity.com/api/v1/scrape/status/{{ $('Store Run ID').item.json.runId }}
- Input and output connections:
- Input: Wait 60 Seconds
- Output: Is Scrape Complete?
- Version-specific requirements: `typeVersion: 4.2`
- Edge cases / failures:
  - Missing `runId`
  - Invalid API key
  - Run not found or expired
  - API temporarily unavailable
- Sub-workflow reference: None.
9) Is Scrape Complete?
- Type and role: `n8n-nodes-base.if`. Routes execution depending on whether the returned scrape status is complete.
- Configuration choices:
  - Condition checks whether `$json.status` equals `SUCCEEDED`
  - Strict type validation enabled
  - Case-sensitive comparison
- Key expressions or variables used:
={{ $json.status }}
- Input and output connections:
- Input: Check Scrape Status
- True output: Download Results CSV
- False output: Polling Loop
- Version-specific requirements: `typeVersion: 2.2`
- Edge cases / failures:
  - Any terminal non-success state such as `FAILED`, `CANCELLED`, or an unexpected status will loop forever, because only `SUCCEEDED` exits the loop.
  - If the API changes status naming or casing, the comparison may never match.
- Sub-workflow reference: None.
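To avoid the infinite-loop edge case noted above, a small Code node could sit between Check Scrape Status and Is Scrape Complete? to fail fast on terminal non-success states. The sketch below is an assumption-based hardening idea, not part of the original workflow; the status names `FAILED` and `CANCELLED` come from the edge cases listed above, and any other terminal names your account returns would need adding.

```javascript
// Classify a ScraperCity status into done / error / pending.
// Statuses other than the three handled cases are treated as "still running".
function classifyStatus(status) {
  if (status === 'SUCCEEDED') return 'done';
  if (status === 'FAILED' || status === 'CANCELLED') return 'error';
  return 'pending'; // keep polling
}

// Inside the n8n Code node you could then do:
// const status = $input.first().json.status;
// if (classifyStatus(status) === 'error') {
//   throw new Error(`ScraperCity run ended in terminal state: ${status}`);
// }
// return $input.all();
```

Throwing stops the execution with a visible error instead of polling forever, and pairs well with a maximum-attempt counter.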
2.4 Results Download, Parsing, and Storage
Overview
After success is confirmed, this block downloads the output CSV, converts it into structured contact records, removes duplicates, and writes the final dataset to Google Sheets.
Nodes Involved
- Download Results CSV
- Parse and Format CSV Results
- Remove Duplicate Contacts
- Write Results to Google Sheets
Node Details
10) Download Results CSV
- Type and role: `n8n-nodes-base.httpRequest`. Downloads the completed job results from ScraperCity.
- Configuration choices:
  - Method: `GET`
  - URL uses the stored `runId`
  - Response format: text
  - Authentication: generic header auth
- Key expressions or variables used:
=https://app.scrapercity.com/api/downloads/{{ $('Store Run ID').item.json.runId }}
- Input and output connections:
- Input: Is Scrape Complete? (true branch)
- Output: Parse and Format CSV Results
- Version-specific requirements: `typeVersion: 4.2`
- Edge cases / failures:
  - Missing `runId`
  - File not yet available despite a successful status check
  - Empty or non-CSV response
  - Auth or download endpoint failure
- Sub-workflow reference: None.
11) Parse and Format CSV Results
- Type and role: `n8n-nodes-base.code`. Parses CSV text manually into JSON records and performs an initial deduplication.
- Configuration choices: The script:
  - Reads CSV text from `json.data` or `json.body`
  - Returns an error object if no text is present
  - Splits content by newline
  - Parses the header row and data rows
  - Uses a custom CSV parser with basic quote handling
  - Builds one JSON item per row
  - Deduplicates using `name + primary_phone` (or fallback `phone`)
  - Returns an error item if no records are parsed
- Key expressions or variables used: `$input.first().json.data`, `$input.first().json.body`, `$('Store Run ID').item.json.runId`, and record fields such as `name`, `primary_phone`, `phone`
- Input and output connections:
- Input: Download Results CSV
- Output: Remove Duplicate Contacts
- Version-specific requirements: `typeVersion: 2`
- Edge cases / failures:
- CSV parsing is custom and may fail on complex CSV cases such as escaped quotes, embedded line breaks, or unusual delimiters.
- If the response is empty, an error object is emitted instead of normal contact rows.
- If headers differ from expected field names, deduplication quality may decrease.
- Sub-workflow reference: None.
12) Remove Duplicate Contacts
- Type and role: `n8n-nodes-base.removeDuplicates`. Performs a second deduplication step across parsed records.
- Configuration choices:
  - Compare mode: selected fields
  - Fields compared: `name`, `primary_phone`
- Key expressions or variables used: None.
- Input and output connections:
- Input: Parse and Format CSV Results
- Output: Write Results to Google Sheets
- Version-specific requirements: `typeVersion: 2`
- Edge cases / failures:
  - If parsed items contain error objects instead of contact records, those may still pass through and be treated as unique rows.
  - Records for the same person with slightly different phone formatting may not be recognized as duplicates.
- Sub-workflow reference: None.
13) Write Results to Google Sheets
- Type and role: `n8n-nodes-base.googleSheets`. Appends or updates records in a target Google Sheet.
- Configuration choices:
  - Operation: `appendOrUpdate`
  - Column mapping: auto-map input data
  - Sheet name: configured by ID expression placeholder
  - Document ID: configured by ID expression placeholder
- Key expressions or variables used: `sheetName.value = "="`, `documentId.value = "="` (empty expression placeholders)
- Input and output connections:
- Input: Remove Duplicate Contacts
- Output: none
- Version-specific requirements: `typeVersion: 4.6`
- Edge cases / failures:
  - As provided, neither the spreadsheet document ID nor the sheet identifier is configured; both must be filled in before use.
  - The OAuth2 credential may be missing or unauthorized.
  - Auto-mapping depends on header names matching incoming JSON fields or sheet columns.
  - `appendOrUpdate` may require a defined matching-column configuration depending on the n8n version and target sheet structure.
- Sub-workflow reference: None.
- Credential requirements: Requires a Google Sheets OAuth2 credential.
3. Summary Table
| Node Name | Node Type | Functional Role | Input Node(s) | Output Node(s) | Sticky Note |
|---|---|---|---|---|---|
| When clicking 'Execute workflow' | Manual Trigger | Manual start of the workflow | | Configure Search Inputs | ## How it works 1. You enter names, phone numbers, or emails in the Configure Search Inputs node. 2. The workflow submits a skip trace job to the ScraperCity People Finder API. 3. It polls for completion every 60 seconds (jobs take 10-60 min). 4. Once done, it downloads the CSV, parses each contact, deduplicates, and writes rows to Google Sheets. ## Setup steps 1. Create a Header Auth credential named ScraperCity API Key -- set header to Authorization, value to Bearer YOUR_KEY. 2. Connect a Google Sheets OAuth2 credential. 3. Edit Configure Search Inputs with your lookup targets. 4. Set your Google Sheet ID in Write Results to Google Sheets. 5. Click Execute workflow. |
| Configure Search Inputs | Set | Defines names, phones, emails, and result limit | When clicking 'Execute workflow' | Build Request Body | ## How it works 1. You enter names, phone numbers, or emails in the Configure Search Inputs node. 2. The workflow submits a skip trace job to the ScraperCity People Finder API. 3. It polls for completion every 60 seconds (jobs take 10-60 min). 4. Once done, it downloads the CSV, parses each contact, deduplicates, and writes rows to Google Sheets. ## Setup steps 1. Create a Header Auth credential named ScraperCity API Key -- set header to Authorization, value to Bearer YOUR_KEY. 2. Connect a Google Sheets OAuth2 credential. 3. Edit Configure Search Inputs with your lookup targets. 4. Set your Google Sheet ID in Write Results to Google Sheets. 5. Click Execute workflow. ## Configuration Enter comma-separated names, phones, or emails in Configure Search Inputs. Build Request Body converts them into arrays for the API. |
| Build Request Body | Code | Converts comma-separated inputs into API arrays | Configure Search Inputs | Submit Skip Trace Job | ## Configuration Enter comma-separated names, phones, or emails in Configure Search Inputs. Build Request Body converts them into arrays for the API. |
| Submit Skip Trace Job | HTTP Request | Creates the ScraperCity skip trace job | Build Request Body | Store Run ID | ## Submit Job Submit Skip Trace Job POSTs to the ScraperCity People Finder endpoint. Store Run ID saves the returned job ID for later polling. |
| Store Run ID | Set | Saves the returned run identifier | Submit Skip Trace Job | Polling Loop | ## Submit Job Submit Skip Trace Job POSTs to the ScraperCity People Finder endpoint. Store Run ID saves the returned job ID for later polling. |
| Polling Loop | Split In Batches | Loop controller for repeated polling | Store Run ID, Is Scrape Complete? | Wait 60 Seconds | ## Async Polling Loop Wait 60 Seconds pauses between checks. Check Scrape Status hits the status endpoint. Is Scrape Complete? routes to download on success or loops back. |
| Wait 60 Seconds | Wait | Delays before each status check | Polling Loop | Check Scrape Status | ## Async Polling Loop Wait 60 Seconds pauses between checks. Check Scrape Status hits the status endpoint. Is Scrape Complete? routes to download on success or loops back. |
| Check Scrape Status | HTTP Request | Retrieves job execution status from ScraperCity | Wait 60 Seconds | Is Scrape Complete? | ## Async Polling Loop Wait 60 Seconds pauses between checks. Check Scrape Status hits the status endpoint. Is Scrape Complete? routes to download on success or loops back. |
| Is Scrape Complete? | If | Routes success to download and non-success back to loop | Check Scrape Status | Download Results CSV, Polling Loop | ## Async Polling Loop Wait 60 Seconds pauses between checks. Check Scrape Status hits the status endpoint. Is Scrape Complete? routes to download on success or loops back. |
| Download Results CSV | HTTP Request | Downloads the CSV output file | Is Scrape Complete? | Parse and Format CSV Results | ## Async Polling Loop Wait 60 Seconds pauses between checks. Check Scrape Status hits the status endpoint. Is Scrape Complete? routes to download on success or loops back. |
| Parse and Format CSV Results | Code | Parses CSV text into structured contact items | Download Results CSV | Remove Duplicate Contacts | ## Download and Output Download Results CSV fetches the file. Parse and Format CSV Results splits into records. Remove Duplicate Contacts dedupes. Write Results to Google Sheets appends rows. |
| Remove Duplicate Contacts | Remove Duplicates | Removes duplicate contact rows | Parse and Format CSV Results | Write Results to Google Sheets | ## Download and Output Download Results CSV fetches the file. Parse and Format CSV Results splits into records. Remove Duplicate Contacts dedupes. Write Results to Google Sheets appends rows. |
| Write Results to Google Sheets | Google Sheets | Saves final records into a spreadsheet | Remove Duplicate Contacts | | ## Download and Output Download Results CSV fetches the file. Parse and Format CSV Results splits into records. Remove Duplicate Contacts dedupes. Write Results to Google Sheets appends rows. |
| Overview | Sticky Note | Workspace documentation / setup guidance | | | |
| Section - Configuration | Sticky Note | Workspace annotation for input preparation block | | | |
| Section - Submit Job | Sticky Note | Workspace annotation for submission block | | | |
| Section - Async Polling Loop | Sticky Note | Workspace annotation for polling block | | | |
| Section - Download and Output | Sticky Note | Workspace annotation for result handling block | | | |
4. Reproducing the Workflow from Scratch
- Create a new workflow in n8n.
- Add a Manual Trigger node
  - Type: Manual Trigger
  - Name it: When clicking 'Execute workflow'
- Add a Set node
  - Type: Set
  - Name it: Configure Search Inputs
  - Create these fields:
    - `names` as String, example: `John Smith,Jane Doe`
    - `phones` as String, leave blank if unused
    - `emails` as String, leave blank if unused
    - `max_results` as Number, set to `1`
  - Connect: When clicking 'Execute workflow' → Configure Search Inputs
- Add a Code node
  - Type: Code
  - Name it: Build Request Body
  - Paste logic that:
    - Splits comma-separated strings into arrays
    - Trims whitespace
    - Removes empty entries
    - Returns a single JSON object with `name`, `email`, `phone_number`, `street_city`, `state`, and `zip` (as `[]`), plus `max_results`
  - Equivalent behavior:
    - Read from the Set node output
    - Convert `names`, `emails`, `phones`
    - Default `max_results` to `1` if missing
  - Connect: Configure Search Inputs → Build Request Body
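The step above can be sketched as follows. This is a minimal version consistent with the described behavior, not the verbatim original script; the exact set of empty location fields (`street_city`, `state`, `zip`) is an assumption based on the field list above. The transform is a plain function so it can be tested outside n8n, with the Code-node wrapper shown in the trailing comment.

```javascript
// Split "a, b ,,c" into ["a", "b", "c"]: split on commas, trim, drop empties.
function splitList(value) {
  return String(value || '')
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Build the request body expected by the People Finder endpoint.
// Field names follow the description above; location fields are sent
// empty because the workflow configures no location inputs (assumption).
function buildRequestBody(cfg) {
  return {
    name: splitList(cfg.names),
    email: splitList(cfg.emails),
    phone_number: splitList(cfg.phones),
    street_city: [],
    state: [],
    zip: [],
    max_results: Number(cfg.max_results) || 1, // default to 1 if missing
  };
}

// Inside the n8n Code node (typeVersion 2):
// const cfg = $input.first().json;
// return [{ json: buildRequestBody(cfg) }];
```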
- Create the ScraperCity authentication credential
  - Credential type: Header Auth
  - Suggested name: ScraperCity API Key
  - Header name: `Authorization`
  - Header value: `Bearer YOUR_KEY`
  - Replace `YOUR_KEY` with the real ScraperCity API key.
- Add an HTTP Request node
  - Type: HTTP Request
  - Name it: Submit Skip Trace Job
  - Method: `POST`
  - URL: `https://app.scrapercity.com/api/v1/scrape/people-finder`
  - Authentication: Generic Credential Type
  - Generic Auth Type: Header Auth
  - Select the ScraperCity API Key credential
  - Send Body: enabled
  - Body Content Type / Specify Body: JSON
  - JSON body expression: `{{ JSON.stringify($json) }}`
  - Connect: Build Request Body → Submit Skip Trace Job
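For debugging credentials outside n8n, the node's request can be reproduced directly. This sketch assumes the endpoint and header shape described above and a payload built per the Build Request Body description; the sample values are placeholders, not real data.

```javascript
// Example payload matching the structure Build Request Body produces
// (field names per the workflow description; values are placeholders).
const payload = {
  name: ['John Smith'],
  email: [],
  phone_number: [],
  street_city: [],
  state: [],
  zip: [],
  max_results: 1,
};

// Request options equivalent to the Submit Skip Trace Job node's config.
const request = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: 'Bearer YOUR_KEY', // replace with your real key
  },
  body: JSON.stringify(payload),
};

// Uncomment to actually submit a job (requires network access and a valid key):
// const res = await fetch('https://app.scrapercity.com/api/v1/scrape/people-finder', request);
// const { runId } = await res.json();
```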
- Add a Set node
  - Type: Set
  - Name it: Store Run ID
  - Add field: `runId` as String, value `{{ $json.runId }}`
  - Connect: Submit Skip Trace Job → Store Run ID
- Add a Split In Batches node
  - Type: Split In Batches
  - Name it: Polling Loop
  - Batch Size: `1`
  - Reset: `false`
  - Use it purely as a loop controller
  - Connect: Store Run ID → Polling Loop
- Add a Wait node
  - Type: Wait
  - Name it: Wait 60 Seconds
  - Wait amount: `60` seconds
  - Connect: Polling Loop → Wait 60 Seconds
- Add an HTTP Request node
  - Type: HTTP Request
  - Name it: Check Scrape Status
  - Method: `GET`
  - URL expression: `https://app.scrapercity.com/api/v1/scrape/status/{{ $('Store Run ID').item.json.runId }}`
  - Authentication: Generic Credential Type
  - Generic Auth Type: Header Auth
  - Use the same ScraperCity API Key credential
  - Connect: Wait 60 Seconds → Check Scrape Status
- Add an If node
  - Type: If
  - Name it: Is Scrape Complete?
  - Condition:
    - Left value: `{{ $json.status }}`
    - Operator: equals
    - Right value: `SUCCEEDED`
  - Keep case-sensitive comparison enabled
  - Connect: Check Scrape Status → Is Scrape Complete?
- Create the polling loopback
  - Connect the false output of Is Scrape Complete? back to Polling Loop
  - Connect the true output later to the download step
  - Note: this design keeps polling every 60 seconds until the status becomes exactly `SUCCEEDED`
- Add an HTTP Request node
  - Type: HTTP Request
  - Name it: Download Results CSV
  - Method: `GET`
  - URL expression: `https://app.scrapercity.com/api/downloads/{{ $('Store Run ID').item.json.runId }}`
  - Authentication: Generic Credential Type
  - Generic Auth Type: Header Auth
  - Use the same ScraperCity API Key credential
  - Response format: Text
  - Connect: Is Scrape Complete? true output → Download Results CSV
- Add a Code node
  - Type: Code
  - Name it: Parse and Format CSV Results
  - Implement logic that:
    - Reads the text response from `json.data` or `json.body`
    - Returns an error item if no CSV text exists
    - Splits the CSV into lines
    - Parses headers and data rows
    - Builds a JSON object for each record
    - Deduplicates using `name` + `primary_phone`, falling back to `phone`
    - Returns at least one error item if parsing yields nothing
  - Connect: Download Results CSV → Parse and Format CSV Results
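The parsing logic above can be sketched as follows. This is a minimal parser consistent with the described behavior, not the verbatim original script: it handles quoted fields and escaped quotes but, as the edge cases note, not embedded line breaks; the `name`/`primary_phone`/`phone` field names are taken from the workflow description.

```javascript
// Parse one CSV line with basic quote handling ("" inside quotes = escaped quote).
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '"') {
      if (inQuotes && line[i + 1] === '"') { current += '"'; i++; } // escaped quote
      else inQuotes = !inQuotes;
    } else if (ch === ',' && !inQuotes) {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields.map((f) => f.trim());
}

// Parse the full CSV text into records, deduplicating on name + primary_phone
// (falling back to phone), and return error items on empty input.
function parseContacts(csvText) {
  const lines = String(csvText || '').split(/\r?\n/).filter((l) => l.length > 0);
  if (lines.length < 2) return [{ error: 'No CSV data returned' }];
  const headers = parseCsvLine(lines[0]);
  const seen = new Set();
  const records = [];
  for (const line of lines.slice(1)) {
    const values = parseCsvLine(line);
    const record = {};
    headers.forEach((h, i) => { record[h] = values[i] ?? ''; });
    const key = `${record.name}|${record.primary_phone || record.phone || ''}`;
    if (!seen.has(key)) { seen.add(key); records.push(record); }
  }
  return records.length ? records : [{ error: 'No records parsed' }];
}

// Inside the n8n Code node:
// const text = $input.first().json.data || $input.first().json.body;
// return parseContacts(text).map((r) => ({ json: r }));
```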
- Add a Remove Duplicates node
  - Type: Remove Duplicates
  - Name it: Remove Duplicate Contacts
  - Compare mode: Selected Fields
  - Fields to compare: `name`, `primary_phone`
  - Connect: Parse and Format CSV Results → Remove Duplicate Contacts
- Create a Google Sheets OAuth2 credential
  - Credential type: Google Sheets OAuth2 API
  - Authenticate with the Google account that owns or can edit the target spreadsheet
- Prepare the destination spreadsheet
  - Create or choose a Google Sheet
  - Add column headers matching expected CSV keys where possible, such as `name` and `primary_phone`, plus any additional fields returned by ScraperCity
  - Ensure the target tab exists
- Add a Google Sheets node
  - Type: Google Sheets
  - Name it: Write Results to Google Sheets
  - Credential: your Google Sheets OAuth2 credential
  - Operation: Append or Update
  - Mapping mode: Auto-map input data
  - Set:
    - Document ID to your spreadsheet ID
    - Sheet Name or sheet identifier to the correct tab
  - Connect: Remove Duplicate Contacts → Write Results to Google Sheets
- Optional but recommended hardening
  - Add a maximum retry counter to avoid endless polling
  - Add explicit handling for `FAILED`, `CANCELLED`, or unknown statuses
  - Add an error filter before Google Sheets so error objects are not written as rows
  - Normalize phone number formatting before duplicate removal
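For the phone-normalization suggestion above, a Code node placed before Remove Duplicate Contacts could canonicalize `primary_phone` so that differently formatted numbers dedupe as one. This is a hedged sketch: the field name matches the dedup step's configuration, but the exact normalization rules (digits only, strip a leading US country code) are an assumption.

```javascript
// Reduce a phone string to bare digits, dropping a leading US "1" country
// code so "1-555-010-0123" and "(555) 010-0123" normalize identically.
function normalizePhone(phone) {
  const digits = String(phone || '').replace(/\D/g, '');
  return digits.length === 11 && digits.startsWith('1') ? digits.slice(1) : digits;
}

// Inside the n8n Code node:
// return $input.all().map((item) => ({
//   json: { ...item.json, primary_phone: normalizePhone(item.json.primary_phone) },
// }));
```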
- Add sticky notes if desired
  - One overview note with setup instructions
  - One note for configuration
  - One note for submission
  - One note for polling
  - One note for download/output
- Test the workflow
  - Enter real names, phones, or emails in Configure Search Inputs
  - Run the workflow manually
  - Confirm:
    - A `runId` is returned
    - Status polling works
    - The CSV downloads after success
    - Parsed rows appear in Google Sheets
Expected Inputs and Outputs
- Workflow input: manual execution plus configured values in the Set node
- External API input: JSON payload with arrays for `name`, `email`, `phone_number`
- Intermediate key output: `runId`
- Final output: one or more rows appended or updated in Google Sheets
Sub-workflow Setup
This workflow contains no sub-workflows and does not invoke any external n8n workflow.
5. General Notes & Resources
| Note Content | Context or Link |
|---|---|
| How it works: Enter names, phone numbers, or emails in the Configure Search Inputs node; the workflow submits a skip trace job to the ScraperCity People Finder API; it polls for completion every 60 seconds (jobs take 10–60 min); once done, it downloads the CSV, parses each contact, deduplicates, and writes rows to Google Sheets. | Workspace overview note |
| Setup steps: Create a Header Auth credential named ScraperCity API Key with header Authorization and value Bearer YOUR_KEY; connect a Google Sheets OAuth2 credential; edit Configure Search Inputs; set your Google Sheet ID in Write Results to Google Sheets; click Execute workflow. | Workspace setup guidance |
| Configuration: Enter comma-separated names, phones, or emails in Configure Search Inputs. Build Request Body converts them into arrays for the API. | Configuration block note |
| Submit Job: Submit Skip Trace Job POSTs to the ScraperCity People Finder endpoint. Store Run ID saves the returned job ID for later polling. | Submission block note |
| Async Polling Loop: Wait 60 Seconds pauses between checks. Check Scrape Status hits the status endpoint. Is Scrape Complete? routes to download on success or loops back. | Polling block note |
| Download and Output: Download Results CSV fetches the file. Parse and Format CSV Results splits into records. Remove Duplicate Contacts dedupes. Write Results to Google Sheets appends rows. | Output block note |