klbr/n8nworkflows.xyz

mirror of https://github.com/khoaliber/n8nworkflows.xyz.git synced 2026-04-22 09:08:03 +00:00

nusquama f7092abecf creation

2025-11-12 18:37:24 +01:00

Manipulate PDF with Adobe developer API

https://n8nworkflows.xyz/workflows/manipulate-pdf-with-adobe-developer-api-2424

1. Workflow Overview

This workflow serves as a comprehensive, generic wrapper to interact with the Adobe PDF Services API, enabling various PDF manipulations such as splitting, combining, OCR, page operations, and content extraction. It encapsulates the multi-step Adobe API process, which includes authentication, asset registration, file upload, query execution, result polling, and downloading the transformed PDF or related data.

Target Use Cases:

Automating PDF transformations using Adobe’s API from other workflows or manual triggers.
Extracting clean PDF content for AI and Retrieval-Augmented Generation (RAG) systems, e.g., extracting tables as images for improved AI recognition.
Providing developers with a reusable, modular integration with Adobe PDF Services.

Logical Blocks:

1.1 Input Reception & Setup: Triggering the workflow and preparing input data and query parameters.
1.2 Authentication: Obtaining a temporary access token from Adobe for API calls.
1.3 Asset Registration & Upload: Registering a new PDF asset and uploading the file to Adobe’s cloud storage.
1.4 Query Processing: Sending the transformation request to Adobe and managing asynchronous processing.
1.5 Result Handling: Polling for completion, downloading the output, and forwarding results back.

2. Block-by-Block Analysis

1.1 Input Reception & Setup

Overview:
This block receives the initial trigger to start the workflow and prepares the Adobe API query parameters alongside input PDF data.

Nodes Involved:

When clicking ‘Test workflow’ (Manual Trigger)
Load a test pdf file (Dropbox node)
Adobe API Query (Set node)
Query + File (Merge node)

Node Details:

When clicking ‘Test workflow’
- Type: Manual Trigger
- Role: Entry point for manual testing.
- Inputs: None
- Outputs: Triggers the "Load a test pdf file" and "Adobe API Query" nodes.
- Failure modes: None significant; manual trigger.
Load a test pdf file
- Type: Dropbox (Download operation)
- Role: Downloads a test PDF file from Dropbox for processing.
- Configuration: Path set to a specific PDF in Dropbox, OAuth2 authentication used.
- Inputs: Triggered by manual trigger node.
- Outputs: Binary PDF data.
- Failure modes: OAuth2 token expiry, file not found, network errors.
Adobe API Query
- Type: Set node
- Role: Prepares the JSON payload and the target endpoint for the Adobe API request.
- Configuration:
  - endpoint: set to "extractpdf" (example use case).
  - json_payload: JSON object specifying what to extract, here “tables” and “text”.
- Inputs: Triggered by the manual trigger; consumes no input data.
- Outputs: Passes JSON data downstream.
- Failure modes: Expression evaluation failure if the JSON is malformed.
Query + File
- Type: Merge (combine by position)
- Role: Combines the query parameters and the PDF file binary data into one data object for subsequent processing.
- Inputs: From "Adobe API Query" and "Load a test pdf file" nodes.
- Outputs: Merged JSON + binary object.
- Failure modes: Mismatched input positions or missing inputs.
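
The "combine by position" behavior of the Query + File node can be illustrated with a minimal sketch. It assumes items are plain dictionaries, that the binary data lives under a hypothetical `binary` key, and that unpaired items are dropped; none of these details are confirmed by the workflow export.

```python
from typing import Any


def merge_by_position(
    left: list[dict[str, Any]], right: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Pair items by index and merge their fields.

    Assumption: on a key clash the right-hand item wins, and items
    without a partner at the same index are dropped (zip truncates).
    """
    return [{**a, **b} for a, b in zip(left, right)]


# Hypothetical item shapes: one query item and one file item.
query_items = [
    {"endpoint": "extractpdf", "json_payload": {"elementsToExtract": ["text", "tables"]}}
]
file_items = [{"binary": b"%PDF-1.7 example bytes"}]

combined = merge_by_position(query_items, file_items)
```

If either input produces no item, the merged list is empty, which matches the "missing inputs" failure mode listed for this node.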

1.2 Authentication

Overview:
Obtains the temporary access token from Adobe that all subsequent API calls require.

Nodes Involved:

Authenticartion (get token) (HTTP Request)

Node Details:

Authenticartion (get token)
- Type: HTTP Request (POST)
- Role: Exchanges client credentials for a temporary access token.
- Configuration:
  - URL: https://pdf-services.adobe.io/token
  - Content-Type: application/x-www-form-urlencoded
  - Uses a custom HTTP credential containing client_id and client_secret in the body.
- Inputs: Triggered after input preparation.
- Outputs: JSON object containing access_token.
- Failure modes: Invalid credentials, network errors. The returned token is temporary, so later calls in a long-running execution can fail once it expires.
- Credential note: Requires a "Custom Auth" credential with body parameters client_id and client_secret.
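
As a rough illustration of the token exchange described above, the following sketch builds the same form-encoded POST with the Python standard library. The function name is hypothetical, and the sketch only constructs the request; it does not send it.

```python
import urllib.parse
import urllib.request

TOKEN_URL = "https://pdf-services.adobe.io/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the form-encoded POST that exchanges credentials for a token."""
    body = urllib.parse.urlencode(
        {"client_id": client_id, "client_secret": client_secret}
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder credentials, not real values.
token_request = build_token_request("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
```

Sending the request (for example with `urllib.request.urlopen`) and reading `access_token` from the JSON response would correspond to the node's output.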

1.3 Asset Registration & Upload

Overview:
Registers a new asset to host the PDF on Adobe servers and uploads the PDF binary data to this asset.

Nodes Involved:

Create Asset (HTTP Request)
Query + File + Asset information (Merge node)
Upload PDF File (asset) (HTTP Request)

Node Details:

Create Asset
- Type: HTTP Request (POST)
- Role: Registers a new asset with Adobe by sending a POST request with mediaType "application/pdf".
- Configuration:
  - URL: https://pdf-services.adobe.io/assets
  - Auth: Authorization: Bearer {{access_token}} header, plus the Header Auth credential that supplies X-API-Key (see Sticky Note4)
  - Sends JSON body { "mediaType": "application/pdf" }
- Inputs: Access token from Authentication node.
- Outputs: JSON response containing asset metadata including assetID and uploadUri.
- Failure modes: Auth errors, rate limits, malformed requests.
Query + File + Asset information
- Type: Merge (combine by position)
- Role: Combines the original query + file data with the newly created asset info.
- Inputs: From "Query + File" and "Create Asset" nodes.
- Outputs: Merged object including assetID and uploadUri for upload.
- Failure modes: Input mismatch, missing asset info.
Upload PDF File (asset)
- Type: HTTP Request (PUT)
- Role: Uploads the PDF binary data to the asset’s upload URI.
- Configuration:
  - URL: dynamic, based on uploadUri from asset registration.
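  - Body content type: binaryData (n8n's binary-file body option; this is a node setting, not the value of the HTTP Content-Type header)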
  - Input data field: binary PDF file data from merged input.
- Inputs: Merged data containing uploadUri and binary file.
- Outputs: Confirmation of upload success.
- Failure modes: Upload failures, network issues, invalid uploadUri.
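
The two HTTP calls in this block can be sketched as follows. This is an illustration under assumptions rather than the workflow's exact configuration: the function names are hypothetical, and sending `Content-Type: application/pdf` on the upload is assumed to match the registered mediaType.

```python
import json
import urllib.request

ASSETS_URL = "https://pdf-services.adobe.io/assets"


def build_create_asset_request(access_token: str, client_id: str) -> urllib.request.Request:
    """Build the POST that registers a new PDF asset."""
    return urllib.request.Request(
        ASSETS_URL,
        data=json.dumps({"mediaType": "application/pdf"}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "X-API-Key": client_id,
            "Content-Type": "application/json",
        },
    )


def build_upload_request(upload_uri: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build the PUT that sends the raw PDF bytes to the asset's uploadUri.

    Assumption: the Content-Type header mirrors the registered mediaType.
    """
    return urllib.request.Request(
        upload_uri,
        data=pdf_bytes,
        method="PUT",
        headers={"Content-Type": "application/pdf"},
    )
```

The `uploadUri` and `assetID` values come from the JSON response to the first request, which is why the workflow merges that response back into the query and file data before uploading.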

1.4 Query Processing

Overview:
Submits the requested PDF transformation query to Adobe, waits for processing, and polls for completion.

Nodes Involved:

Process Query (HTTP Request)
Wait 5 second (Wait node)
Try to download the result (HTTP Request)
Switch (Switch node)

Node Details:

Process Query
- Type: HTTP Request (POST)
- Role: Calls the Adobe API operation endpoint for the requested transformation.
- Configuration:
  - URL: https://pdf-services.adobe.io/operation/{{endpoint}} where endpoint is dynamic.
  - Body: JSON combining assetID and the original json_payload.
  - Auth: Bearer token header.
  - Specifies full HTTP response to get headers (used for polling).
- Inputs: Merged query + file + asset data.
- Outputs: HTTP response with 'location' header for polling result URL.
- Failure modes: Auth errors, invalid payload, server errors.
Wait 5 second
- Type: Wait node
- Role: Delay between polling attempts to avoid overloading Adobe.
- Configuration: 5 seconds delay.
- Inputs: Chain from Process Query or Switch node.
- Outputs: Triggers next polling attempt.
- Failure modes: None likely.
Try to download the result
- Type: HTTP Request (GET)
- Role: Polls the URL in 'location' header for processing status or final result.
- Configuration:
  - URL: dynamic from Process Query response header location.
  - Auth: Bearer token header.
- Inputs: After wait node.
- Outputs: JSON containing status and possibly download URL.
- Failure modes: 404 if not ready, network errors, token expiry.
Switch
- Type: Switch node
- Role: Routes output based on the status field (in progress, failed, or anything else).
- Configuration: Checks JSON status field to determine next step.
- Inputs: From download attempt node.
- Outputs:
  - If "in progress": loops back to Wait 5 second (poll again).
  - If "failed" or any other status, including successful completion: forwards the response to Forward response to origin workflow (end).
- Failure modes: Missing or unexpected status fields.
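
The wait, poll, and switch loop in this block can be sketched as a small function. The status fetcher is injected so the sketch runs without network access. The `max_attempts` cap is an addition for safety and is not part of the described workflow, and the `"done"` status value in the usage example is an assumption.

```python
import time
from typing import Any, Callable


def poll_until_settled(
    fetch_status: Callable[[], dict[str, Any]],
    delay_seconds: float = 5.0,
    max_attempts: int = 60,  # assumption: the workflow itself has no cap
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Wait, fetch the job status, and repeat while it is "in progress".

    Any other status (failed, completed, or unexpected) ends the loop and
    is returned to the caller, mirroring the Switch node's routing.
    """
    for _ in range(max_attempts):
        sleep(delay_seconds)
        result = fetch_status()
        if result.get("status") != "in progress":
            return result
    raise TimeoutError(f"job still in progress after {max_attempts} polls")


# Usage with canned responses instead of real HTTP calls.
fake_responses = iter(
    [{"status": "in progress"}, {"status": "in progress"}, {"status": "done"}]
)
recorded_waits: list[float] = []
final = poll_until_settled(lambda: next(fake_responses), sleep=recorded_waits.append)
```

In a real client, `fetch_status` would issue the GET against the URL taken from the `location` header of the Process Query response.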

1.5 Result Handling

Overview:
Forwards the final output or error back to the calling workflow or user.

Nodes Involved:

Forward response to origin workflow (Set node)

Node Details:

Forward response to origin workflow
- Type: Set node
- Role: Passes through the final JSON data from Adobe API to the outer workflow or caller.
- Configuration: Includes all fields as is without modification.
- Inputs: From the Switch node, on either failure or successful completion.
- Outputs: Final output of the workflow.
- Failure modes: None significant; pass-through node.

3. Summary Table

| Node Name | Node Type | Functional Role | Input Node(s) | Output Node(s) | Sticky Note |
|---|---|---|---|---|---|
| When clicking ‘Test workflow’ | Manual Trigger | Manual workflow start trigger | — | Load a test pdf file, Adobe API Query | |
| Load a test pdf file | Dropbox | Downloads test PDF from Dropbox | When clicking ‘Test workflow’ | Query + File | |
| Adobe API Query | Set | Prepares endpoint and payload JSON | When clicking ‘Test workflow’ | Query + File | |
| Query + File | Merge | Merges query and PDF file data | Adobe API Query, Load a test pdf file | Authenticartion (get token), Query + File + Asset information | |
| Authenticartion (get token) | HTTP Request | Retrieves Adobe API access token | Query + File, Execute Workflow Trigger | Create Asset | See Sticky Note3: Custom Auth credential with client_id and client_secret in body. |
| Create Asset | HTTP Request | Registers new PDF asset on Adobe | Authenticartion (get token) | Query + File + Asset information | |
| Query + File + Asset information | Merge | Combines query, file, and asset info | Query + File, Create Asset | Upload PDF File (asset) | |
| Upload PDF File (asset) | HTTP Request | Uploads PDF binary to Adobe asset | Query + File + Asset information | Process Query | |
| Process Query | HTTP Request | Sends transformation query to Adobe | Upload PDF File (asset) | Wait 5 second | |
| Wait 5 second | Wait | Delays for 5 seconds between polls | Process Query, Switch | Try to download the result | Sticky Note2: "Wait for file do be processed" |
| Try to download the result | HTTP Request | Polls for query result or status | Wait 5 second | Switch | |
| Switch | Switch | Routes based on processing status | Try to download the result | Wait 5 second (if in progress), Forward response to origin workflow (if failed or complete) | |
| Forward response to origin workflow | Set | Passes final result to caller | Switch | — | |
| Execute Workflow Trigger | Execute Workflow Trigger | Allows external workflows to call this wrapper | — | Authenticartion (get token), Query + File + Asset information | |
| Sticky Note | Sticky Note | Documentation and guidance | — | — | See section 5 for full sticky note contents |
| Sticky Note1 | Sticky Note | Development testing note | — | — | |
| Sticky Note2 | Sticky Note | Explains wait node | — | — | "Wait for file do be processed" |
| Sticky Note3 | Sticky Note | Details credential for token request | — | — | Custom Auth credential example with client_id and client_secret body |
| Sticky Note4 | Sticky Note | Details credential for API queries | — | — | Header Auth credential example with X-API-Key |
| Sticky Note5 | Sticky Note | Explains workflow input format | — | — | Examples of endpoint and json_payload formats |

4. Reproducing the Workflow from Scratch

Create Manual Trigger Node:
- Node type: Manual Trigger
- Purpose: Start workflow manually for testing.
Add Dropbox node to download test PDF:
- Node type: Dropbox
- Operation: Download
- Path: Set to desired PDF file path in Dropbox.
- Authentication: OAuth2 with Dropbox credentials.
Add Set node to prepare Adobe API query:
- Node type: Set
- Fields:
  - endpoint: string, e.g., "extractpdf"
  - json_payload: JSON object with desired extraction params, e.g., { "renditionsToExtract": ["tables"], "elementsToExtract": ["text","tables"] }
Merge node to combine query and PDF file:
- Node type: Merge
- Mode: Combine by position (mergeByPosition)
- Inputs: Connect outputs of Set node and Dropbox node.
Add HTTP Request node for Adobe Authentication:
- Node type: HTTP Request
- Method: POST
- URL: https://pdf-services.adobe.io/token
- Content-Type: application/x-www-form-urlencoded
- Authentication: Use a Custom Auth credential configured as:
```
{
  "headers": {
    "Content-Type": "application/x-www-form-urlencoded"
  },
  "body": {
    "client_id": "YOUR_CLIENT_ID",
    "client_secret": "YOUR_CLIENT_SECRET"
  }
}
```
- Input: Connect from Merge node.
Add HTTP Request node to Create Asset:
- Node type: HTTP Request
- Method: POST
- URL: https://pdf-services.adobe.io/assets
- Authentication: Send the header Authorization: Bearer {{access_token}} (access_token from the previous node) and attach the Header Auth credential that supplies the X-API-Key header.
- Body: JSON { "mediaType": "application/pdf" }
- Input: Connect from Authentication node.
Add Merge node to combine query+file with asset info:
- Node type: Merge
- Mode: Combine by position
- Inputs: From "Query + File" merge node and "Create Asset" node.
Add HTTP Request node to Upload PDF File:
- Node type: HTTP Request
- Method: PUT
- URL: Dynamic from asset info uploadUri
- Body content type: binaryData (n8n's binary-file body option, not an HTTP Content-Type header value)
- Input Data Field: Set to binary PDF file data
- Input: Connect from previous Merge node.
Add HTTP Request node to Process Query:
- Node type: HTTP Request
- Method: POST
- URL: https://pdf-services.adobe.io/operation/{{endpoint}} (endpoint from merged data)
- Authentication: Header with Bearer token.
- Body: JSON merging assetID and json_payload.
- Input: Connect from Upload PDF node.
Add Wait node:
- Node type: Wait
- Duration: 5 seconds
- Input: Connect from Process Query node.
Add HTTP Request node to Try Download Result:
- Node type: HTTP Request
- Method: GET
- URL: From location header of Process Query response
- Authentication: Bearer token header
- Input: Connect from Wait node.
Add Switch node:
- Node type: Switch
- Check JSON field status
- Condition branches:
  - If in progress: Connect back to Wait node (loop).
  - If failed or any other status (including completion): Forward output to next node.
Add Set node to Forward response:
- Node type: Set
- Purpose: Pass final output downstream or back to calling workflow.
- Input: Connect from Switch node.
Optional: Add Execute Workflow Trigger node if you want this workflow callable from others.
Create Credentials:
- Custom Auth credential for token request with client_id and client_secret in body.
- Header Auth credential for other API calls with X-API-Key header (same as client_id).
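
To make the Process Query step above more concrete, here is a hedged sketch of how the operation request could be assembled from the merged data. The function name is hypothetical, and the asset ID in the usage example is a placeholder, not a real identifier format.

```python
import json
import urllib.request
from typing import Any

OPERATION_BASE_URL = "https://pdf-services.adobe.io/operation/"


def build_operation_request(
    endpoint: str,
    asset_id: str,
    json_payload: dict[str, Any],
    access_token: str,
    client_id: str,
) -> urllib.request.Request:
    """Build the POST for an operation such as "extractpdf".

    The body is json_payload plus assetID; assetID is written last so the
    freshly registered asset always wins over any stray key in the payload.
    """
    body = {**json_payload, "assetID": asset_id}
    return urllib.request.Request(
        OPERATION_BASE_URL + endpoint,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "X-API-Key": client_id,
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
operation_request = build_operation_request(
    "extractpdf",
    "example-asset-id",
    {"renditionsToExtract": ["tables"], "elementsToExtract": ["text", "tables"]},
    "ACCESS_TOKEN",
    "YOUR_CLIENT_ID",
)
```

The response to this request carries the `location` header that the polling step reads, which is why the node is configured to return the full HTTP response rather than just the body.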

5. General Notes & Resources

| Note Content | Context or Link |
|---|---|
| Adobe API Wrapper workflow documentation and official API docs: https://developer.adobe.com/document-services/docs/overview/pdf-services-api/howtos/ and https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/gettingstarted/ | Sticky Note at workflow start |
| Credential setup for token request requires creating a Custom Auth credential with client_id and client_secret in the body, Content-Type `application/x-www-form-urlencoded`. | Sticky Note3 |
| Credential setup for all other Adobe API queries requires Header Auth credential with `X-API-Key` header set to client_id. | Sticky Note4 |
| Workflow input expects an object with `endpoint` string, `json_payload` object (excluding assetID), and PDF binary data as input. Examples provided for `splitpdf` and `extractpdf` endpoints. | Sticky Note5 |
| Typical use case: extracting tables as images from PDFs to forward to AI systems for improved recognition accuracy. | Workflow overview section |
| The workflow is designed to be used as a sub-workflow via Execute Workflow node or manually triggered for testing. | Node "Execute Workflow Trigger" presence |
| Rate limits: Adobe free tier allows up to 500 PDF operations per month. | Workflow description |
| Polling delay is set to 5 seconds to avoid overwhelming Adobe API during processing. | Sticky Note2 |
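
The input contract noted for Sticky Note5 can be expressed as a small check that a calling workflow might run before invoking the wrapper. This is a sketch only: the function name and the `binary` key used for the PDF bytes are hypothetical.

```python
from typing import Any


def validate_workflow_input(item: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems: list[str] = []

    endpoint = item.get("endpoint")
    if not isinstance(endpoint, str) or not endpoint:
        problems.append("endpoint must be a non-empty string")

    payload = item.get("json_payload")
    if not isinstance(payload, dict):
        problems.append("json_payload must be an object")
    elif "assetID" in payload:
        # The wrapper adds assetID itself after registering the asset.
        problems.append("json_payload must not contain assetID")

    pdf = item.get("binary")  # hypothetical key for the PDF bytes
    if not isinstance(pdf, (bytes, bytearray)) or not pdf:
        problems.append("binary must hold non-empty PDF bytes")

    return problems
```

Checking the input up front surfaces a malformed request before any of the 500 monthly free-tier operations is spent on it.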

This document fully describes the workflow’s logic, node configuration, and integration points to enable seamless reproduction, modification, and error anticipation.
