POST /parse
import os
from reducto import Reducto

client = Reducto(
    api_key=os.environ.get("REDUCTO_API_KEY"),  # This is the default and can be omitted
)
parse_response = client.parse.run(
    document_url="string",
)
print(parse_response.job_id)
{
  "job_id": "<string>",
  "duration": 123,
  "pdf_url": "<string>",
  "usage": {
    "num_pages": 123
  },
  "result": {
    "type": "full",
    "chunks": [
      {
        "content": "<string>",
        "embed": "<string>",
        "enriched": "<string>",
        "enrichment_success": false,
        "blocks": [
          {
            "type": "Header",
            "bbox": {
              "left": 123,
              "top": 123,
              "width": 123,
              "height": 123,
              "page": 123,
              "original_page": 123
            },
            "content": "<string>",
            "image_url": "<string>"
          }
        ]
      }
    ],
    "ocr": {
      "words": [
        {
          "text": "<string>",
          "bbox": {
            "left": 123,
            "top": 123,
            "width": 123,
            "height": 123,
            "page": 123,
            "original_page": 123
          }
        }
      ],
      "lines": [
        {
          "text": "<string>",
          "bbox": {
            "left": 123,
            "top": 123,
            "width": 123,
            "height": 123,
            "page": 123,
            "original_page": 123
          }
        }
      ]
    },
    "custom": "<any>"
  }
}
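
To show how the nested fields in the sample response fit together, here is a minimal sketch that walks a decoded full result and groups block content by block type. It assumes the response has already been decoded into a plain dict with the field names shown in the sample; `blocks_by_type` is a hypothetical helper, not part of the SDK.

```python
from collections import defaultdict
from typing import Any, Dict, List


def blocks_by_type(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group block content by block type (hypothetical helper).

    Assumes `response` is a decoded full result shaped like the sample:
    result -> chunks -> blocks, each block having "type" and "content".
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for chunk in response["result"].get("chunks", []):
        for block in chunk.get("blocks", []):
            grouped[block["type"]].append(block["content"])
    return dict(grouped)


# Illustrative data only; the values are made up for this sketch.
sample = {
    "job_id": "job-1",
    "result": {
        "type": "full",
        "chunks": [
            {
                "content": "Title\nBody text",
                "blocks": [
                    {"type": "Header", "content": "Title"},
                    {"type": "Text", "content": "Body text"},
                ],
            }
        ],
    },
}

print(blocks_by_type(sample))
```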

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
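
As a rough illustration of what this header looks like in a raw request, the sketch below builds a POST /parse request with Python's standard library without sending it. The base URL is a placeholder assumption (this page does not state the API host), and `build_parse_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: placeholder host. This page does not state the API base URL.
BASE_URL = "https://api.example.com"


def build_parse_request(
    token: str, document_url: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /parse request with a Bearer token."""
    body = json.dumps({"document_url": document_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/parse",
        data=body,
        method="POST",
        headers={
            # Header form described above: "Bearer <token>"
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_parse_request("my-token", "https://example.com/doc.pdf")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would send it; the SDK client shown earlier takes care of this header for you when given an API key.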

Body

application/json
document_url
required

The URL of the document to be processed. You can provide one of the following:

  1. A publicly available URL
  2. A presigned S3 URL
  3. A URL with the reducto:// prefix, returned by the /upload endpoint after you upload a document directly
options
object
advanced_options
object
experimental_options
object

Response

200
application/json
Successful Response
job_id
string
required
duration
number
required

The duration of the parse request in seconds.

usage
object
required
result
object
required

The response from the document processing service. There are two result types, Full Result and URL Result, because HTTPS responses have a maximum return size. If the result is too large to return inline, it is returned as a presigned URL in a URL Result instead. Your application should handle both cases.
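
One way to handle both cases is sketched below. It works on the decoded `result` object as a plain dict. The sample response on this page only shows `"type": "full"`; the `"url"` type value and the `url` field name used for the URL Result are assumptions here, so check them against the actual response. `resolve_result` and `fetch_json` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Callable, Dict


def fetch_json(url: str) -> Dict[str, Any]:
    """Download and decode a JSON document from a presigned URL."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def resolve_result(
    result: Dict[str, Any],
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
) -> Dict[str, Any]:
    """Return the full result, downloading it first if only a URL came back."""
    kind = result.get("type")
    if kind == "full":
        # Full Result: the chunks are already inline.
        return result
    if kind == "url":
        # Assumption: a URL Result is marked type == "url" and carries
        # the presigned link in a "url" field.
        return fetch(result["url"])
    raise ValueError(f"unexpected result type: {kind!r}")
```

The `fetch` parameter is there so the download step can be swapped out, for example with a stub in tests or a client that adds retries.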

pdf_url
string | null

The storage URL of the converted PDF file.