Split Async

import requests

url = "https://platform.reducto.ai/split_async"

payload = {
    "split_description": [
        {
            "name": "<string>",
            "description": "<string>",
            "partition_key": "<string>"
        }
    ],
    "document_url": "<string>",
    "options": {
        "ocr_mode": "standard",
        "extraction_mode": "ocr",
        "chunking": { "chunk_mode": "variable" },
        "table_summary": { "enabled": False },
        "figure_summary": {
            "enabled": False,
            "enhanced": False,
            "override": False
        },
        "filter_blocks": [],
        "force_url_result": False
    },
    "advanced_options": {
        "ocr_system": "highres",
        "table_output_format": "html",
        "merge_tables": False,
        "include_formula_information": False,
        "include_color_information": False,
        "continue_hierarchy": True,
        "keep_line_breaks": False,
        "page_range": {},
        "large_table_chunking": {
            "enabled": True,
            "size": 50
        },
        "spreadsheet_table_clustering": "default",
        "add_page_markers": False,
        "remove_text_formatting": False,
        "return_ocr_data": False,
        "filter_line_numbers": False,
        "read_comments": False,
        "persist_results": False,
        "exclude_hidden_sheets": False,
        "exclude_hidden_rows_cols": False,
        "enable_change_tracking": False,
        "enable_highlight_detection": False
    },
    "experimental_options": {
        "enrich": {
            "enabled": False,
            "mode": "standard"
        },
        "layout_enrichment": False,
        "native_office_conversion": False,
        "enable_checkboxes": False,
        "enable_equations": False,
        "rotate_pages": True,
        "rotate_figures": False,
        "enable_scripts": False,
        "return_figure_images": False,
        "return_table_images": False,
        "layout_model": "default",
        "embed_text_metadata_pdf": False,
        "detect_signatures": False,
        "danger_filter_wide_boxes": False
    },
    "split_rules": "Split the document into the applicable sections. Sections may only overlap at their first and last page if at all.",
    "priority": False,
    "split_options": { "table_cutoff": "truncate" },
    "webhook": {
        "mode": "disabled",
        "channels": []
    }
}
headers = {
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json"
}

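# Submit the job. The timeout keeps the client from hanging on a stalled connection.
response = requests.post(url, json=payload, headers=headers, timeout=30)

# On success the body is a JSON object containing the job_id.
print(response.text)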

{
  "job_id": "<string>"
}

POST /split_async

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
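As a minimal sketch of the header format described above, the helper below builds the request headers from a token. The function name `build_headers` and the placeholder token are illustrative assumptions, not part of the API.

```python
def build_headers(token: str) -> dict:
    """Return headers for a JSON request authenticated with a bearer token."""
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        # Header form taken from the description above: "Bearer <token>".
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# Placeholder value for illustration only, not a real credential.
headers = build_headers("my-secret-token")
print(headers["Authorization"])
```

In practice the token would usually be read from an environment variable or a secrets store rather than written into source code.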

Body

application/json

split_description

SplitCategory · object[]

required

The categories to split the document into. Each entry is a SplitCategory object with name, description, and partition_key fields.
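One way to assemble the split_description list is sketched below. The field names come from the request sample; the `SplitCategory` dataclass, the category names, and the choice to treat partition_key as optional are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SplitCategory:
    """Client-side stand-in for one entry of split_description (hypothetical helper)."""
    name: str
    description: str
    partition_key: Optional[str] = None  # assumption: omitted from the payload when unset

    def to_dict(self) -> dict:
        if not self.name or not self.description:
            raise ValueError("name and description must be non-empty")
        item = {"name": self.name, "description": self.description}
        if self.partition_key is not None:
            item["partition_key"] = self.partition_key
        return item


# Hypothetical categories, chosen only to show the shape of the list.
categories = [
    SplitCategory("invoice", "Pages that list billed items and totals"),
    SplitCategory("contract", "Pages with legal terms", partition_key="contract number"),
]
split_description = [c.to_dict() for c in categories]
print(split_description)
```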

document_url

required

The URL of the document to be processed. You can provide one of the following:

A publicly available URL
A presigned S3 URL
A reducto:// prefixed URL obtained from the /upload endpoint after directly uploading a document
A job_id (jobid://) or a list of job_ids (jobid://) returned by a previous call to the /parse endpoint
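The accepted forms listed above can be told apart by their URL scheme. The following is a rough client-side check, not the service's own validation. The function name, the returned labels, and the rule that a list may contain only jobid:// references are assumptions drawn from the list above.

```python
from urllib.parse import urlparse


def classify_document_url(value):
    """Label a document_url value by the form it appears to take (hypothetical helper)."""
    if isinstance(value, list):
        # Assumption: lists are only meaningful for jobid:// references.
        if not value or any(classify_document_url(v) != "job" for v in value):
            raise ValueError("a list must contain only jobid:// references")
        return "job_list"
    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        # Public URLs and presigned S3 URLs look the same at this level.
        return "url"
    if scheme == "reducto":
        return "upload"
    if scheme == "jobid":
        return "job"
    raise ValueError(f"unsupported document_url: {value!r}")


# The identifiers below are made up for the example.
print(classify_document_url("https://example.com/report.pdf"))
print(classify_document_url("reducto://example-file-id"))
print(classify_document_url(["jobid://example-job-1", "jobid://example-job-2"]))
```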

options

BaseProcessingOptions · object

advanced_options

AdvancedProcessingOptions · object

experimental_options

ExperimentalProcessingOptions · object

split_rules

string

default:Split the document into the applicable sections. Sections may only overlap at their first and last page if at all.

The prompt that describes rules for splitting the document.

priority

boolean

default:false

If true, the job is processed with priority, provided the user has priority processing budget available. By default, sync jobs are prioritized above async jobs.
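Only split_description and document_url are marked required, so a request can presumably leave out the other fields and rely on their documented defaults. The sketch below builds such a minimal body. The function name `build_split_request`, the sample values, and the client-side rejection of an empty category list are assumptions.

```python
def build_split_request(document_url, split_description, split_rules=None, priority=False):
    """Build a request body containing the required fields plus any explicit overrides."""
    if not split_description:
        # Assumption: an empty category list is treated as a client-side error.
        raise ValueError("split_description must contain at least one category")
    payload = {
        "document_url": document_url,
        "split_description": split_description,
    }
    # Optional fields are sent only when they differ from the documented defaults.
    if split_rules is not None:
        payload["split_rules"] = split_rules
    if priority:
        payload["priority"] = True
    return payload


# Sample values are placeholders for illustration.
payload = build_split_request(
    "https://example.com/report.pdf",
    [{"name": "summary", "description": "Executive summary pages"}],
    priority=True,
)
print(sorted(payload))
```

Keeping the body small makes it easier to see which settings a given job actually overrides.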

split_options

SplitOptions · object

webhook

WebhookConfigNew · object

Response

Successful Response

job_id

string

required
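Because the successful response carries only a job_id, a caller will typically want to extract and keep it for later lookups. A small sketch of that step follows. The helper name `extract_job_id` and the sample body are illustrative assumptions.

```python
import json


def extract_job_id(body: str) -> str:
    """Parse a response body and return its job_id, or raise if it is missing."""
    data = json.loads(body)
    job_id = data.get("job_id") if isinstance(data, dict) else None
    if not isinstance(job_id, str) or not job_id:
        raise ValueError("response does not contain a job_id string")
    return job_id


# Made-up body matching the documented response shape.
print(extract_job_id('{"job_id": "example-job-id"}'))
```

With the `requests` call from the sample at the top of the page, this would be applied to `response.text`.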
