import requests

# Replace <job_id> and <token> with your own job ID and API token.
job_id = "<job_id>"
url = f"https://platform.reducto.ai/job/{job_id}"
headers = {"Authorization": "Bearer <token>"}

response = requests.get(url, headers=headers)
print(response.json())

{
  "status": "Pending",
  "result": {
    "job_id": "<string>",
    "duration": 123,
    "usage": {
      "num_pages": 123,
      "credits": 123
    },
    "result": {
      "type": "<string>",
      "chunks": [
        {
          "content": "<string>",
          "embed": "<string>",
          "enriched": "<string>",
          "blocks": [
            {
              "type": "Header",
              "bbox": {
                "left": 123,
                "top": 123,
                "width": 123,
                "height": 123,
                "page": 123,
                "original_page": 123
              },
              "content": "<string>",
              "image_url": "<string>",
              "confidence": "low",
              "granular_confidence": {
                "extract_confidence": 123,
                "parse_confidence": 123
              }
            }
          ],
          "enrichment_success": false
        }
      ],
      "ocr": {
        "words": [
          {
            "text": "<string>",
            "bbox": {
              "left": 123,
              "top": 123,
              "width": 123,
              "height": 123,
              "page": 123,
              "original_page": 123
            },
            "confidence": 123,
            "chunk_index": 123
          }
        ],
        "lines": [
          {
            "text": "<string>",
            "bbox": {
              "left": 123,
              "top": 123,
              "width": 123,
              "height": 123,
              "page": 123,
              "original_page": 123
            },
            "confidence": 123,
            "chunk_index": 123
          }
        ]
      },
      "custom": "<unknown>"
    },
    "pdf_url": "<string>",
    "studio_link": "<string>"
  },
  "progress": 123,
  "reason": "<string>"
}

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Successful Response
status: One of Pending, Completed, Failed, Idle.
result.duration: The duration of the parse request in seconds.
result.result: The response from the document processing service. There are two types of response, Full Result and URL Result, because of limits on the maximum HTTPS response size. If the response is too large, it is returned as a presigned URL in the URL Result. Your application should handle both cases.
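One way to handle both response types is sketched below. This page only documents type = 'full'; the "url" type value and the url field name used for a URL Result are assumptions, so check them against the API reference before relying on them. The helper names resolve_result and _download_json are hypothetical, and the download uses the standard library so the sketch has no extra dependencies.

```python
import json
from urllib.request import urlopen


def _download_json(url):
    """Download and decode a JSON document from a presigned URL."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def resolve_result(result, fetch=None):
    """Return the full result payload, downloading it when only a URL is given.

    ASSUMPTION: a URL Result looks like {"type": "url", "url": "<presigned URL>"}.
    Only type == "full" is confirmed by this page.
    """
    result_type = result.get("type")
    if result_type == "full":
        # Chunks and OCR data are already inline.
        return result
    if result_type == "url":
        # `fetch` is injectable so the download step can be swapped out in tests.
        fetch = fetch or _download_json
        return fetch(result["url"])
    raise ValueError(f"Unexpected result type: {result_type!r}")
```

Given the decoded job response as job, a call such as resolve_result(job["result"]["result"]) would then return the same shape whichever variant the service sent.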
result.result.type: "full" for a Full Result.
result.result.chunks[].content: The content of the chunk extracted from the document.
result.result.chunks[].embed: Chunk content optimized for embedding and retrieval.
result.result.chunks[].enriched: The enriched content of the chunk extracted from the document.
result.result.chunks[].blocks[].type: The type of block extracted from the document. One of Header, Footer, Title, Section Header, Page Number, List Item, Figure, Table, Key Value, Text, Comment, Signature.
result.result.chunks[].blocks[].bbox: The bounding box of the block extracted from the document.
result.result.chunks[].blocks[].bbox.page: The page number of the bounding box (1-indexed).
result.result.chunks[].blocks[].bbox.original_page: The page number in the original document of the bounding box (1-indexed).
result.result.chunks[].blocks[].content: The content of the block extracted from the document.
result.result.chunks[].blocks[].image_url: (Experimental) The URL of the image associated with the block.
result.result.chunks[].blocks[].confidence: The confidence for the block. It is either low or high and takes into account factors like OCR and table structure.
result.result.chunks[].blocks[].granular_confidence: A dictionary of granular confidence scores for the block (extract_confidence, parse_confidence). The scores will not be None if the user has enabled numeric confidence scores.
result.result.chunks[].enrichment_success: Whether the enrichment was successful.
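The chunk and block attributes above can be walked with plain dictionary access. The following is a minimal sketch, assuming a Full Result dictionary shaped like the example response; the function names low_confidence_blocks and embedding_texts are hypothetical, and falling back from embed to content when embed is empty is an illustrative choice, not documented behavior.

```python
def low_confidence_blocks(full_result):
    """Yield (page, block_type, content) for blocks whose confidence is "low"."""
    for chunk in full_result.get("chunks", []):
        for block in chunk.get("blocks", []):
            if block.get("confidence") == "low":
                bbox = block.get("bbox", {})
                yield bbox.get("page"), block.get("type"), block.get("content")


def embedding_texts(full_result):
    """Return one text per chunk, preferring `embed` and falling back to `content`."""
    return [
        chunk.get("embed") or chunk.get("content", "")
        for chunk in full_result.get("chunks", [])
    ]


# Hand-written sample data for illustration only, not real API output.
sample = {
    "type": "full",
    "chunks": [
        {
            "content": "raw text",
            "embed": "embedding text",
            "blocks": [
                {"type": "Table", "bbox": {"page": 2}, "content": "a|b", "confidence": "low"},
                {"type": "Text", "bbox": {"page": 2}, "content": "ok", "confidence": "high"},
            ],
        }
    ],
}

for page, block_type, content in low_confidence_blocks(sample):
    print(f"Review page {page}: {block_type} block {content!r}")
```

Flagging low-confidence blocks by page like this is one way to route specific regions of a document to manual review.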
result.result.ocr.words[].bbox.page: The page number of the bounding box (1-indexed).
result.result.ocr.words[].bbox.original_page: The page number in the original document of the bounding box (1-indexed).
result.result.ocr.words[].confidence: OCR confidence score between 0 and 1, where 1 indicates the highest confidence.
result.result.ocr.words[].chunk_index: The index of the chunk that the word belongs to.
result.result.ocr.lines[].bbox.page: The page number of the bounding box (1-indexed).
result.result.ocr.lines[].bbox.original_page: The page number in the original document of the bounding box (1-indexed).
result.result.ocr.lines[].confidence: OCR confidence score between 0 and 1, where 1 indicates the highest confidence.
result.result.ocr.lines[].chunk_index: The index of the chunk that the line belongs to.
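Because each OCR word carries a confidence between 0 and 1 and a chunk_index, the words can be filtered and grouped back onto their chunks. The sketch below assumes an ocr dictionary shaped like the example response; the function name words_by_chunk and the 0.5 threshold are illustrative choices, not values from the API.

```python
from collections import defaultdict


def words_by_chunk(ocr, min_confidence=0.5):
    """Group OCR word texts by chunk_index, skipping words below min_confidence."""
    grouped = defaultdict(list)
    for word in ocr.get("words", []):
        # Words without a confidence value are treated as 0.0 and filtered out
        # unless min_confidence is 0.
        if word.get("confidence", 0.0) >= min_confidence:
            grouped[word.get("chunk_index")].append(word.get("text", ""))
    return dict(grouped)
```

The same loop works for ocr["lines"], since lines expose the same text, confidence, and chunk_index fields.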
result.pdf_url: The storage URL of the converted PDF file.
result.studio_link: The link to the studio pipeline for the document.
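Since status can be Pending, Completed, Failed, or Idle, a client typically calls this endpoint repeatedly until the job finishes. The following is a minimal polling sketch under stated assumptions: wait_for_job is a hypothetical helper, the interval and attempt budget are arbitrary, Idle is treated like Pending (not yet finished), and reason is assumed to carry the failure message.

```python
import time


def wait_for_job(get_job, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll get_job() until the job status is Completed or Failed.

    get_job: zero-argument callable returning the decoded JSON body of
             GET /job/{job_id}.
    sleep:   injectable so the waiting step can be skipped in tests.
    """
    for _ in range(max_attempts):
        job = get_job()
        status = job.get("status")
        if status == "Completed":
            return job["result"]
        if status == "Failed":
            # ASSUMPTION: `reason` holds the failure message.
            raise RuntimeError(f"Job failed: {job.get('reason')}")
        # Pending or Idle (assumed to mean not finished): wait and try again.
        sleep(interval)
    raise TimeoutError("Job did not finish within the polling budget")
```

With the url and headers from the request example, a call could look like wait_for_job(lambda: requests.get(url, headers=headers).json()).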