Retrieve Parse
import os

from reducto import Reducto

client = Reducto(
    api_key=os.environ.get("REDUCTO_API_KEY"),  # This is the default and can be omitted
)

job = client.job.get(
    "job_id",
)
print(job.status)
{
  "status": "Pending",
  "result": {
    "job_id": "<string>",
    "duration": 123,
    "pdf_url": "<string>",
    "usage": {
      "num_pages": 123
    },
    "result": {
      "type": "full",
      "chunks": [
        {
          "content": "<string>",
          "embed": "<string>",
          "enriched": "<string>",
          "enrichment_success": false,
          "blocks": [
            {
              "type": "Header",
              "bbox": {
                "left": 123,
                "top": 123,
                "width": 123,
                "height": 123,
                "page": 123,
                "original_page": 123
              },
              "content": "<string>",
              "image_url": "<string>"
            }
          ]
        }
      ],
      "ocr": {
        "words": [
          {
            "text": "<string>",
            "bbox": {
              "left": 123,
              "top": 123,
              "width": 123,
              "height": 123,
              "page": 123,
              "original_page": 123
            }
          }
        ],
        "lines": [
          {
            "text": "<string>",
            "bbox": {
              "left": 123,
              "top": 123,
              "width": 123,
              "height": 123,
              "page": 123,
              "original_page": 123
            }
          }
        ]
      },
      "custom": "<any>"
    }
  },
  "progress": 123,
  "reason": "<string>"
}
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Path Parameters
Response
Pending, Completed, Failed, Idle
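Because a job can report any of these statuses, a caller usually polls until the job finishes. The following is a minimal sketch, not the SDK's own mechanism. It assumes that Completed and Failed are the terminal statuses and that Pending and Idle mean "keep waiting"; that split is an assumption, not something this page states. `fetch_job` is a hypothetical callable that returns the job as a dict shaped like the JSON sample above.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these two statuses mean the job will not change any further.
TERMINAL_STATUSES = {"Completed", "Failed"}


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],  # hypothetical: returns the job as a dict
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout_s}s")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays. On a Failed job, the `reason` field from the response is the natural place to look for details.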
The duration of the parse request in seconds.
The response from the document processing service. It takes one of two forms, Full Result or URL Result, because HTTPS responses are limited in size. If the result is too large to return inline, it is returned as a presigned URL in the URL Result instead. Your application should handle both cases.
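One way to handle both forms is to branch on the `type` field. The sketch below works on the inner `result` object as a plain dict. It assumes the URL Result carries `"type": "url"` and a `"url"` field holding the presigned URL, and that the URL serves JSON; none of that is shown in the sample on this page, so verify it against a real response. The helper names are hypothetical.

```python
import json
from typing import Any, Callable, Dict
from urllib.request import urlopen


def _download_json(url: str) -> Any:
    """Fetch a URL and decode its body as JSON (assumes a JSON payload)."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def resolve_result(
    parse_result: Dict[str, Any],
    download: Callable[[str], Any] = _download_json,
) -> Any:
    """Return the inline result, or download it when only a URL was returned."""
    kind = parse_result.get("type")
    if kind == "full":
        # Full Result: the chunks are already inline.
        return parse_result
    if kind == "url":
        # Assumption: the URL Result exposes the presigned URL under "url".
        return download(parse_result["url"])
    raise ValueError(f"unexpected result type: {kind!r}")
```

The downloaded payload is returned as-is, since its exact shape is not described here. Presigned URLs typically expire, so it is safer to fetch the payload soon after retrieving the job.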
type = 'full'
full
The content of the chunk extracted from the document.
Chunk content optimized for embedding and retrieval.
The enriched content of the chunk extracted from the document.
The type of block extracted from the document.
Header, Footer, Title, Section Header, Page Number, List Item, Figure, Table, Key Value, Text, Comment, Discard
The bounding box of the block extracted from the document.
The content of the block extracted from the document.
(Experimental) The URL of the image associated with the block.
Whether the enrichment was successful.
The storage URL of the converted PDF file.
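As an illustration of how these fields nest, the sketch below collects every block of a given type from a Full Result. It operates on a plain dict shaped like the JSON sample on this page; `blocks_of_type` is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, List


def blocks_of_type(full_result: Dict[str, Any], block_type: str) -> List[Dict[str, Any]]:
    """Return all blocks whose "type" matches, across every chunk, in order."""
    return [
        block
        for chunk in full_result.get("chunks", [])
        for block in chunk.get("blocks", [])
        if block.get("type") == block_type
    ]


# Usage with a hand-written result in the documented shape.
sample = {
    "type": "full",
    "chunks": [
        {
            "content": "Quarterly report",
            "blocks": [
                {"type": "Title", "content": "Quarterly report"},
                {"type": "Table", "content": "Q1 | Q2"},
            ],
        },
        {
            "content": "Totals",
            "blocks": [{"type": "Table", "content": "Total | 42"}],
        },
    ],
}
tables = blocks_of_type(sample, "Table")
print([t["content"] for t in tables])
```

Each returned block still carries its `bbox`, so the same loop can be extended to locate a table on its page.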