We automatically switch to returning a result with type=url when the response size is close to exceeding the maximum allowed HTTP response size (6 MB on our current infrastructure). If you want your processing code to handle only one case instead of two, we recommend forcing the URL result by passing the force_url_result config parameter. Forcing the full inline result is not possible because of the maximum response payload limit. The URL links to a JSON object that contains the full results you should expect!
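For reference, the two cases look roughly like this. This is a sketch inferred from the fields used in the examples below (type, url, result, job_id, duration), not the authoritative schema; in particular, the type label for the inline case is an assumption:
# Sketch of the two response shapes, inferred from the examples below.
# Treat field names and placeholder values as illustrative, not a schema guarantee.

url_response_shape = {
    "job_id": "<job id>",
    "duration": 1.23,
    "result": {
        "type": "url",
        "url": "<signed URL>",  # fetch this to get the full JSON results
    },
}

inline_response_shape = {
    "job_id": "<job id>",
    "duration": 1.23,
    "result": {
        "type": "full",  # assumed label; the examples below only check for "url"
        "result": {},    # the complete results, embedded directly
    },
}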

Handling URL Responses

Using the Python SDK Helper

The Reducto Python SDK provides a convenient helper function to automatically handle URL responses:
from reducto import Reducto
from reducto.lib.helpers import handle_url_response

client = Reducto(api_key="your_api_key")

# Parse a document
response = client.document.parse(
    file=open("document.pdf", "rb"),
    config={"force_url_result": True}  # Optional: force URL response
)

# Use the helper to get the full result regardless of response type
full_response = handle_url_response(response)

# Now you can access the full result data
print(full_response.result)
print(f"Job ID: {full_response.job_id}")
print(f"Duration: {full_response.duration}s")
The handle_url_response helper automatically:
  • Checks if the response is a URL type
  • Fetches the content from the URL if needed
  • Returns a consistent FullParseResponse object with the complete results
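Under the hood, that behavior amounts to the same branching shown in the direct API example below. As a rough sketch only (not the SDK's actual implementation, which returns a FullParseResponse object rather than a plain dict):
import requests

def resolve_result(parse_result: dict, timeout: float = 30.0) -> dict:
    # Sketch of the helper's described behavior, applied to a raw JSON dict.
    result = parse_result["result"]
    if result.get("type") == "url":
        # URL response: download the full results from the returned URL.
        resp = requests.get(result["url"], timeout=timeout)
        resp.raise_for_status()
        return resp.json()
    # Inline response: the full results are already embedded.
    return result["result"]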

Direct API Usage with Requests

If you’re calling the API directly without the SDK, you can handle URL responses like this:
import requests
import json

# Make your API request
response = requests.post(
    "https://api.reducto.ai/v1/parse",
    headers={"Authorization": "Bearer your_api_key"},
    files={"file": open("document.pdf", "rb")},
    data={"config": json.dumps({"force_url_result": True})}
)

parse_result = response.json()

# Check if we got a URL response
if parse_result["result"]["type"] == "url":
    # Fetch the actual results from the URL
    url_response = requests.get(parse_result["result"]["url"])
    full_result = url_response.json()
    
    # Use the full result
    print(full_result)
else:
    # Direct result - use as-is
    full_result = parse_result["result"]["result"]
    print(full_result)

print(f"Job ID: {parse_result['job_id']}")
print(f"Duration: {parse_result['duration']}s")

Best Practices

  • Consistent handling: Use force_url_result: true in your config to always get URL responses, making your code more predictable
  • Error handling: Always check for HTTP errors when fetching results from the URL (see the sketch after this list)
  • Timeout handling: Set appropriate timeouts when fetching URL content
  • URL expiration: The URLs have an expiration time, so fetch the content promptly after receiving the response
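Putting the last three points together, here is a hedged sketch of fetching URL results defensively. The 30-second timeout and single retry are arbitrary illustrative choices, not values required by the API:
import requests

def fetch_url_result(url: str, timeout: float = 30.0, retries: int = 1) -> dict:
    # Fetch promptly after receiving the parse response, since the URL expires.
    last_error = None
    for attempt in range(retries + 1):
        try:
            resp = requests.get(url, timeout=timeout)
            resp.raise_for_status()  # surface 4xx/5xx, e.g. from an expired URL
            return resp.json()
        except requests.RequestException as exc:
            last_error = exc
    raise RuntimeError(f"Could not fetch URL result after {retries + 1} attempts") from last_error
You would call this right after checking parse_result["result"]["type"] == "url", as in the direct API example above.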