Retry Mechanism¶

structx includes a robust retry mechanism with exponential backoff to handle transient errors when communicating with LLM providers.

Basic Usage¶

The retry mechanism is enabled by default with sensible defaults:

extractor = Extractor.from_litellm(
    model="gpt-4o",
    api_key="your-api-key"
    # Default retry settings:
    # max_retries=3, min_wait=1, max_wait=10
)

Customizing Retry Behavior¶

You can customize the retry behavior when initializing the extractor:

extractor = Extractor.from_litellm(
    model="gpt-4o",
    api_key="your-api-key",
    max_retries=5,      # Maximum number of retry attempts
    min_wait=2,         # Minimum seconds to wait between retries
    max_wait=30         # Maximum seconds to wait between retries
)

How It Works¶

The retry mechanism uses exponential backoff, which means:

First retry: Wait min_wait seconds
Second retry: Wait twice as long
Subsequent retries: Continue doubling the wait time up to max_wait

This approach helps prevent overwhelming the API during temporary outages and gives the service time to recover.

Retry Flow¶

View Retry Flow Diagram

graph TD
    A[LLM Request] --> B{Success?}
    B -->|Yes| C[Return Result]
    B -->|No| D{Retryable Error?}

    D -->|No| E[Raise Exception]
    D -->|Yes| F{Max Retries Reached?}

    F -->|Yes| G[Raise Final Exception]
    F -->|No| H[Calculate Wait Time]

    H --> I[Wait]
    I --> J[Increment Retry Count]
    J --> A

    subgraph "Wait Time Calculation"
        K["Base Wait = min_wait * 2^retry_count"]
        L["Actual Wait = min of Base Wait and max_wait"]
        M["Add Jitter plus or minus 10%"]
    end

    H --> K
    K --> L
    L --> M

    subgraph "Retryable Errors"
        N[Network Timeouts]
        O[Rate Limiting]
        P[Server Errors 5xx]
        Q[Connection Errors]
    end

    D --> N

Retry-Eligible Errors¶

The retry mechanism automatically handles:

Network timeouts
Rate limiting errors
Temporary server errors
Connection issues

Critical errors like authentication failures or invalid inputs are not retried as they require manual intervention.

Monitoring Retries¶

You can monitor retry attempts through the logs:

import logging
from loguru import logger

# Configure more verbose logging
logger.remove()
logger.add(sys.stderr, level="DEBUG")

# Now extraction calls will show retry attempts in the logs
result = extractor.extract(data=data, query=query)

Retry with Async Operations¶

The retry mechanism works seamlessly with async operations:

import asyncio

async def extract_with_retry():
    result = await extractor.extract_async(
        data="document.pdf",
        query="extract key information"
    )
    return result

# The same retry settings apply to async operations
result = asyncio.run(extract_with_retry())

Next Steps¶

Check out Token Usage Tracking to monitor resource consumption
Learn about Async Operations for better performance
Explore Error Handling for more details on handling errors
See the Configuration Options for all available settings