Batch Processing with Retry

Retry-per-batch Approach

Python’s itertools.batched chunker (added in 3.12) pairs very well with Tenacity’s @retry decorator for batch processing: split the work into fixed-size batches, then retry each batch independently.
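
For reference, itertools.batched splits any iterable into tuples of at most n items (the last batch may be shorter), which is exactly the chunking we need before retrying each chunk:

    import itertools

    uuids = [f"id-{i}" for i in range(7)]
    for batch in itertools.batched(uuids, 3):
        print(batch)
    # ('id-0', 'id-1', 'id-2')
    # ('id-3', 'id-4', 'id-5')
    # ('id-6',)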

Core Arguments in @retry

Tenacity’s @retry decorator is very flexible; these are the arguments you’ll use most often:

1. stop (when to give up)

  • Controls how many times to retry, or for how long.
  • Common helpers:

    • stop_after_attempt(n) → stop after n attempts.
    • stop_after_delay(seconds) → stop after total elapsed time.
  • Example:

    stop=stop_after_attempt(3)
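
  • Stop conditions can also be combined with |, stopping at whichever triggers first. A minimal sketch (fetch_page is just an illustrative name):

    from tenacity import retry, stop_after_attempt, stop_after_delay

    @retry(stop=(stop_after_delay(10) | stop_after_attempt(5)))
    def fetch_page():
        ...  # give up after 10 s total or 5 attempts, whichever comes first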
    

2. wait (how long to sleep between attempts)

  • Defines the wait strategy between retries.
  • Common helpers:

    • wait_fixed(1) → always wait 1 second.
    • wait_random(min=1, max=3) → wait a random amount between 1 and 3 seconds.
    • wait_exponential(multiplier=1, max=60) → exponential backoff, capped at 60 seconds.
  • Example:

    wait=wait_fixed(1)  # simple and predictable
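
  • Wait strategies can be added together to introduce jitter, e.g. a fixed base plus a random component (a pattern from the Tenacity docs; call_api is just an illustrative name):

    from tenacity import retry, wait_fixed, wait_random

    @retry(wait=wait_fixed(3) + wait_random(0, 2))
    def call_api():
        ...  # wait 3 s plus 0-2 s of random jitter between attempts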
    

3. retry (when to retry)

  • A predicate that decides whether to retry, based on the raised exception or the returned result.
  • Common helpers:

    • retry_if_exception_type(SomeException) → retry on that error.
    • retry_if_result(lambda r: r is None) → retry on “bad” results.
  • Example:

    # will retry only when the function raises ConnectionError
    retry=retry_if_exception_type(ConnectionError)
      
    # will retry on multiple exceptions:
    retry=retry_if_exception_type((TimeoutError, ConnectionError)) 
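
  • Result-based retries work the same way. A small sketch, assuming a hypothetical fetch_status() that returns None on a transient miss:

    from tenacity import retry, retry_if_result, stop_after_attempt

    @retry(
        retry=retry_if_result(lambda r: r is None),  # keep retrying while the result is "bad"
        stop=stop_after_attempt(3),
    )
    def fetch_status():
        ...  # returns None on a transient miss, a real value otherwise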
    

4. reraise

This controls what exception you see after retries are exhausted.

  • reraise=False (default):

    • Tenacity raises a RetryError.
    • The RetryError wraps the last attempt’s exception.
    • You can inspect details via .last_attempt.
    # flaky() is a @retry-decorated function using the default reraise=False
    # (defined in the full sketch after this list)
    try:
        flaky()
    except RetryError as e:
        print("wrapped:", e.last_attempt.exception())
    
  • reraise=True:

    • Tenacity directly re-raises the last real exception (e.g., ConnectionError).
    • Makes the retry logic transparent to callers.
    # flaky2() is the same function decorated with reraise=True
    # (defined in the full sketch after this list)
    try:
        flaky2()
    except ConnectionError:
        print("saw real error after retries")
    
  • reraise=True is usually nicer for library-style code where the caller shouldn’t care that Tenacity was used.
  • reraise=False is useful if you want retry metadata (e.g., logging or metrics on how many attempts were made).
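  • A self-contained sketch of the difference (the flaky names and error message are just illustrative):

    from tenacity import RetryError, retry, stop_after_attempt

    @retry(stop=stop_after_attempt(2))  # default reraise=False
    def flaky():
        raise ConnectionError("db down")

    @retry(stop=stop_after_attempt(2), reraise=True)
    def flaky2():
        raise ConnectionError("db down")

    try:
        flaky()
    except RetryError as e:
        print("wrapped:", e.last_attempt.exception())  # the real error is inside

    try:
        flaky2()
    except ConnectionError as e:
        print("real:", e)  # no wrapper to unpack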

5. before / after / before_sleep

  • Hooks that run before each attempt (before), after a failed attempt (after), and just before sleeping between retries (before_sleep); useful for logging, metrics, or side effects.
  • Example:

    from tenacity import before_sleep_log
    import logging
      
    log = logging.getLogger(__name__)
    
    before_sleep=before_sleep_log(log, logging.WARNING)
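
  • These hooks receive a single retry_state argument, so you can also pass your own callable. A sketch of a custom before_sleep callback, using the same retry_state attributes that before_sleep_log relies on:

    import logging

    log = logging.getLogger(__name__)

    def log_retry(retry_state):
        # retry_state.outcome holds the failed attempt; next_action.sleep is the upcoming wait
        log.warning(
            "attempt %d failed with %r, sleeping %.1fs",
            retry_state.attempt_number,
            retry_state.outcome.exception(),
            retry_state.next_action.sleep,
        )

    # pass it as @retry(..., before_sleep=log_retry)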
    

Putting It Together (Batch Example with itertools.batched)

import itertools
import logging
from tenacity import (
    retry, stop_after_attempt, wait_fixed,
    retry_if_exception_type, before_sleep_log
)

# --- set up built-in logger ---
logging.basicConfig(
    level=logging.INFO,  # or DEBUG if you want more detail
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
)
log = logging.getLogger(__name__)

# retry policy: up to 3 times, 1s apart, on ConnectionError
@retry(
    stop=stop_after_attempt(3),
    wait=wait_fixed(1),
    retry=retry_if_exception_type(ConnectionError),
    reraise=True,
    before_sleep=before_sleep_log(log, logging.WARNING),  # <-- logs retries
)
def process_batch(batch):
    log.info("Processing batch with %d UUIDs", len(batch))
    # run your Hive query here (run_query is your own helper, defined elsewhere)
    return run_query(batch)

def process_all(uuids, batch_size=1000):
    # itertools.batched yields tuples of up to batch_size items (the last may be shorter)
    for batch in itertools.batched(uuids, batch_size):
        try:
            yield process_batch(batch)
        except Exception as e:
            log.error("Batch failed after retries: %r", e)
            # re-raise if you want the whole job to stop
            raise
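
Consuming it is just iterating the generator (load_uuids and handle_result are hypothetical placeholders for your own code):

    uuids = load_uuids()  # hypothetical: fetch the ID list
    for rows in process_all(uuids, batch_size=1000):
        handle_result(rows)  # hypothetical: do something with each batch's rows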

What happens now:

  • On each retryable failure, Tenacity logs a warning:

    2025-09-22 15:45:12 [WARNING] __main__: Retrying __main__.process_batch in 1.0 seconds as it raised ConnectionError: ...
    
  • At the start of each batch attempt, you’ll see:

    2025-09-22 15:45:13 [INFO] __main__: Processing batch with 1000 UUIDs
    
  • On final failure (all retries exhausted), you’ll see:

    2025-09-22 15:45:16 [ERROR] __main__: Batch failed after retries: ConnectionError('...')
    

That gives you clear observability without extra boilerplate.

In summary, for batch processing:

  • stop + wait → control attempts and pacing.
  • retry → ensure you only retry on transient errors.
  • reraise=True → caller just sees the “real” Hive error if all retries fail.