Batch Processing with Retry
Retry-per-batch Approach
The chunker itertools.batched pairs very well with Tenacity's retry decorator for batch processing.
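For reference, itertools.batched (available since Python 3.12) yields successive tuples of at most n items; a minimal sketch:

```python
import itertools

# itertools.batched was added in Python 3.12; it yields tuples of up to n items,
# with the final tuple possibly shorter.
for chunk in itertools.batched(range(7), 3):
    print(chunk)
# (0, 1, 2)
# (3, 4, 5)
# (6,)
```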
Core Arguments in @retry
Tenacity's decorator is very flexible, but these are the arguments you'll use most often:
1. stop (when to give up)
- Controls how many times to retry, or for how long.
- Common helpers:
  - stop_after_attempt(n) → stop after n attempts.
  - stop_after_delay(seconds) → stop after total elapsed time.
- Example:
  stop=stop_after_attempt(3)
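A minimal sketch of a stop policy on a decorated function (flaky_call is just an illustrative name); Tenacity also lets you combine stop conditions with |, so you can cap both attempts and total time:

```python
from tenacity import retry, stop_after_attempt, stop_after_delay

# Give up after 3 attempts or after 10 seconds of total elapsed time,
# whichever comes first.
@retry(stop=stop_after_attempt(3) | stop_after_delay(10))
def flaky_call():
    ...
```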
2. wait (how long to sleep between attempts)
- Defines the wait strategy between retries.
- Common helpers:
  - wait_fixed(1) → always wait 1 second.
  - wait_random(min=1, max=3) → wait a random amount between 1 and 3 seconds.
  - wait_exponential(multiplier=1, max=60) → exponential backoff, capped at 60 seconds.
- Example:
  wait=wait_fixed(1)  # simple and predictable
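As a minimal sketch, wait strategies can also be summed, which is a common way to add jitter on top of exponential backoff (service_call is just a placeholder name):

```python
from tenacity import retry, stop_after_attempt, wait_exponential, wait_random

# Back off roughly 1s, 2s, 4s, ... capped at 30s, plus 0-2s of random jitter.
@retry(
    stop=stop_after_attempt(5),
    wait=wait_exponential(multiplier=1, max=30) + wait_random(0, 2),
)
def service_call():
    ...
```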
3. retry (when to retry)
- Predicate to decide whether to retry given an exception or result.
- Common helpers:
  - retry_if_exception_type(SomeException) → retry on that error.
  - retry_if_result(lambda r: r is None) → retry on "bad" results.
- Example:
  # will retry only when the function raises ConnectionError
  retry=retry_if_exception_type(ConnectionError)

  # will retry on multiple exceptions:
  retry=retry_if_exception_type((TimeoutError, ConnectionError))
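A minimal sketch of result-based retrying, assuming a hypothetical fetch_row helper that returns None on a transient miss:

```python
from tenacity import retry, stop_after_attempt, retry_if_result

# Retry up to 3 times whenever the call returns None instead of raising.
@retry(stop=stop_after_attempt(3), retry=retry_if_result(lambda r: r is None))
def fetch_row(key):
    ...  # hypothetical lookup; returns a row, or None on a transient miss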
4. reraise
- Controls what exception you see after retries are exhausted.
- reraise=False (default):
  - Tenacity raises a RetryError.
  - The RetryError wraps the last attempt's exception.
  - You can inspect details via .last_attempt.

  try:
      flaky()
  except RetryError as e:
      print("wrapped:", e.last_attempt.exception())

- reraise=True:
  - Tenacity directly re-raises the last real exception (e.g., ConnectionError).
  - Makes the retry logic transparent to callers.

  try:
      flaky2()
  except ConnectionError:
      print("saw real error after retries")

reraise=True is usually nicer for library-style code where the caller shouldn't care that Tenacity was used. reraise=False is useful if you want retry metadata (e.g., logging or metrics on how many attempts were made).
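The flaky / flaky2 snippets above are only fragments; a self-contained sketch (with always-failing bodies purely to force retry exhaustion) might look like this:

```python
from tenacity import (
    RetryError, retry, retry_if_exception_type, stop_after_attempt, wait_fixed
)

@retry(stop=stop_after_attempt(3), wait=wait_fixed(0.1),
       retry=retry_if_exception_type(ConnectionError))
def flaky():
    raise ConnectionError("service unavailable")  # always fails, for illustration

@retry(stop=stop_after_attempt(3), wait=wait_fixed(0.1),
       retry=retry_if_exception_type(ConnectionError), reraise=True)
def flaky2():
    raise ConnectionError("service unavailable")

try:
    flaky()
except RetryError as e:            # reraise=False (default): wrapped in RetryError
    print("wrapped:", e.last_attempt.exception())

try:
    flaky2()
except ConnectionError:            # reraise=True: the real exception surfaces
    print("saw real error after retries")
```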
5. before / after / before_sleep
- Hooks for logging, metrics, or side effects.
- Example:
  import logging
  from tenacity import before_sleep_log

  log = logging.getLogger(__name__)
  before_sleep=before_sleep_log(log, logging.WARNING)
Putting It Together (Batch Example with itertools.batched)
import itertools
import logging

from tenacity import (
    retry, stop_after_attempt, wait_fixed,
    retry_if_exception_type, before_sleep_log
)

# --- set up built-in logger ---
logging.basicConfig(
    level=logging.INFO,  # or DEBUG if you want more detail
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
)
log = logging.getLogger(__name__)

# retry policy: up to 3 times, 1s apart, on ConnectionError
@retry(
    stop=stop_after_attempt(3),
    wait=wait_fixed(1),
    retry=retry_if_exception_type(ConnectionError),
    reraise=True,
    before_sleep=before_sleep_log(log, logging.WARNING),  # <-- logs retries
)
def process_batch(batch):
    log.info("Processing batch with %d UUIDs", len(batch))
    # run your Hive query here
    return run_query(batch)

def process_all(uuids, batch_size=1000):
    for batch in itertools.batched(uuids, batch_size):
        try:
            yield process_batch(batch)
        except Exception as e:
            log.error("Batch failed after retries: %s", e)
            # re-raise if you want the whole job to stop
            raise
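A usage sketch for driving process_all (uuids and run_query are placeholders here; the example above deliberately leaves the Hive query undefined):

```python
# Hypothetical driver code; swap the stubs for your real UUID source and Hive query.
def run_query(batch):
    ...  # e.g. SELECT ... WHERE uuid IN (<batch>)

uuids = [f"uuid-{i}" for i in range(5000)]

for result in process_all(uuids, batch_size=1000):
    ...  # consume each batch's result as it arrives
```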
What happens now:
- On each retryable failure, Tenacity logs a warning:
  2025-09-22 15:45:12 [WARNING] __main__: RetryError caught ... Retrying in 1 seconds ...
- On successful processing, you'll see:
  2025-09-22 15:45:13 [INFO] __main__: Processing batch with 1000 UUIDs
- On final failure (all retries exhausted), you'll see:
  2025-09-22 15:45:16 [ERROR] __main__: Batch failed after retries: ConnectionError('...')
That gives you clear observability without extra boilerplate.
In summary, for batch processing:
- stop + wait → control attempts and pacing.
- retry → ensure you only retry on transient errors.
- reraise=True → the caller just sees the "real" Hive error if all retries fail.