Tuesday, June 03, 2008

Fun with itertools

Sometimes it's hard to shake old habits, especially when you've burned them into your brain as the "standard" way to do things. For example, I've been doing network programming with C and C++ for a very long time. One of the standard pieces of code I've written again and again is the "connect with backoff" pattern.

If a program needs a continuous network connection, and that connection is lost, it should try to reconnect. On the one hand, you want to reconnect as quickly as possible; on the other hand, you don't want to keep retrying (and failing) in a tight loop. So you use a "backoff" timer: after each attempt, you wait longer (up to a maximum limit).

As a C programmer, I would implement an algorithm that resembles this Python-like pseudocode:


# After the first failure wait half a second before retrying;
# double this each time up to eight seconds.
backoff_times = [.5, 1, 2, 4, 8]
cur_backoff = None

while 1:
try:
# Try to connect
connect()
except ConnectionError:
# Failed; update the backoff counter
if cur_backoff is None:
cur_backoff = 0
else:
cur_backoff = min(len(backoff_times)-1, cur_backoff+1)
# Wait to retry
time.sleep(backoff_times[cur_backoff])
else:
# Success; reset the backoff timer
cur_backoff = None

But in Python the code to manage the current backoff timer looks out of place.

In a high level language, when the ratio of "code that says what I want" to "code that tells the language how to do what I want" gets too low, you're doing it wrong. It means that you're spending too many mental cycles on the "how," and not enough on the "what".

In this case, Python gives me a better way to tell it just "what" I want it to do: use an iterator.

import itertools

def iter_pegged(seq):
"""Return an iterator that walks the sequence, and then 'pegs' on the last item."""
return itertools.chain(seq, itertools.repeat(seq[-1]))

backoff_times = [.5, 1, 2, 4, 8]
cur_backoff = iter_pegged(backoff_times)

while 1:
try:
# Try to connect
connect()
except ConnectionError:
# Wait to retry
time.sleep(cur_backoff.next())
else:
# Success; reset the backoff timer
cur_backoff = iter_pegged(backoff_times)

Other than the definition of iter_pegged, each line of code says what only what it wants to do, not how it wants to do it.

And that's what coding in a high level language is all about, no?

3 comments:

Brodie Rao said...

That doesn't work on sequences that don't implement __getitem__. This is more generic:

>>> def iter_pegged(seq):
... for item in seq:
... yield item
... while True:
... yield item
...

The formatting's messed up, but you get the idea.

Panos Laganakos said...

Yep, a generator looks like a better choice.

Although I always use generators myself, not very familiar with itertools.

Tim Lesher said...

Thanks, brodie.

I knew iter_pegged() could also be implemented with a generator, but I didn't think about the __getitem__ dependency.