Monday, 15 October 2012

Python Discovery: Named Tuples

I had known of the named tuple idiom for ages. It is used sparingly in some Python APIs, and I find it Pythonic, practical, and simple to read and code with.

It's used like this (this example uses the timetuple from datetime.datetime, which is a time.struct_time; that is not strictly a namedtuple, but it behaves a lot like one):

>>> import datetime
>>> tt = datetime.datetime.now().timetuple()
>>> tt
time.struct_time(tm_year=2012, tm_mon=10, tm_mday=15, tm_hour=20, ...)
>>> tt.tm_year
2012
>>> tt[0]
2012
>>> tt[3]
20
>>> tt.tm_mon
10
>>> 

And it's iterable.

>>> list(tt)
[2012, 10, 15, 20, 24, 8, 0, 289, -1]

It's basically a tuple which has attributes accessible through dot notation.

The named tuple idiom is perfect for API design, and for those times when a class is what you want but a tuple would be okay too. Take 2D dimensions, for example: you could store them in a class with height and width attributes, or in a plain tuple.

I thought it was just some pythonic pattern, but today I needed to automate it because I found myself making dumb __init__ methods like these:

def __init__(self, left, right):
    self.left = left
    self.right = right

It is a useful pattern. It's Pythonic. It's implementable in pure Python with just a little metaprogramming. It could be in the standard library, I thought.

I opened up my Python interpreter and entered import collections and dir(collections), looking for NamedTuple or something similar. namedtuple was there. It wasn't just a coincidence.

collections.namedtuple is a factory for creating named tuple types, which you can then use. The resulting classes include the methods _asdict and _replace, the _fields attribute, and the usual tuple methods such as index.

>>> import collections
>>> a = collections.namedtuple('a', 'b, c')
>>> a(1,2)._asdict()
OrderedDict([('b', 1), ('c', 2)])
>>> a(1,2)._fields
('b', 'c')
>>> a(1,2).index(1)
0
>>> a(1,2).index(2)
1
>>> a(1,2)._replace(b=3)
a(b=3, c=2)
>>> 

It is hashable (provided the values it holds are), unlike a dict, and picklable.
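
As a quick illustration of the hashability point, here is a minimal sketch. The Point type and the labels dict are made-up names for this example, not anything from the standard library.

```python
import collections

# Hypothetical example type, used only for illustration.
Point = collections.namedtuple('Point', 'x y')

p = Point(1, 2)

# A named tuple hashes like the plain tuple with the same contents,
# so it can be used as a dict key or stored in a set.
labels = {p: 'start'}
print(labels[Point(1, 2)])   # start

# It also compares equal to an ordinary tuple holding the same values.
print(p == (1, 2))           # True

# A dict, by contrast, cannot be hashed.
try:
    hash({'x': 1, 'y': 2})
except TypeError:
    print('dicts are not hashable')
```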

Here is how you create one:

>>> collections.namedtuple('SomeNamedTuple', 't1 t2')

The first argument is the name of the output class, and the second argument is the list of field names, whitespace- and/or comma-delimited.

This creates a named tuple type with the __name__ SomeNamedTuple and the attributes t1 and t2.
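
The sketch below shows the different ways of spelling the field names, assuming nothing beyond collections.namedtuple itself. The variable names A, B, C and value are arbitrary.

```python
import collections

# The field names can be given as one whitespace-delimited string...
A = collections.namedtuple('SomeNamedTuple', 't1 t2')
# ...or comma-delimited...
B = collections.namedtuple('SomeNamedTuple', 't1, t2')
# ...or as a sequence of strings.
C = collections.namedtuple('SomeNamedTuple', ['t1', 't2'])

print(A.__name__)                            # SomeNamedTuple
print(A._fields == B._fields == C._fields)   # True

# Instances accept positional and keyword arguments.
value = A(1, t2=2)
print(value)                 # SomeNamedTuple(t1=1, t2=2)
print(value.t1, value[1])    # 1 2
```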

Extend it

When you need more out of your namedtuple, you should really subclass it. Keep in mind that it is immutable, so any method that "changes" a value should return a modified copy (for example via _replace) rather than assigning to an attribute.

class Dimensions(collections.namedtuple('DimensionsTuple', 'height width')):
    pass

>>> Dimensions(10, 30).height
10
>>> Dimensions(10, 30).width
30

It is rather useful sometimes, so don't forget about the named tuple! You might need it sooner or later.