Tutorial¶
The basics¶
First, create a record type like you would create a namedtuple type.
>>> from reck import make_rectype
>>> Person = make_rectype('Person', ['name', 'age'])
Next, create an instance of Person with values for name and age:
>>> p = Person(name='Eric', age=42)
>>> p # readable __repr__ with a name=value style
Person(name='Eric', age=42)
You can also pass field values as positional arguments in field order:
>>> p2 = Person('John', 44)
>>> p2
Person(name='John', age=44)
Fields are accessible by attribute lookup and by index:
>>> p.name
'Eric'
>>> p[0]
'Eric'
Field values are mutable:
>>> p.name = 'Idle'
>>> p.name
'Idle'
You can specify per-field default values when creating a record type:
>>> Person = make_rectype('Person', [('name', None), ('age', None)])
>>> p = Person(name='Eric') # no value supplied for the 'age' field
>>> p # so 'age' has been set to its default value
Person(name='Eric', age=None)
Multiple field values can be changed using the _update() method:
>>> p._update(name='John', age=44)
>>> p
Person(name='John', age=44)
Field values can be iterated over:
>>> for value in p:
...     print(value)
John
44
Records are very useful for assigning fieldnames to sequences of data returned by the csv module:
import csv
from reck import make_rectype

with open('employees.csv', newline='') as f:
    reader = csv.reader(f)
    fieldnames = next(reader)  # Get the fieldnames from the first row of the file
    Employee = make_rectype('Employee', fieldnames)
    for row in reader:
        emp = Employee(*row)
        print(emp.name, emp.title)
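The header-row pattern can be tried without a file on disk. The sketch below is an assumption-laden stand-in: it reads made-up rows from an in-memory buffer and uses collections.namedtuple in place of make_rectype so that it runs without reck installed. Only the csv handling is the point here.

```python
import csv
import io
from collections import namedtuple

# In-memory data standing in for employees.csv (made-up rows for illustration)
data = io.StringIO('name,title\nAlice,Engineer\nBob,Analyst\n')

reader = csv.reader(data)
fieldnames = next(reader)                      # the first row holds the fieldnames
Employee = namedtuple('Employee', fieldnames)  # stand-in for make_rectype
employees = [Employee(*row) for row in reader]

for emp in employees:
    print(emp.name, emp.title)
```

Because the fieldnames come from the file itself, the same code works for any CSV file whose header row contains valid fieldnames.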
Type creation¶
New types are created with the make_rectype() factory function:
>>> Point = make_rectype(typename='Point', fieldnames=['x', 'y'])
Setting fieldnames¶
Fieldnames can be specified with a sequence of strings or a single string of space and/or comma separated fieldnames. These examples are equivalent:
>>> Point = make_rectype('Point', ['x', 'y'])
>>> Point = make_rectype('Point', 'x y')
>>> Point = make_rectype('Point', 'x,y')
Setting defaults¶
Per-field defaults can be set by supplying a (fieldname, default) tuple in place of a string for a fieldname:
>>> Point3D = make_rectype('Point3D', [('x', None), ('y', None), ('z', None)])
>>> p = Point3D()
>>> p
Point3D(x=None, y=None, z=None)
A default does not have to be supplied for every field:
>>> Point3D = make_rectype('Point3D', ['x', ('y', None), 'z'])
>>> p = Point3D(x=1, z=3)
>>> p
Point3D(x=1, y=None, z=3)
All fields without a default value must be given a value during instantiation, otherwise a ValueError will be raised:
>>> p = Point3D(x=1)
ValueError: field 'z' is not defined
Per-field defaults can also be specified for every field using an ordered mapping such as collections.OrderedDict:
>>> from collections import OrderedDict
>>> Point3D = make_rectype('Point3D', OrderedDict([
... ('x', None),
... ('y', None),
... ('z', None)]))
>>> p = Point3D(y=99)
>>> p
Point3D(x=None, y=99, z=None)
Factory function defaults¶
As with Python’s mutable default arguments, mutable default field values will be shared amongst all instances of the record type:
>>> Rec = make_rectype('Rec', [('a', [])])
>>> rec1 = Rec()
>>> rec2 = Rec()
>>> rec1.a.append(1)
>>> rec1.a
[1]
>>> rec2.a # the value of 'a' in rec2 has also been updated
[1]
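The same sharing can be seen with an ordinary function, independently of reck. This short sketch uses a hypothetical function, append_item, to show that a mutable default argument is created once and then reused by every call:

```python
def append_item(item, bucket=[]):  # the default list is created once, at definition time
    bucket.append(item)
    return bucket


first = append_item(1)
second = append_item(2)

print(first is second)  # both calls returned the very same list object
print(second)
```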
To avoid this behaviour, mutable defaults can be created by setting the default value to a factory function wrapped with a reck.DefaultFactory object. Here is an example using the list factory with no arguments:
>>> from reck import DefaultFactory
>>> Rec = make_rectype('Rec', [('a', DefaultFactory(list))])
>>> rec1 = Rec() # calls list() to initialise field 'a'
>>> rec2 = Rec() # calls list() to initialise field 'a'
>>> rec1.a.append(1)
>>> rec1.a
[1]
>>> rec2.a # the value of 'a' remains unmodified
[]
A default factory function can also be called with positional and keyword arguments using the args and kwargs arguments of DefaultFactory(). Here is an example using dict:
>>> Rec = make_rectype('Rec', [
... ('a', DefaultFactory(dict, args=[[('b', 2)]], kwargs=dict(c=3)))])
>>> rec1 = Rec() # calls dict([('b', 2)], c=3) to initialise field 'a'
>>> rec2 = Rec() # calls dict([('b', 2)], c=3) to initialise field 'a'
>>> rec1.a
{'b': 2, 'c': 3}
>>> rec1.a['d'] = 4
>>> rec1.a
{'b': 2, 'c': 3, 'd': 4}
>>> rec2.a # the value of 'a' in rec2 remains unmodified
{'b': 2, 'c': 3}
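To picture what such a wrapper does, here is a minimal sketch of the idea. FactorySketch is a hypothetical class written for this illustration; it is not reck's DefaultFactory and makes no claim about how reck implements it. It simply stores a factory with its arguments and builds a fresh value each time it is called:

```python
class FactorySketch:
    """Hypothetical stand-in for a default-factory wrapper (not reck's code)."""

    def __init__(self, factory, args=(), kwargs=None):
        self.factory = factory
        self.args = tuple(args)
        self.kwargs = dict(kwargs or {})

    def __call__(self):
        # Build a brand-new value on every call, so nothing is shared
        return self.factory(*self.args, **self.kwargs)


make_default = FactorySketch(dict, args=[[('b', 2)]], kwargs=dict(c=3))
first = make_default()
second = make_default()
first['d'] = 4  # only the first dict is modified

print(first)
print(second)
```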
Renaming invalid fieldnames¶
Any valid Python identifier may be used for a fieldname, except for names starting with an underscore. A fieldname also cannot be a keyword such as class or def, and it cannot duplicate another fieldname.
You can set the rename argument of make_rectype() to True to automatically replace invalid fieldnames with positional names:
>>> Rec = make_rectype('Rec', ['abc', 'def', 'ghi', 'abc'], rename=True)
>>> Rec._fieldnames # keyword 'def' and duplicate fieldname 'abc' have been renamed
('abc', '_1', 'ghi', '_3')
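For comparison, collections.namedtuple in the standard library has a rename argument of its own and produces the same positional names for the same input. Note that this example uses namedtuple rather than reck, and that namedtuple exposes the names as _fields rather than _fieldnames:

```python
from collections import namedtuple

# The keyword 'def' and the duplicate 'abc' are replaced by positional names
NT = namedtuple('NT', ['abc', 'def', 'ghi', 'abc'], rename=True)
print(NT._fields)
```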
Instantiation¶
When instantiating records, field values can be passed by field order, fieldname, or both. The following examples all return a record equivalent to Point3D(x=1, y=2, z=3):
>>> p = Point3D(1, 2, 3) # using values by field order
>>> p = Point3D(x=1, y=2, z=3) # using values by fieldname
>>> p = Point3D(*[1, 2, 3]) # using an unpacked sequence
>>> p = Point3D(*[1, 2], z=3) # using an unpacked sequence and values by fieldname
>>> p = Point3D(**dict(x=1, y=2, z=3)) # using an unpacked mapping
>>> p
Point3D(x=1, y=2, z=3)
Record objects are iterable so they can be used to initialise other record objects of the same type:
>>> p2 = Point3D(*p)
>>> p2 == p
True
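One way to picture how values passed by field order and by fieldname are combined with per-field defaults is sketched below. The resolve_values helper is hypothetical and is not taken from reck; in particular, the TypeError checks for surplus or repeated values are assumptions made for this illustration. Only the ValueError message mirrors the one shown earlier in this tutorial.

```python
def resolve_values(fieldnames, defaults, args, kwargs):
    """Hypothetical helper: combine positional values, keyword values and defaults."""
    if len(args) > len(fieldnames):
        raise TypeError('too many positional values')
    values = dict(zip(fieldnames, args))   # values passed by field order
    for name, value in kwargs.items():     # values passed by fieldname
        if name not in fieldnames:
            raise TypeError('unknown field {0!r}'.format(name))
        if name in values:
            raise TypeError('field {0!r} given twice'.format(name))
        values[name] = value
    for name in fieldnames:                # fall back to defaults where possible
        if name not in values:
            if name not in defaults:
                raise ValueError('field {0!r} is not defined'.format(name))
            values[name] = defaults[name]
    return [values[name] for name in fieldnames]


fields = ('x', 'y', 'z')
result = resolve_values(fields, {'y': None}, (1,), {'z': 3})
print(result)
```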
Getting and setting fields¶
By attribute¶
Fields are accessible by named attribute:
>>> p = Point3D(x=1, y=2, z=3)
>>> p.z
3
The fields of record objects are mutable, meaning they can be modified after creation:
>>> p.z = 33
>>> p.z
33
To get or set a field whose name is stored in a string, use the getattr() and setattr() built-ins:
>>> getattr(p, 'z')
33
>>> setattr(p, 'z', 22)
>>> getattr(p, 'z')
22
By index¶
Fields are also accessible by integer index:
>>> p[1] # Get the value of field y
2
Setting works as well:
>>> p[1] = 22 # Set the value of field y to 22
>>> p[1]
22
By slice¶
Fields can also be accessed using slicing:
>>> p[:2] # Slicing returns a list of field values
[1, 22]
Setting a slice of fields works as well:
>>> p[:2] = [10, 11] # Set field x to 10 and field y to 11
>>> p
Point3D(x=10, y=11, z=22)
Note, record slice behaviour is different to that of lists. If the iterable being assigned to the slice is longer than the slice, the surplus iterable items are ignored (with a list the surplus items are inserted into the list):
>>> p[:3] = [1, 2, 3, 4, 5] # Slice has 3 items, the iterable has 5
>>> p # The last 2 items of the iterable were ignored
Point3D(x=1, y=2, z=3)
Likewise, if the iterable contains fewer items than the slice, the surplus fields in the slice remain unaffected (with a list the surplus items are deleted):
>>> p[:3] = [None, None] # Slice has 3 items, the iterable only 2
>>> p # The last slice item (field z) was unaffected
Point3D(x=None, y=None, z=3)
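The difference from lists can be reproduced with plain Python. In the sketch below, assign_slice_fixed is a hypothetical helper that mimics the fixed-length behaviour described above on an ordinary list; it is an illustration of the semantics, not reck's implementation.

```python
def assign_slice_fixed(values, slc, iterable):
    """Hypothetical helper: slice assignment that never changes the length."""
    indices = range(*slc.indices(len(values)))
    # zip() stops at the shorter input, so surplus items are ignored
    # and surplus slots keep their current values
    for index, item in zip(indices, iterable):
        values[index] = item


# Plain list: the list grows or shrinks to fit the assigned iterable
grown = [10, 11, 22]
grown[:3] = [1, 2, 3, 4, 5]
shrunk = [1, 2, 3]
shrunk[:3] = [None, None]
print(grown, shrunk)

# Fixed-length behaviour, as described for records
fixed = [10, 11, 22]
assign_slice_fixed(fixed, slice(None, 3), [1, 2, 3, 4, 5])
after_long = list(fixed)
assign_slice_fixed(fixed, slice(None, 3), [None, None])
print(after_long, fixed)
```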
By iteration¶
Field values can be iterated over:
>>> p = Point3D(1, 2, 3)
>>> for value in p:
...     print(value)
1
2
3
Setting multiple fields¶
Multiple field values can be updated using the _update() method, with field values passed by field order, fieldname, or both (as with instantiation). The following examples all result in a record equivalent to Point3D(x=4, y=5, z=6):
>>> p._update(4, 5, 6) # using values by field order
>>> p._update(x=4, y=5, z=6) # using values by fieldname
>>> p._update(*[4, 5, 6]) # using an unpacked sequence
>>> p._update(**dict(x=4, y=5, z=6)) # using an unpacked mapping
>>> p
Point3D(x=4, y=5, z=6)
Replacing defaults¶
A dictionary of fieldname/default_value pairs can be retrieved with the _get_defaults() class method:
>>> Point3D = make_rectype('Point3D', [('x', 1), ('y', 2), 'z'])
>>> Point3D._get_defaults()
{'x': 1, 'y': 2}
The existing per-field default values can be replaced by supplying the _replace_defaults() class method with new default values by field order, fieldname, or both:
>>> Point3D._replace_defaults(x=7, z=9)
>>> Point3D._get_defaults() # 'y' was not supplied a default so it no longer has one
{'x': 7, 'z': 9}
To remove all default field values just call _replace_defaults() with no arguments:
>>> Point3D._replace_defaults()
>>> Point3D._get_defaults()
{}
Replacing the default values can be useful if you wish to use the same record class in different contexts that require different default values:
>>> Car = make_rectype('Car', [('make', 'Ford'), 'model', 'body_type'])
>>> Car._get_defaults()
{'make': 'Ford'}
>>> # Create some Ford cars:
>>> car1 = Car(model='Focus', body_type='coupe')
>>> car2 = Car(model='Mustang', body_type='saloon')
>>> # Now create hatchback cars of different makes. To make life
>>> # easier, replace the defaults with something more appropriate:
>>> Car._replace_defaults(body_type='hatchback')
>>> Car._get_defaults() # note, 'make' no longer has a default value
{'body_type': 'hatchback'}
>>> car3 = Car(make='Fiat', model='Panda')
>>> car4 = Car(make='Volkswagen', model='Golf')
Other methods/attributes¶
The _fieldnames class attribute provides a tuple of fieldnames:
>>> p._fieldnames
('x', 'y', 'z')
You can easily convert the record to a list of (fieldname, value) tuples:
>>> p._asitems()
[('x', 1), ('y', 2), ('z', 3)]
You can convert the record to an OrderedDict using _asdict():
>>> p._asdict()
OrderedDict([('x', 1), ('y', 2), ('z', 3)])
Miscellaneous operations¶
Record types support various operations that are demonstrated below:
>>> p = Point3D(x=1, y=2, z=3)
>>> len(p) # get the number of fields in the record
3
>>> 4 in p # supports membership testing using the in operator
False
>>> 4 not in p
True
>>> iterator = iter(p) # supports iterators
>>> next(iterator)
1
>>> next(iterator)
2
>>> reverse_iterator = reversed(p) # iterate in reverse
>>> next(reverse_iterator)
3
>>> next(reverse_iterator)
2
>>> p._index(2) # get the index of the first occurrence of a value
1
>>> p._update(x=1, y=3, z=3)
>>> p._count(3) # find out how many times a value occurs in the record
2
>>> vars(p) # return an OrderedDict mapping fieldnames to values
OrderedDict([('x', 1), ('y', 3), ('z', 3)])
Pickling¶
Instances can be pickled:
>>> import pickle
>>> pickled_p = pickle.loads(pickle.dumps(p))
>>> pickled_p == p
True
Subclassing¶
Since record types are normal Python classes it is easy to add or change functionality with a subclass. Here is how to add a calculated field and a fixed-width print format:
>>> class Point(make_rectype('Point', 'x y')):
...     __slots__ = ()
...     @property
...     def hypotenuse(self):
...         return (self.x ** 2 + self.y ** 2) ** 0.5
...     def __str__(self):
...         return ('Point: x={0:6.3f} y={1:6.3f} hypotenuse={2:6.3f}'
...                 .format(self.x, self.y, self.hypotenuse))
>>> p = Point(x=3, y=4.5)
>>> print(p)
Point: x= 3.000 y= 4.500 hypotenuse= 5.408
The subclass shown above sets __slots__ to an empty tuple. This helps keep memory requirements low by preventing the creation of per-instance dictionaries.
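The effect of the empty __slots__ can be checked with plain classes. The sketch below uses hypothetical classes (Base, WithSlots, WithoutSlots) rather than a reck record type, and relies only on standard Python behaviour: a subclass of a slotted class gets a per-instance dictionary unless it declares __slots__ itself.

```python
class Base:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


class WithSlots(Base):
    __slots__ = ()  # no new slots, and no per-instance __dict__


class WithoutSlots(Base):
    pass            # omitting __slots__ gives every instance a __dict__


slotted = WithSlots(1, 2)
unslotted = WithoutSlots(1, 2)
print(hasattr(slotted, '__dict__'))
print(hasattr(unslotted, '__dict__'))
```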
Adding fields/attributes¶
Because record objects are based on slots, new fields cannot be added after object creation:
>>> Point = make_rectype('Point', 'x y')
>>> p = Point(1, 2)
>>> p.new_attribute = 4 # Can't do this!
Traceback (most recent call last):
  ...
AttributeError: 'Point' object has no attribute 'new_attribute'
Subclassing is also not useful for adding new attributes. Instead, simply create a new record type from the _fieldnames class attribute:
>>> Point3D = make_rectype('Point3D', Point._fieldnames + ('z',))
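The slots restriction described in this section is ordinary Python behaviour and can be reproduced without reck. SlottedPoint below is a hypothetical plain class, used only to show that assigning an undeclared attribute on a slotted object raises AttributeError:

```python
class SlottedPoint:
    """Hypothetical plain class using __slots__ (not a reck record type)."""
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


point = SlottedPoint(1, 2)
try:
    point.new_attribute = 4
    added = True
except AttributeError:
    added = False  # __slots__ leaves no room for undeclared attributes

print(added)
```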
More than 255 fields¶
Record types have no limit on the number of fields, whereas named tuples are limited to 255 fields on Python versions before 3.7:
>>> fieldnames = ['f{0}'.format(i) for i in range(1000)]
>>> values = [i for i in range(1000)]
>>> from collections import namedtuple
>>> NT = namedtuple('NT', fieldnames)
SyntaxError: more than 255 fields
>>> Rec = make_rectype('Rec', fieldnames)
>>> rec = Rec(*values)
>>> rec.f0
0
>>> rec.f999
999
Whilst it is unusual to require more than 255 fields, it can be handy when reading data from a CSV file (or similar) that has a lot of columns.