Shipyard is a module to process data in a format inspired by email headers (RFC 2822).
A character encoding can be specified, in a manner similar to PEP 0263, by putting
    # -*- coding: <encoding name> -*-
in the first line, where # stands for the actual comment mark.
More precisely, the first line must match the regular expression:
    ^#.*coding[:=]\s*([-\w.]+)
Again, # stands for the actual comment mark. The first group of this expression is then interpreted as the encoding name.
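The matching rule above can be tried out with the standard re module. The following is a minimal sketch that does not use Shipyard itself; the function name detect_encoding and its parameters are hypothetical and chosen only for illustration.

```python
import re

def detect_encoding(first_line, comment_mark="#", default=None):
    """Return the encoding named in a coding line, or `default` if none is found."""
    # Same pattern as in the text, with the literal '#' replaced by the
    # (escaped) comment mark actually in use.
    pattern = r"^" + re.escape(comment_mark) + r".*coding[:=]\s*([-\w.]+)"
    match = re.match(pattern, first_line)
    return match.group(1) if match else default

print(detect_encoding("# -*- coding: utf-8 -*-"))                 # utf-8
print(detect_encoding("% fileencoding=latin-1", comment_mark="%"))  # latin-1
print(detect_encoding("key: value"))                              # None
```

Because the pattern only requires the text coding followed by : or =, spellings such as fileencoding=latin-1 are accepted as well.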
Lines starting with the comment mark (default: #) are ignored. Comments can be used in or between records.
A field is a line of the form:
    key: value
Here value is an arbitrary string. It can span multiple lines using continuation marks.
If a line starts with the continuation mark (default: " ", a single blank), it is appended to the preceding line with the continuation mark removed.
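The rules for comments, fields and continuation lines can be illustrated with a small stand-alone parser. This is only a sketch and not Shipyard's actual implementation: the name parse_records is hypothetical, it assumes that a blank line ends a record (as the Writer output further down suggests), and it raises ValueError where the module itself presumably uses InvalidLineError.

```python
def parse_records(lines, comment_mark="#", continuation_mark=" ",
                  keep_linebreaks=False):
    """Yield one dict per record from an iterable of text lines."""
    record, last_key = {}, None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(comment_mark):
            continue  # comments may appear in or between records
        if not line.strip():
            # Assumption: a blank line closes the current record.
            if record:
                yield record
            record, last_key = {}, None
            continue
        if line.startswith(continuation_mark):
            if last_key is None:
                raise ValueError("continuation without a field: %r" % line)
            # Drop the continuation mark and append to the preceding value.
            joiner = "\n" if keep_linebreaks else ""
            record[last_key] += joiner + line[len(continuation_mark):]
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a 'key: value' line: %r" % line)
        last_key = key.strip()
        record[last_key] = value.strip()
    if record:
        yield record

# Made-up sample input, not taken from any real file.
lines = [
    "# a comment before the first record",
    "foo: 1",
    "bar: a long",
    " value",
    "",
    "foo: 2",
    "# a comment inside a record",
    "bar: short",
]
records = list(parse_records(lines))
print(records[0]["bar"])  # a longvalue
print(records[1])
```

Note that in this sketch, joining without line breaks also drops the blank that served as continuation mark, so "a long" and " value" become "a longvalue".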
>>> import shipyard
>>> input = open('nobel.sy')
>>> reader = shipyard.Parser(keep_linebreaks=False,
...     keys=['id', 'discipline', 'year',
...     'name', 'country', 'rationale'])
For every record the given keys are initialized with None.
Now we can iterate through the records:
>>> for record in reader.parse(input): # doctest:+ELLIPSIS
...     print record['country']
United States
Japan
United States
...

>>> input.seek(0)
>>> lod = reader.get_list(input)
>>> print lod # doctest:+ELLIPSIS
[{u'discipline': u'Chemistry', u'name': u'Martin Chalfie', ...}, {u'discipline': u'Chemistry', u'name': u'Osamu Shimomura', ...}, ...]

>>> input.seek(0)
>>> dod = reader.get_dict(input, key='id')
>>> print dod.keys()
[u'11', u'10', u'1', u'0', u'3', u'2', u'5', u'4', u'7', u'6', u'9', u'8']
>>> print dod[u'5'][u'rationale']
for the discovery of the mechanism of spontaneous brokensymmetry in subatomic physics

>>> input.seek(0)
>>> los = reader.get_list(input, factory = lambda **keys: ', '.join(keys.values()))
>>> print los[0]
Chemistry, Martin Chalfie, United States, for the discovery and development of the green fluorescentprotein, GFP, 2008, 0

>>> input.seek(0)
>>> class Laureate(object):
...     def __init__(self, id, discipline, year, name, country, rationale):
...         self.name = name
>>> doo = reader.get_dict(input, key='id', factory = Laureate)
>>> print doo[u'2'] # doctest:+ELLIPSIS
<Laureate object at ...>
>>> print doo[u'2'].name
Roger Y. Tsien
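The examples above suggest that a factory is called with the fields of a record as keyword arguments. A minimal sketch of that idea, independent of Shipyard, might look as follows; build_dict is a hypothetical helper, and the sample records are reduced to three fields loosely based on the outputs shown above.

```python
def build_dict(records, key, factory=dict):
    """Index records by the value of `key`, building each entry with `factory`."""
    # Each record is unpacked into keyword arguments, so the factory can be
    # dict, a lambda taking **keys, or a class with matching __init__ parameters.
    return dict((record[key], factory(**record)) for record in records)

class Laureate(object):
    def __init__(self, id, name, country):
        self.name = name

# Sample data for illustration only.
records = [
    {"id": "0", "name": "Martin Chalfie", "country": "United States"},
    {"id": "1", "name": "Osamu Shimomura", "country": "Japan"},
    {"id": "2", "name": "Roger Y. Tsien", "country": "United States"},
]

by_id = build_dict(records, key="id", factory=Laureate)
print(by_id["2"].name)        # Roger Y. Tsien

plain = build_dict(records, key="id")
print(plain["1"]["country"])  # Japan
```

A class used as factory must accept every key of the record as a parameter, which is why Laureate above lists all fields even though it stores only the name.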
Now let's write a Shipyard file.
>>> import StringIO
>>> output = StringIO.StringIO()

>>> writer = shipyard.Writer(keys=('foo', 'bar'), coding='utf-8')

>>> writer.write(output, {'foo': 1, 'bar': 2})
>>> print output.getvalue()
foo: 1
bar: 2
<BLANKLINE>
<BLANKLINE>

>>> output = StringIO.StringIO()
>>> d = [dict((('foo', i), ('bar', 2*i))) for i in range(3)]
>>> writer.write_many(output, d)
>>> print output.getvalue()
foo: 0
bar: 0
<BLANKLINE>
foo: 1
bar: 2
<BLANKLINE>
foo: 2
bar: 4
<BLANKLINE>
<BLANKLINE>

>>> output = StringIO.StringIO()
>>> writer.write_coding(output)
>>> print output.getvalue()
#-*- coding: utf-8 -*-
<BLANKLINE>
<BLANKLINE>

>>> output = StringIO.StringIO()
>>> writer.write_full(output, d)
>>> print output.getvalue()
#-*- coding: utf-8 -*-
<BLANKLINE>
foo: 0
bar: 0
<BLANKLINE>
foo: 1
bar: 2
<BLANKLINE>
foo: 2
bar: 4
<BLANKLINE>
<BLANKLINE>
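The output layout shown above (one key: value line per field, a blank line after each record, and an optional coding line at the top) can be reproduced with a few string helpers. This is a sketch under those assumptions, not the Writer class itself; the function names are hypothetical, and multi-line values with continuation marks are not handled.

```python
def format_record(record, keys):
    """Render one record as 'key: value' lines followed by a blank line."""
    return "".join("%s: %s\n" % (key, record[key]) for key in keys) + "\n"

def format_coding(coding, comment_mark="#"):
    """Render the coding line followed by a blank line."""
    return "%s-*- coding: %s -*-\n\n" % (comment_mark, coding)

def format_full(records, keys, coding="utf-8"):
    """Render the coding line and then every record in order."""
    return format_coding(coding) + "".join(
        format_record(record, keys) for record in records)

d = [dict((("foo", i), ("bar", 2 * i))) for i in range(3)]
text = format_full(d, keys=("foo", "bar"))
print(text)
```

Passing the keys explicitly fixes the order of the fields in the output, which a plain dict would not guarantee on older Python versions.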
InvalidLineError: Something is wrong with a line
InvalidKeyError: Something is wrong with a key
Parser: Reader for Shipyard files
Writer: Writer for Shipyard files
Generated by Epydoc 3.0beta1 on Sun Oct 19 18:00:34 2008 (http://epydoc.sourceforge.net)