5 Commits

Author SHA1 Message Date
Eric Mill
a7ff75442e Travis CI integration, with a workout script that imports every script to check for syntax errors. Also updates each script to make them import-able without executing. 2014-04-03 12:58:57 -04:00
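As a rough illustration of the import check described in this commit, the sketch below imports every `.py` file in a directory and collects the failures. It is not the repository's actual workout script: the function name `check_imports` is hypothetical, and it assumes each script keeps its work behind an `if __name__ == "__main__":` guard so that importing it has no side effects.

```python
import importlib.util
import pathlib


def check_imports(script_dir):
    """Import every .py file in script_dir; return {filename: exception} for failures.

    Hypothetical helper, not the repository's workout script.
    """
    failures = {}
    for path in sorted(pathlib.Path(script_dir).glob("*.py")):
        # Load the file under its own stem, so __name__ is never "__main__"
        # and a guarded script body does not execute.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception as exc:  # SyntaxError is a subclass of Exception
            failures[path.name] = exc
    return failures
```

A CI job could call `check_imports("scripts")` and exit non-zero when the returned dict is non-empty; the directory name here is only an example.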
Joshua Tauberer
b0781fc441 Python 3 porting: charset and related issues
* Let utils.download() decode the byte stream for all of the scripts. As a result, no need to decode elsewhere (committee_membership.py, house_contacts.py, house_websites.py, thomas_ids.py, wikipedia_ids.py), but sometimes we need to encode back to UTF-8 for parsing XML in case the XML has an encoding declaration (committee_membership.py).
* In utils.unescape, to get a unicode character for a code point in another character set, the Python 2 way with chr doesn't work in Python 3. Replaced with using bytes.
* In bioguide.py, no need to decode the stream (didn't seem to be necessary in Py 2 or Py 3), and with changes in utils.py it's now already decoded.
* The json module now operates over unicode strings (cspan.py).
* Let Python handle UTF-8 encoding when writing to disk, including in CSV outputs (alternate_bulk_formats.py, export_csv.py, wikipedia_ids.py).
* In rtyaml.py, the latest logic for maintaining a comment block at the top of the file was not working at all in Py 3 because io.open doesn't provide a stream with a peek method. Since we're now operating on a seekable stream during *output*, we don't need to peek anymore anyway. So I re-did this.
* When we compute hashes over files for cache freshness checking we must read the files in binary mode and when we save pickle files we must open those files in binary mode too (utils.py's yaml_load, yaml_dump).
2014-04-02 18:15:22 -04:00
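The last point in the commit above (binary mode for hashing and for pickle files) can be sketched as follows. This is a minimal illustration, not the code in utils.py: the helper names `file_hash`, `save_cache`, and `load_cache`, the choice of SHA-1, and the cache layout are all assumptions.

```python
import hashlib
import pickle


def file_hash(path):
    # Hash the raw bytes: in Python 3, hashlib rejects str, so open with "rb".
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def save_cache(cache_path, data, source_hash):
    # pickle writes bytes, so the cache file must be opened with "wb".
    with open(cache_path, "wb") as f:
        pickle.dump({"hash": source_hash, "data": data}, f)


def load_cache(cache_path, source_hash):
    # Return the cached data only if it was built from the same source bytes.
    try:
        with open(cache_path, "rb") as f:
            cached = pickle.load(f)
    except FileNotFoundError:
        return None
    return cached["data"] if cached["hash"] == source_hash else None
```

Opening either file in text mode would fail under Python 3, because the hash update and the pickle stream both require bytes rather than decoded strings.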
Joshua Tauberer
a1d5f0fa57 run 2to3 to start Python 2 => Python 3 conversion 2014-04-02 16:09:21 -04:00
Joshua Tauberer
2ef60a2b7f when exporting to CSV, attempt to keep the field order as the fields appear in the YAML file 2013-11-18 13:49:59 -05:00
Joshua Tauberer
26a37fdad5 add a new script to convert any YAML file to CSV by flattening objects and ignoring lists 2013-11-18 13:29:48 -05:00
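The two CSV commits above describe flattening nested objects, skipping lists, and keeping fields in the order they appear in the YAML. A minimal sketch of that idea, working on already-parsed records (plain dicts) rather than on a YAML file, might look like this. The function names and the dotted column naming are assumptions, not taken from the actual script.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level with dotted keys; lists are ignored."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        elif isinstance(value, list):
            continue  # lists have no single-cell representation, so skip them
        else:
            flat[name] = value
    return flat


def to_csv(records):
    """Render records as CSV text, with columns in first-seen order."""
    rows = [flatten(record) for record in records]
    fields = []
    for row in rows:
        for name in row:
            if name not in fields:
                fields.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # fields missing from a row are written as empty cells
    return out.getvalue()
```

Because Python dicts preserve insertion order, the column order follows the order of keys in the parsed input, provided the YAML loader preserves key order as well.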