Author | Commit | Message | Date
Eric Mill | 63cb58f9e0 | whitespace | 2013-05-08 18:09:55 -04:00
Eric Mill | ef2de7a39f | Fixed longstanding bug at only showing the most recent stack trace for other errors | 2013-05-08 18:08:43 -04:00
Eric Mill | 1e9391a239 | Allow for fetching senate unprinted amendments | 2013-05-08 17:56:32 -04:00
Joshua Tauberer | b9448d83ed | committee meetings parser | 2013-05-02 10:12:07 -04:00
Eric Mill | 28ced678a9 | Set a custom user agent for the project | 2013-04-29 22:46:29 -04:00
Joshua Tauberer | 9b01c418c2 | when mirroring FDSys it is considerably faster (and less memory intensive) to shell out to wget to do the download rather than using scrapelib (but what about throttling?) | 2013-04-04 14:53:05 +00:00
Eric Mill | 311cd1720e | Spit out the # of errors at the end no matter what. (Many errors can push the count up too high.) | 2013-03-01 22:51:34 -06:00
Eric Mill | f1ed39011d | whitespace | 2013-03-01 22:48:33 -06:00
Chris Wilson | be78684018 | Added error catching for split nomination ids | 2013-02-03 14:09:10 -05:00
Joshua Tauberer | 4dfb1475dd | when saving GovTrack files, use the faster libyaml parser to load the congress-legislators data files | 2013-01-29 13:45:21 -05:00
Chris Wilson | 4ae205cf58 | added option for POST data to download() | 2013-01-26 13:03:37 -05:00
Chris Wilson | 0795853be5 | Added a few nomination functions directly analogous to bill functions | 2013-01-26 11:36:53 -05:00
Eric Mill | d93634f532 | Move the thomas ID correction out of just the govtrack export code and into the general data output | 2013-01-25 11:26:19 -05:00
Eric Mill | 67a12db5cc | name correction | 2013-01-24 10:18:19 -05:00
Joshua Tauberer | 15c095c668 | handle THOMAS providing an incorrect ID for Rep. C. A. Dutch Ruppersberger | 2013-01-24 07:41:16 -05:00
Joshua Tauberer | 92a58a9017 | a little cleanup for GovTrack output when a THOMAS ID is not found | 2013-01-24 07:37:53 -05:00
Eric Mill | 75bd99e359 | Added a bill_versions task that takes a full --congress, a --bill_id, or a --bill_version_id, and saves a text-versions/[version_code].json file with the data for each version: when it was issued, its code/id, and URLs to each published version of it. Uses fdsys.py for sitemap crawling and MODS doc parsing | 2013-01-20 18:55:54 -05:00
Eric Mill | 69df2aebdb | Added a couple helper functions, made sure to transform congress into integer all the time | 2013-01-20 16:55:53 -05:00
Eric Mill | 782af88e85 | Refactored download helper to have all extra options go through options hash, documented each option | 2013-01-20 15:16:54 -05:00
Joshua Tauberer | 04322a0029 | fdsys: add a method to locally store mods, PDF, etc. and update when sitemap indicates changes | 2013-01-20 11:08:46 -05:00
Joshua Tauberer | e181aed4ba | for --fast, move the part that writes the cache to be after the bill is successfully parsed | 2013-01-17 08:16:40 -05:00
Eric Mill | 4865ce2ad5 | Drastically simplified fast-caching process for bills, by handling all cache detection in the bill pagination process. This allows for the possibility that we could overlook a change if the script aborted between caching the new state and completing the bill fetching/output process. | 2013-01-15 12:08:58 -05:00
GovTrack.us | e13c5564f6 | add a --fast option for parsing bills using Derek's original idea of detecting many (but not all) changes to bills by looking at the search result listing content | 2013-01-15 08:14:30 -05:00
GovTrack.us | 0382fb66d7 | support Python < 2.7 by removing the dict-creation syntax { a:b ... } | 2013-01-15 07:23:44 -05:00
GovTrack.us | 51a143d8d8 | parsing amendments (hopefully) | 2013-01-06 10:43:48 -05:00
GovTrack.us | 63d678e722 | change some print statements to logging.warn and use a special Exception class when a GovTrack ID lookup fails | 2013-01-06 09:41:46 -05:00
GovTrack.us | d816441d37 | partial revert of 4cbf3bc2fe which added a check if a vote file is changed (module updated_at) before saving it, but this logic is better handled by my GovTrack import scripts rather than here | 2013-01-06 09:41:09 -05:00
GovTrack.us | 4cbf3bc2fe | 1) fix how congress-legislators repo is updated; 2) in the votes parser change the meaning of --force and add new option --fetch | 2013-01-03 18:36:26 -05:00
Eric Mill | 0d6bb0bae5 | Update current_congress function to consider the first 2.5 days of the year as the last year | 2013-01-02 16:47:32 -05:00
GovTrack.us | e851ae6073 | revised vote IDs to use canonical session years (e.g. 2012) rather than session ordinals (e.g. 1, 2) | 2013-01-02 09:59:49 -05:00
GovTrack.us | eda6f0cb5a | correct daylight saving timezone handling so that the UTC offset in all serialized dates is correct with respect to whether DST was in effect at the time (and does not change anything else) | 2012-12-30 17:52:12 -05:00
GovTrack.us | a95e2b38e0 | refactoring mistake | 2012-12-30 17:42:48 -05:00
GovTrack.us | 459a37d838 | replace the submodule with scripted clone/pull, which makes it easier to always be at the latest upstream commit and lets us control the clone depth | 2012-12-30 15:47:49 -05:00
Joshua Tauberer | a21fc6e954 | more on vote parsing | 2012-12-30 13:24:35 -05:00
Joshua Tauberer | a1da46b78f | starting a roll call votes parser, including some refactoring of existing code so it can be reused | 2012-12-27 16:12:24 -05:00
Eric Mill | e611f15378 | whitespace | 2012-11-27 18:00:35 -05:00
GovTrack.us | e8497461c2 | missed a few log calls in the merge | 2012-11-24 11:07:26 -05:00
GovTrack.us | 1ad6eb72e5 | merge... uhm this was a complicated one, hopefully not breaking anything | 2012-11-24 08:44:47 -05:00
Eric Mill | 544afd79ff | Moved committee mapping fetching to a utils method with a globally cached map, to remove it from the method signatures of the main process. Also added a (version controlled) cache dir in the test/fixtures folder so that tests don't hit the network | 2012-11-15 19:09:49 -05:00
GovTrack.us | 3c847d3f19 | switch to using python logging module | 2012-11-11 18:22:49 -05:00
GovTrack.us | 5b6b6b1631 | in the log util function, also treat unicode strings as plain strings to be printed | 2012-10-31 18:04:12 -04:00
Eric Mill | b974ebcfc8 | typo in comment | 2012-10-01 17:30:15 -04:00
Eric Mill | 22e59e2fad | Incorporate control character removal into the unescape process of text, and bake that into the download/caching process | 2012-09-08 16:54:59 -04:00
Eric Mill | 030526bcea | Remove unicode control chars from titles | 2012-09-08 16:47:00 -04:00
Eric Mill | d6b4bc8e63 | Fix date serialization for actions | 2012-09-08 16:26:20 -04:00
Joshua Tauberer | b029e8c295 | remove unused import (iso8601) | 2012-09-08 14:12:42 -04:00
Eric Mill | 387050ddf2 | Some better error checking, and a fix for occasional 0-byte file downloads | 2012-09-07 19:12:05 -04:00
Eric Mill | 524c4d9fa5 | Add a from_name attribute, and differentiate the subject line by time | 2012-09-07 18:46:33 -04:00
Eric Mill | 3137053e24 | Made cache and data dirs configurable, made it so config.yml is read in only once | 2012-09-07 14:44:59 -04:00
Eric Mill | 724bf2c30f | Swallowing errors around the admin logging function, don't want to end up in a loop | 2012-09-07 14:30:00 -04:00