bchartoff
|
16aa96e350
|
updated icpsr.py
|
2013-07-30 11:34:07 -04:00 |
|
bchartoff
|
9f489cf1f6
|
removed scripst/build
|
2013-07-25 17:15:06 -04:00 |
|
bchartoff
|
0cf2f340f9
|
updated ICPSR id's
pulled and matched ICPSR id's from roll call source data
|
2013-07-25 17:11:31 -04:00 |
|
bchartoff
|
cb9ea0eb83
|
Handled bioguide IDs with no IE ID
|
2013-06-13 10:59:29 -04:00 |
|
bchartoff
|
f2df26e53c
|
Updated CRP_ID.py
|
2013-06-11 11:16:09 -04:00 |
|
bchartoff
|
a017bf0537
|
Revert "Update CRP_ID to no longer require lxml scraping"
This reverts commit 678f2b8a14.
|
2013-06-11 10:54:15 -04:00 |
|
bchartoff
|
678f2b8a14
|
Update CRP_ID to no longer require lxml scraping
|
2013-06-11 10:52:43 -04:00 |
|
Eric Mill
|
16737a7f65
|
Removing some cruft
|
2013-06-11 10:21:19 -04:00 |
|
Eric Mill
|
897022a186
|
Remove .DS_Store
|
2013-06-11 10:19:37 -04:00 |
|
Eric Mill
|
0f02e9ecca
|
Refactor to use json directly for IE download
|
2013-06-11 10:19:24 -04:00 |
|
bchartoff
|
39f1eced9f
|
unchanged, but this was why it wasn't synching
|
2013-06-11 09:20:46 -04:00 |
|
bchartoff
|
6d49f83d39
|
added script to update CRP ID's from IE API
|
2013-06-10 15:19:37 -04:00 |
|
Jeremy Carbaugh
|
30720d6745
|
Added method to resolve Facebook graph IDs from usernames and updated social media YAML
|
2013-06-01 03:33:26 -04:00 |
|
Joshua Tauberer
|
6586c98fa4
|
Mark Sanford was sworn in. Moved him from the historical file using a new helper script 'untire.py' (a pun on un-retire)
|
2013-05-16 07:14:17 -04:00 |
|
Joshua Tauberer
|
54d1d95e82
|
in committee-membership-current, sorting members first by party (majority first) then by rank, and updating the NYT scraper to produce the same ordering when run (I didn't rerun the script for this commit, just sorted what we had). The purpose of this commit is to match the sort order of the main membership scraper.
|
2013-05-06 10:43:08 -04:00 |
|
Joshua Tauberer
|
6526705363
|
update committee metadata, with updated committee_membership.py scraper that now uses House committee pages again which we thought were discontinued but actually still exist and are up to date
|
2013-05-06 10:32:46 -04:00 |
|
Derek Willis
|
d5a3064247
|
added script to scrape house history ids
|
2013-04-12 15:56:28 -04:00 |
|
Eric Mill
|
13d6758a29
|
use https urls
|
2013-04-08 14:47:14 -04:00 |
|
Joshua Tauberer
|
f619d59b5d
|
add wikipedia page names to the ID field, by scanning for pages using the CongBio and CongLinks templates and keying off of bioguide IDs
|
2013-04-06 17:50:32 -04:00 |
|
Eric Mill
|
5e0c72df45
|
Increased rate limit, NYT's rate limiter auto-emailed me to complain
|
2013-03-20 16:38:27 -04:00 |
|
Eric Mill
|
70dc79e681
|
Fixed issue where it was always caching NYT committee members
|
2013-03-20 16:01:58 -04:00 |
|
Eric Mill
|
5ba221a5c3
|
Fix to include House Intelligence Committee
|
2013-03-20 15:47:19 -04:00 |
|
Eric Mill
|
d67656ccab
|
Trying it again, this time not destroying Joint committee memberships
|
2013-03-20 15:12:26 -04:00 |
|
Eric Mill
|
b9bb3e6d89
|
Revert "I believe I have the House committee members (for top-level committees) from the NYT API"
This reverts commit ce24710938.
|
2013-03-20 15:10:55 -04:00 |
|
Eric Mill
|
ce24710938
|
I believe I have the House committee members (for top-level committees) from the NYT API
|
2013-03-20 15:09:11 -04:00 |
|
Joshua Tauberer
|
7c8f489f84
|
updating committees-historical (mostly just indicating some committees present in 113th Congress, but also THOMAS made some name corrections)
|
2013-03-19 13:38:10 -04:00 |
|
Joshua Tauberer
|
2292033318
|
update our YAML dumper to use tildes for nulls, so our one tilde doesn't get changed on output from a script
|
2013-03-19 13:28:21 -04:00 |
|
Eric Mill
|
c8bee96146
|
blacklisting a campaign account
|
2013-02-27 19:52:51 -06:00 |
|
Eric Mill
|
6170889712
|
fix bug for youtube sweeping, and add some better patterns
|
2013-02-15 17:46:26 -05:00 |
|
Eric Mill
|
a2207af108
|
Updated black and whitelists for youtube
|
2013-02-15 17:46:07 -05:00 |
|
Eric Mill
|
5e99a9a155
|
Caught an old url pattern
|
2013-02-15 13:27:17 -05:00 |
|
Eric Mill
|
c170311684
|
A couple more blacklists
|
2013-02-15 13:27:01 -05:00 |
|
Eric Mill
|
73ac7c38a6
|
Wasn't meant to be committed
|
2013-02-15 13:26:49 -05:00 |
|
Eric Mill
|
7b5ea31f66
|
A couple of campaign account things
|
2013-02-15 12:06:30 -05:00 |
|
Eric Mill
|
e1a1f7297f
|
Switch field name from facebook_graph to facebook
|
2013-02-15 11:55:51 -05:00 |
|
Eric Mill
|
ddeff12220
|
Starting from scratch on Facebook accounts
|
2013-02-15 02:12:30 -05:00 |
|
Eric Mill
|
a6e346af39
|
Fixed up regexes and process for outputting facebook information
|
2013-02-14 18:59:49 -05:00 |
|
Eric Mill
|
30fbabe4e0
|
Lots more facebook blacklist entries
|
2013-02-14 18:58:59 -05:00 |
|
Eric Mill
|
46204f2c32
|
print out candidate URL in social media spreadsheet, add a pages regex for facebook, fix unicode error for outputting names
|
2013-02-14 17:54:33 -05:00 |
|
Eric Mill
|
8eb664a84d
|
Updated committee memberships for the Senate, included a couple of committee name changes, a couple new subcommittees on the Judiciary Committee, and some temporary code in the memberships script to leave the House data as-is for the time being
|
2013-02-14 17:41:35 -05:00 |
|
Joshua Tauberer
|
4c2a87e836
|
thomas_ids.py: update regex for change in href values (now full URLs not relative paths)
|
2013-02-08 07:40:05 -05:00 |
|
Joshua Tauberer
|
a4a9166420
|
update senate contact info
|
2013-02-06 09:42:18 -05:00 |
|
Eric Mill
|
b22e973791
|
Use the SafeLoader to avoid crazy serialization security issues
|
2013-02-01 15:10:28 -05:00 |
|
Joshua Tauberer
|
99518273d7
|
use the faster libyaml parser if it is available
|
2013-01-29 13:34:46 -05:00 |
|
Eric Mill
|
57f36be968
|
Removing some unneeded code, and adding in some output - but the script still can't output correct data until the House and Senate are both up
|
2013-01-28 17:57:14 -05:00 |
|
Eric Mill
|
5cf0c787cc
|
Made the blacklist more specific, it was cutting out an account
|
2013-01-28 16:30:38 -05:00 |
|
Eric Mill
|
9954cfd134
|
campaign account to blacklist
|
2013-01-28 14:33:19 -05:00 |
|
Eric Mill
|
a51a269466
|
runnable photo resize script
|
2013-01-25 19:23:48 -05:00 |
|
Eric Mill
|
3cbe861b1d
|
Script to also get additional fields from the other Senate member XML source
|
2013-01-25 11:51:08 -05:00 |
|
Eric Mill
|
d32b869f73
|
Sweeping old out-of-office people from committee memberships, as a temporary measure until we get updated people
|
2013-01-24 12:56:21 -05:00 |
|