Commit Graph

105 Commits

Author SHA1 Message Date
Joshua Tauberer
7c8f489f84 updating committees-historical (mostly just indicating some committees present in 113th Congress, but also THOMAS made some name corrections) 2013-03-19 13:38:10 -04:00
Joshua Tauberer
2292033318 update our YAML dumper to use tildes for nulls, so our one tilde doesn't get changed on output from a script 2013-03-19 13:28:21 -04:00
Eric Mill
c8bee96146 blacklisting a campaign account 2013-02-27 19:52:51 -06:00
Eric Mill
6170889712 fix bug for youtube sweeping, and add some better patterns 2013-02-15 17:46:26 -05:00
Eric Mill
a2207af108 Updated black and whitelists for youtube 2013-02-15 17:46:07 -05:00
Eric Mill
5e99a9a155 Caught an old url pattern 2013-02-15 13:27:17 -05:00
Eric Mill
c170311684 A couple more blacklists 2013-02-15 13:27:01 -05:00
Eric Mill
73ac7c38a6 Wasn't meant to be committed 2013-02-15 13:26:49 -05:00
Eric Mill
7b5ea31f66 A couple of campaign account things 2013-02-15 12:06:30 -05:00
Eric Mill
e1a1f7297f Switch field name from facebook_graph to facebook 2013-02-15 11:55:51 -05:00
Eric Mill
ddeff12220 Starting from scratch on Facebook accounts 2013-02-15 02:12:30 -05:00
Eric Mill
a6e346af39 Fixed up regexes and process for outputting facebook information 2013-02-14 18:59:49 -05:00
Eric Mill
30fbabe4e0 Lots more facebook blacklist entries 2013-02-14 18:58:59 -05:00
Eric Mill
46204f2c32 print out candidate URL in social media spreadsheet, add a pages regex for facebook, fix unicode error for outputting names 2013-02-14 17:54:33 -05:00
Eric Mill
8eb664a84d Updated committee memberships for the Senate, included a couple of committee name changes, a couple new subcommittees on the Judiciary Committee, and some temporary code in the memberships script to leave the House data as-is for the time being 2013-02-14 17:41:35 -05:00
Joshua Tauberer
4c2a87e836 thomas_ids.py: update regex for change in href values (now full URLs not relative paths) 2013-02-08 07:40:05 -05:00
Joshua Tauberer
a4a9166420 update senate contact info 2013-02-06 09:42:18 -05:00
Eric Mill
b22e973791 Use the SafeLoader to avoid crazy serialization security issues 2013-02-01 15:10:28 -05:00
Joshua Tauberer
99518273d7 use the faster libyaml parser if it is available 2013-01-29 13:34:46 -05:00
Eric Mill
57f36be968 Removing some unneeded code, and adding in some output - but the script still can't output correct data until the House and Senate are both up 2013-01-28 17:57:14 -05:00
Eric Mill
5cf0c787cc Made the blacklist more specific, it was cutting out an account 2013-01-28 16:30:38 -05:00
Eric Mill
9954cfd134 campaign account to blacklist 2013-01-28 14:33:19 -05:00
Eric Mill
a51a269466 runnable photo resize script 2013-01-25 19:23:48 -05:00
Eric Mill
3cbe861b1d Script to also get additional fields from the other Senate member XML source 2013-01-25 11:51:08 -05:00
Eric Mill
d32b869f73 Sweeping old out-of-office people from committee memberships, as a temporary measure until we get updated people 2013-01-24 12:56:21 -05:00
Derek Willis
5f160ba966 added utility to fetch cspan ids 2013-01-17 15:27:08 -05:00
Eric Mill
5e9fd57f93 Merge branch 'master' of github.com:unitedstates/congress-legislators 2013-01-16 17:27:39 -05:00
Eric Mill
5e0cf729b1 Fixed bug in regex application, and added @ check for the other twitter regex 2013-01-16 17:27:34 -05:00
Eric Mill
c43f9b8076 Removed JS embed whitelists 2013-01-16 17:27:15 -05:00
Eric Mill
d08080497b Small refactor to allow multiple regexes per service, and catch JS embeds of Twitter accounts 2013-01-16 17:19:33 -05:00
GovTrack.us
2de1484d8a ensure all strings that look like integers are quoted (only needed for zero-lead octal-integer-like strings with '8' or '9' in the value which would previously omit quotes) 2013-01-15 11:28:32 -05:00
Eric Mill
267fb39c5e one more blacklist item 2013-01-10 19:02:39 -05:00
Eric Mill
03898cfbe8 make sure whitelist is case insensitive 2013-01-10 18:42:56 -05:00
Eric Mill
ef16be19ab more white and black listing 2013-01-10 18:42:47 -05:00
Eric Mill
a696590c46 Used verify mode to find some more changed usernames 2013-01-10 18:18:38 -05:00
Eric Mill
644425cffb formatting 2013-01-10 18:18:26 -05:00
Eric Mill
197d9dca92 Bunch of updates replacing the errors I deleted 2013-01-10 18:14:04 -05:00
Eric Mill
95ebf52860 Adding a whitelist for known accounts that don't appear in source - but allow detection of new account name to still come through 2013-01-10 18:06:16 -05:00
Eric Mill
1571a1cea7 More good blacklisted patterns for twitter 2013-01-10 17:54:00 -05:00
Eric Mill
24194469cc Beginnings of a verify mode 2013-01-10 17:53:52 -05:00
Eric Mill
60d89aecc8 Did another batch of twitter updates from new freshmen House sites 2013-01-09 21:26:44 -05:00
Eric Mill
468ba7bf4e Also cut trailing slashes off senate websites, let's standardize 2013-01-09 21:15:41 -05:00
Eric Mill
a6d10ecb13 Added script to get house websites 2013-01-09 21:14:36 -05:00
Eric Mill
62cfcd38d0 blacklisted some campaign accounts and one to check back on later 2013-01-09 20:40:06 -05:00
Eric Mill
5b9465452a Fix regexes to allow https (this caught a bunch more Twitter accounts) 2013-01-09 20:39:56 -05:00
Eric Mill
a5beae4a49 Added --clean mode to remove out of date people 2013-01-09 20:28:15 -05:00
Eric Mill
7bc976d2e2 Update yaml depending on contents of (presumably manually reviewed and edited) CSV that had been generated by the --sweep step 2013-01-09 20:21:54 -05:00
Eric Mill
a32dd045a5 Beginnings of a script to generate social media leads 2013-01-09 20:07:06 -05:00
Eric Mill
4e1ab42e7b removed csv import, doc'd bioguide option 2013-01-09 19:48:05 -05:00
Eric Mill
13fdb5e006 fixed some missing and incorrect phone numbers by updating senate_contacts to update this field 2013-01-09 14:08:28 -05:00