45 Commits

Author SHA1 Message Date
Joshua Tauberer
d7057cf408 Disable twitter_id existence check since it is currently failing 2025-02-23 12:35:22 -05:00
Kevin Schaul
68320cf593 Add pictorial ids, script for 118th Congress (#943)
Adds ids mapping to GPO's Pictorial Member Guide
pictorialapi.gpo.gov

This PR includes ids for all members of 118th Congress. It may also work
for historical files -- at least going back to 110 -- but I have not
included that work here, as it requires manual fixes and I imagine most
interest is in current membership.

Closes #942
2024-12-06 17:37:18 -05:00
Joshua Tauberer
e558d15d17 Update apportionment used by validation, fixes #891 (#892)
Also removed scripts/validator.py which was an obsolete test script.
2023-05-31 14:19:30 -04:00
Joshua Tauberer
a0638240ef Make 'bio' keys non-required because we don't always have birthdays available for new legislators 2023-01-03 14:45:29 -05:00
Joshua Tauberer
d1bf22b275 Fix validation to not crash if a legislator doesn't have a bioguide ID
This can happen when new legislators are added before a bioguide ID is published by the House, especially when staging election results before the next Congress has begun.
2022-12-25 08:37:34 -05:00
Joshua Tauberer
5a76bc5000 Fix party_affiliations missing start/end dates in last update, correct an invalid end date, and add validation 2022-12-09 08:23:00 -05:00
Joshua Tauberer
1641ce5e70 Expand party checks in validate.py from only current legislators to recent legislators 2022-12-09 08:13:01 -05:00
Joshua Tauberer
cf45493e6d Check that FEC IDs are candidate IDs using a regex in the validation test 2022-09-18 16:44:16 -04:00
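The regex check described above could look roughly like the sketch below. The pattern is an assumption, not the expression used in the repository: it accepts an office letter (H, S, or P) followed by eight alphanumeric characters, and so rejects committee IDs, which begin with C. The function name and the sample values are hypothetical.

```python
import re

# Assumed shape of an FEC candidate ID: office letter (H, S, or P)
# followed by eight alphanumeric characters. Committee IDs start
# with "C" and therefore do not match.
FEC_CANDIDATE_ID = re.compile(r"[HSP][0-9A-Z]{8}")


def is_fec_candidate_id(value):
    """Return True if value looks like an FEC candidate ID (hypothetical helper)."""
    return isinstance(value, str) and FEC_CANDIDATE_ID.fullmatch(value) is not None
```

A validator would call this for each FEC ID on a legislator and report an error for any value that fails the match.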
Joshua Tauberer
27ab666b2d Add new validation tests for the social media file (#823)
* Add some basic checks.
* If a TWITTER_API_BEARER_TOKEN environment variable is set, query the Twitter API to check that Twitter usernames and IDs match, and if usernames are in canonical case.
* Fix Rep. Coons's twitter_id to match the twitter handle. The existing ID corresponds to the Twitter handle SenCoonsOffice which also appears to be a correct account, but his website links to the ChrisCoons account.
* Update various Twitter handles to TitleCase if the account itself uses TitleCase
2022-03-21 23:37:36 -04:00
Joshua Tauberer
489e28480a Remove stray empty and invalid 'HSIG' committee membership info, move HLIG (the correct id) into sorted order, and add a test for invalid committee membership committee keys 2021-03-07 09:45:14 -05:00
Joshua Tauberer
bf67a1440c Revert "Disable bioguide and URL checks until this information becomes available for the 117th Congress"
This reverts commit 53fcdb62f8.
2021-01-03 15:48:04 -05:00
Joshua Tauberer
53fcdb62f8 Disable bioguide and URL checks until this information becomes available for the 117th Congress
And disable the time check so the PR passes before the 117th Congress starts.

This should all be reverted.
2021-01-01 10:20:07 -05:00
Joshua Tauberer
10b7a73b43 Improve some validator output messages 2021-01-01 10:20:07 -05:00
Joshua Tauberer
ee290b5dcc 117th Congress 2021-01-01 10:20:07 -05:00
Joshua Tauberer
571bc38a0f Allow 'how: special-election'
First used for Sen. Mark Kelly.
2020-12-03 06:01:41 -05:00
Joshua Tauberer
16894bc736 Amash is now a Libertarian according to Bioguide
Tests had to be updated to accept this as a party name.
2020-05-21 07:43:25 -04:00
Joshua Tauberer
8acc8e408c Fix overlapping term dates in historical data by consulting bioguide and enable the test for historical data 2020-05-17 08:20:10 -04:00
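An overlap test of the kind mentioned above might be sketched as follows. This is an illustration, not the code in the validator: it assumes each term is a mapping with ISO-formatted `start` and `end` date strings, and it treats a term that starts on the day the previous one ends as non-overlapping. The function name is hypothetical.

```python
def find_overlapping_terms(terms):
    """Return (previous, current) pairs of terms whose date ranges overlap.

    Assumes 'start' and 'end' are ISO YYYY-MM-DD strings, which sort
    and compare correctly as plain strings.
    """
    ordered = sorted(terms, key=lambda term: term["start"])
    overlaps = []
    for previous, current in zip(ordered, ordered[1:]):
        # A term starting on the previous term's end date is allowed here.
        if current["start"] < previous["end"]:
            overlaps.append((previous, current))
    return overlaps
```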
Joshua Tauberer
0ef4483df9 Add tests that term start/end dates don't span more congresses than they should and fix all the data errors by consulting bioguide 2020-05-16 23:47:14 -04:00
Joshua Tauberer
fb27bd50c6 Disable errors from offices checks since the data is not being actively maintained, so tests pass 2020-05-09 09:04:04 -04:00
Jeremy Douglass
8dcde55692 If Independent not in D/R caucus, warn (no error)
Fixes #689

Amash gave error ">caucus: ~ is invalid when party is Independent."
2019-09-04 09:37:00 -07:00
Joshua Tauberer
f4f9710659 remove the old religion data, per #657 and 9101feaae3 (#659)
This data came from my original import from GovTrack's legislator database, and the religion field probably came from my original import of data from the MIT Media Lab's Government Information Awareness project in 2003. The field was never maintained.
2019-02-06 18:45:45 -05:00
Joshua Tauberer
ced1a983c8 improve the error messages of test/validate.py 2018-12-16 17:53:55 -05:00
Joshua Tauberer
f11cb2c27d in tests/validate.py check that the 'bio' mapping has all of the fields present 2018-12-16 17:35:04 -05:00
Joshua Tauberer
9101feaae3 remove the 'religion' bio field from the README as we have not been collecting that data field for many years
Also remove it from the validator test.
2018-12-16 17:15:18 -05:00
Joshua Tauberer
2790dc898a adjust end dates to future special election dates and set end-type: special-election for senators currently serving per an appointment to fill a vacancy when a special election will take place prior to the end of the term 2018-09-05 19:28:07 -04:00
Joshua Tauberer
b219b6a422 add how: appointment to all senators appointed to fill a vacancy and update start dates that appear to be erroneous
Data from https://www.senate.gov/senators/AppointedSenators.htm. Where the appointment date was after our term start date, the term start date is updated. (In other cases I presume our term start date to correctly reflect the swearing-in date or the start of the session.) Where there was ambiguity on the Senate page about appointment date versus effective date, I used the effective date.

see #41
2018-09-05 19:08:59 -04:00
Timothy Caro-Bruce
2a9d3395c5 add option to suppress warnings 2017-10-26 15:51:28 -07:00
Timothy Caro-Bruce
7ce5a33156 add district office validation to main validate file 2017-10-18 17:18:06 -07:00
Joshua Tauberer
317ff80c31 use CircleCI to update gh-pages with latest downloadable files in YAML, CSV, JSON (#482)
* merged generate_json.py and alternate_bulk_formats.py and generate pretty JSON
* have them write to ../ rather than ../alternate_formats
* delete the old bulk data files since they'll be in gh-pages
* add CircleCI file to manage this
* add scripts/update_gh_pages.sh which updates the gh-pages branch with the latest bulk data files in multiple formats
* remove test/test_json_matches.py because it's no longer needed
* add links to downloadable files in README
2017-06-28 13:06:41 -04:00
Eric Mill
11abf45872 chmod the new test script 2017-03-18 20:01:32 -04:00
Joel Collins
45a0ab5559 (FORCE) new commit with test data removed, generate json on saving data 2017-03-08 23:11:43 -05:00
Joshua Tauberer
2da1feed01 add validation warning for missing website url for current terms 2017-01-30 20:54:16 -05:00
hugovk
43a95c881a pyflakes fixes 2017-01-22 16:12:30 +02:00
Joshua Tauberer
9ecd8de4fb add two other missing middle names and a test to ensure that a person with only a first initial also has a middle name, because GovTrack relies on having it 2017-01-14 16:26:31 -05:00
Joshua Tauberer
a58fa6dde6 allow overriding the validator test's 'now' date with NOW=YYYY-MM-DD so that we can test changes that are staged for later 2017-01-03 10:43:15 -05:00
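The `NOW=YYYY-MM-DD` override described above can be sketched like this. Only the variable name `NOW` and its date format come from the commit message; the function name and the `environ` parameter are assumptions added so the behavior can be exercised without touching the real environment.

```python
import os
from datetime import date, datetime


def get_now(environ=None):
    """Return the date the validator should treat as 'today' (hypothetical helper)."""
    if environ is None:
        environ = os.environ
    override = environ.get("NOW")
    if override:
        # Raises ValueError if the value is not in YYYY-MM-DD form.
        return datetime.strptime(override, "%Y-%m-%d").date()
    return date.today()
```

With such a helper, running the validator as `NOW=2017-01-03 python test/validate.py` would evaluate "current" terms as of that date rather than the real one.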
Joshua Tauberer
e2e20518e0 validate.py now reports vacancies as a warning, and a duplicate office check bug was fixed 2017-01-03 09:30:21 -05:00
Joshua Tauberer
80b8205659 add validation tests for executive.yaml 2016-12-17 16:28:46 -05:00
Joshua Tauberer
63a2e89e6b tweak some strings in test/validate.py for clarity 2016-12-17 16:03:41 -05:00
Joshua Tauberer
b4320e45f7 remove redundant test from validate.py for missing bioguide, govtrack ids 2016-11-15 05:54:08 -05:00
Joshua Tauberer
bdc8c2724c add duplicate id check for google_entity_ids now that all are unique 2016-11-13 11:05:54 -05:00
Joshua Tauberer
0c7eceb669 update/improve validation test for google_entity_ids
All id types are white-listed so it has to be added, but the ids aren't uniquely assigned, so that test is skipped for google_entity_ids for now, and the uniqueness test is improved to better report the duplicates.
2016-11-10 07:37:36 -05:00
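A uniqueness check that reports which records share an id, as the commit above describes, might be sketched as follows. It assumes each legislator record carries an `id` mapping keyed by id type; the function name and the sample id type used in the usage note are illustrative, and this is not the validator's actual code.

```python
from collections import defaultdict


def find_duplicate_ids(legislators, id_type):
    """Map each duplicated id value to the indexes of the records using it."""
    seen = defaultdict(list)
    for index, legislator in enumerate(legislators):
        value = legislator.get("id", {}).get(id_type)
        if value is not None:
            seen[value].append(index)
    # Keep only values that appear on more than one record.
    return {value: indexes for value, indexes in seen.items() if len(indexes) > 1}
```

Calling `find_duplicate_ids(records, "govtrack")` returns an empty dict when every value is unique, so a test can fail with the full list of offending records rather than stopping at the first duplicate.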
Joshua Tauberer
efa01e324e remove washington_post id from historical file, finishes #368 2016-11-10 07:23:18 -05:00
Joshua Tauberer
39ee3a1d45 add a pretty thorough validation script to test the two legislators files 2016-11-02 15:45:47 -04:00
Joshua Tauberer
d93f4212ea add a travis test to check that files have been linted 2016-11-02 11:28:22 -04:00
Eric Mill
a7ff75442e Travis CI integration, with a workout script that imports every script to check for syntax errors. also updates each script to make them import-able without executing. 2014-04-03 12:58:57 -04:00