Commit Graph

  • 8184bcb131 Add check for cdata to find summary information (#321) main Rohaansandhu 2025-10-05 04:46:32 -07:00
  • a2b043342c update vote download process to handle redirects for invalid vote numbers (#319) Tom Mount 2025-05-16 14:10:13 -04:00
  • 1a5504e799 Fix bill parsing when sponsor information is missing Joshua Tauberer 2025-04-03 10:12:30 +00:00
  • b57af9f06f Fix upcoming_house_floor exceptions when legis-num is empty Joshua Tauberer 2025-04-03 10:11:25 +00:00
  • adc21125f2 bills: fix exception in bill processing when 'text' missing in summary object (#315) Evan Shimizu 2024-11-14 03:40:41 -08:00
  • 2a75ab57e0 Don't die on failed parsing of vote "as amended" Joshua Tauberer 2024-09-01 18:58:50 +00:00
  • 60908eae88 Put back sponsorshipWithdrawnDate which in schema 3.0.0 is present just when it has a value, I guess Joshua Tauberer 2023-04-10 15:31:44 +00:00
  • c15b74138a Fix key errors when importing billstatus (#307) Paul Craciunoiu 2024-09-01 13:01:12 -06:00
  • a53ab5767c Merge pull request #312 from jerrywithaz/main Joshua Tauberer 2024-08-14 13:36:51 -04:00
  • b21c0966a7 Use not keyword Zerry Hogan 2024-08-13 22:17:40 -05:00
  • 4ca3af04e4 Fix key errors in bill_info jerrywithaz 2024-08-12 19:46:04 -05:00
  • 7d94122d4c Add Rescom.in sponsor_for in bill_info jerrywithaz 2024-08-12 19:36:09 -05:00
  • fda00d1d98 Added bill type filter when collecting govinfo bulk data (#303) Siddharth 2024-03-29 08:31:21 -07:00
  • 3267316a38 Fixing Dockerfile - Debian Jessie no longer available (#301) elia 2024-03-09 11:31:14 -05:00
  • c60be2981a Skip tarball archiving for /tests/ (#294) Michael 2024-02-26 14:20:54 -05:00
  • f8dfafcecf Don't try to fetch votes from future sessions Joshua Tauberer 2023-03-10 14:04:05 +00:00
  • 939bf01725 Fix amendment purpose, updateDate parsing which may have changed with schema 3.0.0 Joshua Tauberer 2023-02-05 21:27:20 +00:00
  • 67fb7d114e changed import to fully qualified imports (#295) Sanjeevan Yogeswaran 2023-02-02 17:39:51 -06:00
  • 2638c3b5a9 Fix relative imports (#292) Michael 2023-01-28 09:01:28 -05:00
  • ea5ba20fba Allow the votes scraper to accept a congress number without a session Joshua Tauberer 2023-01-07 23:34:33 +00:00
  • 5c37cfe0ff Update README Joshua Tauberer 2023-01-07 15:23:57 +00:00
  • f5b510a551 GPO BILLSTATUS XML schema 3.0.0 changes Joshua Tauberer 2022-12-21 22:05:47 +00:00
  • 4bedc84c6d Add a new bill status regex for 'Pursuant to .* the following bills passed under suspension of the rules" Joshua Tauberer 2022-05-14 12:08:03 +00:00
  • ef5ce90600 Add a diff option for bills to show file changes before writing updates to disk Joshua Tauberer 2022-05-14 11:47:03 +00:00
  • b8451e58a3 Fix bugs in the undocumented reparse_actions command and add a matching_action_regex option Joshua Tauberer 2022-05-14 11:36:17 +00:00
  • c8f4b5fdc1 "Placed on the ... Calendar" should not cause a bill to change state to REPORTED Joshua Tauberer 2022-05-14 11:31:31 +00:00
  • 4e936e45d0 Correct Virtual Env Suggestion (#288) Connor O'Leary 2022-06-03 09:47:26 -07:00
  • c2c156caeb Merge pull request #285 from CongressWiki/main Joshua Tauberer 2022-05-20 08:50:44 -04:00
  • 0ce452936d enabled redirects on request lib Ryan Parker 2022-05-18 20:06:07 -07:00
  • cb8491c3bc replace-http-with-https Ryan Parker 2022-05-18 19:48:38 -07:00
  • f1ea83e0e4 Add a new bill status regex for 'Pursuant to .* the following bills passed under suspension of the rules" jt Joshua Tauberer 2022-05-14 12:08:03 +00:00
  • e902fa8289 Add a diff option for bills to show file changes before writing updates to disk Joshua Tauberer 2022-05-14 11:47:03 +00:00
  • 2dc9c14d1a Fix bugs in the undocumented reparse_actions command and add a matching_action_regex option Joshua Tauberer 2022-05-14 11:36:17 +00:00
  • 6c89d46a7e "Placed on the ... Calendar" should not cause a bill to change state to REPORTED Joshua Tauberer 2022-05-14 11:31:31 +00:00
  • c10772e3f3 make congress into a python package (#267) Akash Patel 2022-02-27 20:13:50 -05:00
  • 659b293b8e Yaml safe-loader for Beanstalk contrib processing (#279) Michael 2021-12-05 17:04:53 -05:00
  • 28593b1e0c Skip parsing bills with no sponsor and title starts with 'Reserved', which is currently happening for H.R. 2, 4, 9-17, 20 Joshua Tauberer 2021-08-15 11:52:25 +00:00
  • 922843638b Fix committee meetings scraper broken by fec2b202 Joshua Tauberer 2021-05-22 14:33:18 +00:00
  • 6c89fab1ea Fix for change in main branch name in congress-legislators repository Joshua Tauberer 2021-05-22 14:31:46 +00:00
  • f831dc689a Updated documentation to include current installation steps (#273) trentmercer 2021-05-22 10:06:41 -04:00
  • d6db74279b Fix GovTrack-specific output to skip Letlow Joshua Tauberer 2021-02-28 13:34:54 +00:00
  • fe42946a7b Delegate cosponsors are suddenly coming in with different fullName values with 'Del.' and 'Resident Commissioner' titles Joshua Tauberer 2021-02-28 13:33:56 +00:00
  • 399aa2bd10 Merge pull request #265 from acxz/python-3 Joshua Tauberer 2020-10-31 09:55:40 -04:00
  • a9483f6124 File modes and line endings cleanup (#266) Akash Patel 2020-10-16 20:03:32 -04:00
  • eca61e8a47 python3 upgrade bug fixes. Addressed an integer division issue when computing current congress number in utils.py. Removed iso8601 version constraint so it works with python3. stevesdawg 2020-10-04 19:10:07 -04:00
  • e7b4434d8e use universal_newlines as compat for older python versions (3.6) acxz 2020-10-04 18:01:41 -04:00
  • f65e289575 relax constraints on python-dateutil acxz 2020-10-04 17:49:18 -04:00
  • 7118d88da9 fixed string vs. byte object in committee_meetings file when updating to python3 stevesdawg 2020-10-04 17:37:59 -04:00
  • 3d26afdddc fix parsing of xml files as unicode acxz 2020-10-04 17:22:40 -04:00
  • 7a0cb8654c properly write binary to output file acxz 2020-10-04 16:21:06 -04:00
  • 07bb5df2e8 Revert some automated 2to3 changes Joshua Tauberer 2020-10-04 15:36:43 -04:00
  • 2a00902170 perform regex on xml output properly acxz 2020-10-04 15:23:15 -04:00
  • 8d255b82c3 remove extra import from lxml acxz 2020-10-04 14:10:27 -04:00
  • 33fa9406f7 handling edge case when xml data is sometimes a string, and sometimes a unicode bytes object. govinfo task is working as expected. stevesdawg 2020-10-04 12:43:09 -04:00
  • ce9cd8c452 fix reading/opening pickle files as binary acxz 2020-10-04 10:24:11 -04:00
  • 7c991c8139 update Docker to use python3 acxz 2020-10-04 09:59:15 -04:00
  • cb7afc879c increase python version of travis acxz 2020-10-04 09:56:37 -04:00
  • 863610bd93 change instructions to recommend Python 3 acxz 2020-10-04 09:55:09 -04:00
  • fec2b2026c Run 2to3 over codebase acxz 2020-10-04 09:51:21 -04:00
  • f327c0af85 Changed print statements and exception handling to match python 3 syntax in all python files. Preserved python 2 compatibility. (#263) Shrivathsav Seshan 2020-06-24 12:24:25 -04:00
  • 1aacf15cc7 upddep scrapelib (#262) acxz 2020-06-24 09:11:45 -04:00
  • cd475aa0f4 remove sudo from commands (#200) Eric Mill 2020-06-21 15:20:19 -07:00
  • ae1915a54e Use yaml.BaseLoader when loading config (#260) James Anderson 2020-06-09 08:14:56 -04:00
  • 3c127ce801 Update sponsor_for regex to accept L(ibertarian) party for Rep. Amash Joshua Tauberer 2020-06-06 13:27:32 +00:00
  • 7e7f3aacb6 Add a --reparse_actions command to the bill task to re-run the action regexes after we discover a pattern that should be applied to existing data Joshua Tauberer 2020-04-27 01:51:08 +00:00
  • f992db8460 Store downloaded docs.house.gov bill text preprints in the data directory and when downloading a PDF, extract its text and any XML attachments Joshua Tauberer 2020-03-15 22:41:46 +00:00
  • fc0b62140a Merge pull request #255 from unitedstates/delete_corrupt_govinfo_packages Derek Willis 2020-04-02 07:55:20 -04:00
  • 8e7695702a Update beanstalkd contrib sample (#233) Michael 2020-02-17 01:06:08 +00:00
  • f3d8c5896e Delete corrupt GovInfo packages and try again on the next run Joshua Tauberer 2020-01-28 17:15:52 +00:00
  • 99541ce8db Incoming bill actions now have either <committee> or <committees> nodes (#246) Joshua Tauberer 2020-01-28 12:18:55 -05:00
  • 1892eb6013 upcoming_house_floor: handle extraneous parenthesis in bill numbers Joshua Tauberer 2019-07-20 22:19:12 +00:00
  • 027b3d6cbf suppress bs4 warning by choosing an XML parser explicitly Joshua Tauberer 2019-07-20 22:16:34 +00:00
  • 74cdd6e09c upcoming_house_floor: use --download option to download House documents (default off) Joshua Tauberer 2019-07-20 22:14:43 +00:00
  • da6b730587 govinfo: re-download if a file is somehow missing on disk Joshua Tauberer 2019-07-20 22:14:12 +00:00
  • 6c72c92324 log some errors instead of raising exceptions Joshua Tauberer 2019-07-20 22:13:55 +00:00
  • 599c7f4f20 fix motion to table status update on hres304-116 which was tabled in the REFERRED status which was not handled Joshua Tauberer 2019-05-03 10:36:45 +00:00
  • 60b4a96bad use https in upcoming_house_floor, fixes #231 Joshua Tauberer 2018-11-08 08:31:41 -05:00
  • 63fe1a2483 govinfo: when there's an error mirroring a particular package, log it and continue, rather than ending the whole process Joshua Tauberer 2018-10-14 07:18:29 -04:00
  • f1db27129b upcoming_house_floor: handle an alternative capitalization Joshua Tauberer 2018-09-23 08:08:25 -04:00
  • 4594951172 Merge pull request #229 from unitedstates/govinfogov Joshua Tauberer 2018-09-05 19:40:00 -04:00
  • ea4f859217 rename fdsys.py to govinfo.py and rename cache and data directories accordingly Joshua Tauberer 2018-08-28 11:09:11 -04:00
  • 0cb30802db fdsys: in order to reliably download files when all document formats may not be present in all packages, download and save the package ZIP file and then extract document formats from it Joshua Tauberer 2018-08-24 14:33:15 -04:00
  • 4c4c7521c6 switch fdsys scraper from fdsys to govinfo.gov Joshua Tauberer 2018-07-21 14:13:56 -04:00
  • 7d4e396602 fix parse of hr1625-115 which has a stray period in a key action Joshua Tauberer 2018-04-06 21:13:17 -04:00
  • 4c4067a68c parse 'Cloture...invoked' properly in hr1625-115 Joshua Tauberer 2018-03-24 08:25:48 -04:00
  • 50eebe4a24 regex improvement to upcoming_house_floor.py Joshua Tauberer 2018-03-22 16:46:04 -04:00
  • 6fcd26c1d8 committee meetings scraper bugs Joshua Tauberer 2018-02-12 08:42:41 -05:00
  • 8b613ead6d upcoming_house_floor: Re-do how ping-pong'd bill numbers are parsed to handle "Senate amendment to the House amendment to the Senate amendment to H.R. 1892" which crashed the scraper Joshua Tauberer 2018-02-09 19:27:06 -05:00
  • a67aca50db fdsys: speed up updates by storing package file lastmod strings in a single YAML file per sitemap rather than in a JSON file per package Joshua Tauberer 2018-02-07 09:22:32 -05:00
  • 4bb0fef82c vote regex fixes for hr195-115 Joshua Tauberer 2018-01-20 07:59:44 -05:00
  • 0b96429aea 83d18c468d broke 'On the Motion to Concur' Joshua Tauberer 2017-12-23 07:42:39 -05:00
  • f57322c05d Merge pull request #225 from camisatx/master Eric Mill 2017-12-20 15:54:38 -05:00
  • 4f9d4bdab4 parse 'Senate receded from its amendment and concurred with an amendment' as a PASS_BACK Joshua Tauberer 2017-12-20 12:15:07 -05:00
  • 83d18c468d normalize type and assign 'passage' category to 'On the Motion (Motion to Recede from the Senate Amendment to ____ and Concur with Further Amendment)' Joshua Tauberer 2017-12-20 08:02:59 -05:00
  • 3795ff60a6 Remove ipython from requirements b/c it's not used and isn't compatible w/py2.7 Josh Schertz 2017-12-19 22:35:46 -06:00
  • ca68414cd1 Delete deepbills code, this project was abandoned by Cato. (#223) Trey Moore 2017-12-09 14:01:54 +01:00
  • 329114d354 parse 'motion to table the measure Agreed to' as a failed vote on passage Joshua Tauberer 2017-12-08 07:30:12 -05:00
  • 3f34b0917d senate vote had an invalid S.Amdt. bill type Joshua Tauberer 2017-12-04 08:24:40 -05:00
  • 59df24ebd7 parse bill action links Joshua Tauberer 2017-12-04 08:01:59 -05:00
  • 751df327c4 Replace unnecessary call to has_key (#215) Hugo 2017-09-18 23:08:28 +03:00