"Sent to Archivist of the United States unsigned." indicates this. I had hard-coded bill numbers in the past, but it happened again with hr6297-114 and so now I'm doing a proper fix. I can't remove the old hard-coded bills because I can't test the change because we can no longer fetch the data from THOMAS.
There is no longer a separate amendments scraper. Amendments are saved as a part of importing bills. Amendments to treaties are no longer available.
Some of this work was done by @crdunwel.
Merge similar regex patterns. Also fixes parsing:
Submitted in the Senate, read twice, considered, and agreed to
which didn't quite match:
Submitted in the Senate, considered, and agreed to
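A minimal sketch of what a merged pattern could look like, assuming the optional "read twice" clause is the only difference between the two phrasings. The pattern and function name are illustrative, not the ones used in the parser:

```python
import re

# Hypothetical merged pattern: "read twice, " is optional, so both
# phrasings of the action text match a single expression.
AGREED_TO = re.compile(
    r"Submitted in the Senate, (?:read twice, )?considered, and agreed to",
    re.IGNORECASE,
)

def is_submitted_and_agreed(action_text):
    """Return True if the action text matches either phrasing."""
    return AGREED_TO.search(action_text) is not None
```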
When --diff is specified (for bills & votes), instead of writing output files to
disk, we run a diff over the existing file and the new content and display the
diff. This is handy for testing.
At the same time I'm removing my previous preserve_update_time flag, which
I had been using for a similar purpose; the new method is much easier.
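A rough sketch of the --diff idea using the standard library's difflib. The function name and its show_diff parameter are assumptions made for illustration, not the scraper's actual code:

```python
import difflib
import os

def diff_or_write(path, new_content, show_diff=False):
    """Write new_content to path, or return a unified diff when show_diff is set."""
    if not show_diff:
        with open(path, "w", encoding="utf-8") as f:
            f.write(new_content)
        return None
    # In diff mode the existing file is read but never modified.
    old_content = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            old_content = f.read()
    diff = difflib.unified_diff(
        old_content.splitlines(keepends=True),
        new_content.splitlines(keepends=True),
        fromfile=path + " (existing)",
        tofile=path + " (new)",
    )
    return "".join(diff)
```

An empty string back from diff mode means the new content is identical to what is already on disk.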
reverts 5122ad6f966ba5899a0758ed92d81ca779314c7f
Per #106, ENACTED:SIGNED should be triggered when the President signs a bill, which is
when the bill seems to actually become law. The "Became Public Law" action is typically
dated the same but may not actually be posted until much later, when the Office of the
Federal Register assigns the public law number. This can be problematic when trying to
count laws.
The same problem might occur with vetoed bills. In principle they must also become law
when the second chamber finishes its override, or thereabouts, but our ENACTED:VETO_OVERRIDE
status is still triggered by Became Public Law. This is rarer, so I'll punt on it
for now.
This left a gap for six bills (see the commit for the list) that had Became Public Law
actions but neither Signed by President nor veto actions. They appear to be instances
of the "ten Days (Sundays excepted)" provision in the Constitution. So here I'm also
adding a new status code called ENACTED:TENDAYRULE. Like the other conditions, surely
they become law on the actual 10th day regardless of whether OFR assigns them a number,
but that is harder to detect. There is a "Sent to Archivist of the United States unsigned"
action that we might want to use instead. But this is historical and incredibly rare
so it doesn't make much difference now.
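The resulting decision can be sketched roughly as follows. The helper name is hypothetical, and the exact veto action text ("Vetoed by President") is an assumption; only the status codes and the "Signed by President" and "Became Public Law" phrases come from the description above:

```python
def enacted_status(action_texts):
    """Classify how a bill became law from its action texts (hypothetical helper)."""
    signed = any(t.startswith("Signed by President") for t in action_texts)
    # Assumed wording for a veto action; the real parser may match differently.
    vetoed = any("Vetoed by President" in t for t in action_texts)
    became_law = any(t.startswith("Became Public Law") for t in action_texts)

    if signed:
        # The signature itself triggers the status, not the later law number.
        return "ENACTED:SIGNED"
    if became_law and vetoed:
        return "ENACTED:VETO_OVERRIDE"
    if became_law:
        # Neither signed nor vetoed: the ten-day provision applied.
        return "ENACTED:TENDAYRULE"
    return None
```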
Don't assume that the committee is in the bill's originating chamber. If it's a committee action,
look at the chamber of the committee the action is taking place in. Failing that, use some specific
regular expressions to see if it is a House or Senate action. And failing that, if the bill is
in an early stage when we are pretty sure actions are in the originating chamber use the originating
chamber.
This makes a number of corrections, but also some action lines lose their committee IDs. In some
cases (lots of references to the Budget Act) the original ID was incorrect. In other cases it's
ambiguous or hard to figure out.
also see #110
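The fallback order for picking an action's chamber can be sketched like this. The function, its parameters, and the two deliberately simple patterns are assumptions for illustration; the real expressions are more specific:

```python
import re

# Illustrative patterns only; the real parser's expressions are more specific.
HOUSE_RE = re.compile(r"\bHouse\b", re.IGNORECASE)
SENATE_RE = re.compile(r"\bSenate\b", re.IGNORECASE)

def action_chamber(text, committee_chamber, bill_chamber, early_stage):
    """Pick the chamber ('h' or 's') an action took place in, or None if unclear."""
    # 1. A committee action happens in the committee's own chamber.
    if committee_chamber is not None:
        return committee_chamber
    # 2. Otherwise accept the text only if it mentions exactly one chamber.
    in_house = HOUSE_RE.search(text) is not None
    in_senate = SENATE_RE.search(text) is not None
    if in_house != in_senate:
        return "h" if in_house else "s"
    # 3. Early in a bill's life, assume the originating chamber.
    if early_stage:
        return bill_chamber
    return None
```

Returning None rather than guessing mirrors the trade-off described above: some action lines lose their chamber when the text is ambiguous.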
Because of the capital 'T', the regular expression did not match,
and the committee was associated with the chamber of the bill rather than
the chamber of the committee indicated in the action line. The fix is
to match case-insensitively.
see #108
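To illustrate the failure mode, here is a small sketch with a made-up pattern and action line (neither is taken from the parser) showing how a single capital letter defeats a case-sensitive match:

```python
import re

# Hypothetical pattern written with a lowercase "to".
PATTERN = r"Referred to the (House|Senate) Committee on (.+?)\."

def committee_chamber(action_text, ignore_case=True):
    """Return 'house' or 'senate' from the action text, or None if no match."""
    flags = re.IGNORECASE if ignore_case else 0
    m = re.search(PATTERN, action_text, flags)
    return m.group(1).lower() if m else None
```

With ignore_case=False, a line containing "Referred To the Senate Committee on ..." returns None, which is the point at which a caller would fall back to the bill's chamber.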
* On the first vote in the second chamber, we were not handling 'as amended' for joint/concurrent resolutions, so we were prematurely marking these as PASSED when they should get a PASS_BACK status.
* On ping-pong votes, we were not handling 'as amended' at all, so we were prematurely marking these as PASSED when they should be PASS_BACK.
* Ping-pong votes and votes on conference reports were also not handling joint/concurrent resolutions, and on a successful ping-pong/conference report vote these were incorrectly given the status PASSED:BILL instead of PASSED:CONSTAMD/PASSED:CONCURRENTRES.
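The corrected outcome of a successful vote can be sketched as below. The function, the bill type codes, and the proposes_const_amendment flag are assumptions for illustration; the status strings are copied from the list above:

```python
def passed_status(bill_type, as_amended, proposes_const_amendment=False):
    """Status after a successful second-chamber, ping-pong, or conference vote."""
    # An amended text has to go back to the other chamber first.
    if as_amended:
        return "PASS_BACK"
    # Concurrent resolutions never go to the President.
    if bill_type in ("hconres", "sconres"):
        return "PASSED:CONCURRENTRES"
    # Joint resolutions proposing a constitutional amendment get their own status.
    if bill_type in ("hjres", "sjres") and proposes_const_amendment:
        return "PASSED:CONSTAMD"
    return "PASSED:BILL"
```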
On THOMAS and Congress.gov, the main title displayed for a bill is never a title for
a portion of the bill, and if there is no title for the whole bill in the last 'as'
group, they display a title (for the whole bill) from the previous 'as' group.
This affects four bills in the 113th Congress (so far). Three bills no longer
have a short_title (it's now null), and one bill's short title changed (HR 1911).
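A minimal sketch of that selection rule, assuming each title is a dict with 'as', 'title', and 'is_for_portion' keys listed in chronological order. The data shape and function name are assumptions, not necessarily what the scraper stores:

```python
def current_short_title(titles):
    """Return the latest short title that covers the whole bill, or None."""
    # Group titles by their 'as' stage, preserving first-seen order.
    groups = {}
    for t in titles:
        groups.setdefault(t["as"], []).append(t)
    # Walk the groups from latest to earliest and skip titles for a portion.
    for stage in reversed(list(groups)):
        for t in groups[stage]:
            if not t["is_for_portion"]:
                return t["title"]
    # Every title was for a portion of the bill: no short title at all.
    return None
```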