mirror of
https://github.com/apache/impala.git
synced 2025-12-22 11:28:09 -05:00
We are going to publish impala-shell release 4.1.0a1 on PyPi. This patch upgrades following three python libraries which are used for generating egg files when building impala-shell tarball. upgrade bitarray from 1.2.1 to 2.3.0 upgrade prettytable from 0.7.1 to 0.7.2 upgrade thrift_sasl from 0.4.2 to 0.4.3 Updates shell/packaging/requirements.txt for the versions of dependent Python libraries. Testing: - Ran core tests. - Built impala-shell package impala_shell-4.1.0a1.tar.gz, installed impala-shell package from local impala_shell-4.1.0a1.tar.gz, verified impala-shell was installed in ~/.local/lib/python2.7/site-packages. Verified the version of installed impala-shell and dependent Python libraries as expected. - Set IMPALA_SHELL_HOME as ~/.local/lib/python2.7/site-packages/ impala_shell, copied over egg files under installed impala-shell python package so we can run the end-to-end unit tests against the impala-shell installed with the package downloaded from PyPi. Passed end-to-end impala-shell unit tests. - Verified the impala-shell tarball generated by shell/make_shell_tarball.sh. Change-Id: I378404e2407396d4de3bb0eea4d49a9c5bb4e46a Reviewed-on: http://gerrit.cloudera.org:8080/17826 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
106 lines
3.9 KiB
Plaintext
106 lines
3.9 KiB
Plaintext
copy_n() in _bitarray.c
|
|
=======================
|
|
|
|
The following variable names are used to in this document, as well as in
|
|
the source code:
|
|
|
|
self bitarray object bits are copies onto
|
|
a start bit (in self) bits are copied onto
|
|
other bitarray object bits are copied from
|
|
b start bit (in other) bits are copied from
|
|
n number of bits being copied
|
|
|
|
There are 3 cases handled by this function: (i) aligned case (ii) small n case
|
|
(iii) general case. For all cases, it is important to handle self == other.
|
|
We will briefly discuss the first two cases, and then go into more detail
|
|
about the general case.
|
|
|
|
|
|
Aligned case
|
|
------------
|
|
|
|
In the aligned case, i.e. when both start positions (a and b) are a
|
|
multiple of 8, we use memmove() on the first 8 * n bits, and call copy_n()
|
|
on itself to handle the few remaining bits. Note that the order of these
|
|
two operations (memmove() and copy_n()) matters when copying self to self.
|
|
|
|
|
|
Small n case
|
|
------------
|
|
|
|
For n smaller than a low limit (like 24 bits), we use a sequence of getbit()
|
|
and setbit() calls which is quite slow. We could choose the low limit to be
|
|
as low as 8. However, as the general case has some overhead, we don't see
|
|
a speedup over this case until we copy at least several bytes.
|
|
As other might be self, we need to either loop forward or backwards in order
|
|
to not copy from bits already changed.
|
|
|
|
|
|
General case
|
|
------------
|
|
|
|
We choose an aligned region to be copied using copy_n() itself, and use
|
|
shift_r8() and a few fixes to create the correct copy. The is done in the
|
|
following steps:
|
|
|
|
1.) Calculate the following byte positions:
|
|
p1: start (in self) memory is copied to, i.e. a / 8
|
|
p2: last (in self) memory is copied to, i.e. (a + n - 1) / 8
|
|
p3: first (in other) memory is copied from, i.e. b / 8
|
|
|
|
2.) Store temporary bytes corresponding to p1, p2, p3 in t1, t2, t3.
|
|
We need these bytes later to restore bits which got overwritten or
|
|
shifted away. Note that we have to store t3 (the byte with b in other),
|
|
as other can be self.
|
|
|
|
3.) Calculate the total right shift of bytes p1..p2 (once copied into self).
|
|
This shift depends on a % 8 and b % 8, and has to be a right shift
|
|
below 8.
|
|
|
|
4.) Using copy_n(), copy the byte region from other at p3 or p3 + 1 into
|
|
self at p1. Because in the latter case we miss the beginning bits in
|
|
other, we need t3 and copy those bits later.
|
|
|
|
5.) Right shift self in byte range p1..p2. This region includes the
|
|
bit-range(a, a + n), but is generally larger. This is why we need t1
|
|
and t2 to restore the bytes at p1 and p2 in self later.
|
|
|
|
6.) Restore bits in self at, see step 5:
|
|
- p1 using t1 (those got overwritten and shifted)
|
|
- p2 using t2 (those got shifted)
|
|
|
|
7.) Copy the first few bits from other to self (using t3, see step 4).
|
|
|
|
|
|
Here is an example with the following parameters (created using
|
|
examples/copy_n.py):
|
|
|
|
a = 21
|
|
b = 6
|
|
n = 31
|
|
p1 = 2
|
|
p2 = 6
|
|
p3 = 0
|
|
|
|
other
|
|
bitarray('00101110 11111001 01011101 11001011 10110000 01011110 011')
|
|
b..b+n ^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^
|
|
======== ======== ======== =====
|
|
33
|
|
self
|
|
bitarray('01011101 11100101 01110101 01011001 01110100 10001010 01111011')
|
|
a..a+n ^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^
|
|
11111
|
|
2222
|
|
copy_n
|
|
======== ======== ======== =====
|
|
bitarray('01011101 11100101 11111001 01011101 11001011 10110010 01111011')
|
|
rshift 7
|
|
>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>
|
|
bitarray('01011101 11100101 00000001 11110010 10111011 10010111 01100100')
|
|
a..a+n = ======== ======== ======== ====
|
|
11111
|
|
2222
|
|
33
|
|
bitarray('01011101 11100101 01110101 11110010 10111011 10010111 01101011')
|