Files
impala/shell/ext-py/bitarray-2.3.0/bitarray/copy_n.txt
wzhou-code 9e76a8f7c3 IMPALA-10784 (part 3): Prepare to publish impala-shell on PyPi
We are going to publish impala-shell release 4.1.0a1 on PyPi.
This patch upgrades following three python libraries which are used
for generating egg files when building impala-shell tarball.
  upgrade bitarray from 1.2.1 to 2.3.0
  upgrade prettytable from 0.7.1 to 0.7.2
  upgrade thrift_sasl from 0.4.2 to 0.4.3
Updates shell/packaging/requirements.txt for the versions of dependent
Python libraries.

Testing:
 - Ran core tests.
 - Built impala-shell package impala_shell-4.1.0a1.tar.gz, installed
   impala-shell package from local impala_shell-4.1.0a1.tar.gz, verified
   impala-shell was installed in ~/.local/lib/python2.7/site-packages.
   Verified the version of installed impala-shell and dependent Python
   libraries as expected.
 - Set IMPALA_SHELL_HOME as ~/.local/lib/python2.7/site-packages/
   impala_shell, copied over egg files under installed impala-shell
   python package so we can run the end-to-end unit tests against
   the impala-shell installed with the package downloaded from PyPi.
   Passed end-to-end impala-shell unit tests.
 - Verified the impala-shell tarball generated by
   shell/make_shell_tarball.sh.

Change-Id: I378404e2407396d4de3bb0eea4d49a9c5bb4e46a
Reviewed-on: http://gerrit.cloudera.org:8080/17826
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-28 04:55:57 +00:00

106 lines
3.9 KiB
Plaintext

copy_n() in _bitarray.c
=======================
The following variable names are used to in this document, as well as in
the source code:
self bitarray object bits are copies onto
a start bit (in self) bits are copied onto
other bitarray object bits are copied from
b start bit (in other) bits are copied from
n number of bits being copied
There are 3 cases handled by this function: (i) aligned case (ii) small n case
(iii) general case. For all cases, it is important to handle self == other.
We will briefly discuss the first two cases, and then go into more detail
about the general case.
Aligned case
------------
In the aligned case, i.e. when both start positions (a and b) are a
multiple of 8, we use memmove() on the first 8 * n bits, and call copy_n()
on itself to handle the few remaining bits. Note that the order of these
two operations (memmove() and copy_n()) matters when copying self to self.
Small n case
------------
For n smaller than a low limit (like 24 bits), we use a sequence of getbit()
and setbit() calls which is quite slow. We could choose the low limit to be
as low as 8. However, as the general case has some overhead, we don't see
a speedup over this case until we copy at least several bytes.
As other might be self, we need to either loop forward or backwards in order
to not copy from bits already changed.
General case
------------
We choose an aligned region to be copied using copy_n() itself, and use
shift_r8() and a few fixes to create the correct copy. The is done in the
following steps:
1.) Calculate the following byte positions:
p1: start (in self) memory is copied to, i.e. a / 8
p2: last (in self) memory is copied to, i.e. (a + n - 1) / 8
p3: first (in other) memory is copied from, i.e. b / 8
2.) Store temporary bytes corresponding to p1, p2, p3 in t1, t2, t3.
We need these bytes later to restore bits which got overwritten or
shifted away. Note that we have to store t3 (the byte with b in other),
as other can be self.
3.) Calculate the total right shift of bytes p1..p2 (once copied into self).
This shift depends on a % 8 and b % 8, and has to be a right shift
below 8.
4.) Using copy_n(), copy the byte region from other at p3 or p3 + 1 into
self at p1. Because in the latter case we miss the beginning bits in
other, we need t3 and copy those bits later.
5.) Right shift self in byte range p1..p2. This region includes the
bit-range(a, a + n), but is generally larger. This is why we need t1
and t2 to restore the bytes at p1 and p2 in self later.
6.) Restore bits in self at, see step 5:
- p1 using t1 (those got overwritten and shifted)
- p2 using t2 (those got shifted)
7.) Copy the first few bits from other to self (using t3, see step 4).
Here is an example with the following parameters (created using
examples/copy_n.py):
a = 21
b = 6
n = 31
p1 = 2
p2 = 6
p3 = 0
other
bitarray('00101110 11111001 01011101 11001011 10110000 01011110 011')
b..b+n ^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^
======== ======== ======== =====
33
self
bitarray('01011101 11100101 01110101 01011001 01110100 10001010 01111011')
a..a+n ^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^
11111
2222
copy_n
======== ======== ======== =====
bitarray('01011101 11100101 11111001 01011101 11001011 10110010 01111011')
rshift 7
>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>
bitarray('01011101 11100101 00000001 11110010 10111011 10010111 01100100')
a..a+n = ======== ======== ======== ====
11111
2222
33
bitarray('01011101 11100101 01110101 11110010 10111011 10010111 01101011')