Commit Graph

24 Commits

Author SHA1 Message Date
Joe McDonnell
1142c7b58e IMPALA-10606: Simplify impala-python virtualenv bootstrapping
Bootstrapping the impala-python virtualenv requires multiple
rounds of pip installs with different sets of requirements.
This consolidates the requirements.txt, stage2-requirements.txt,
and compiled-requirements.txt into a single requirements.txt.
This will make it easier to upgrade python packages.

This also splits out setuptools into its own
setuptools-requirements.txt. Setuptools is used during the
pip install for several of the dependencies. Recent versions
of setuptools do not support Python 2, but some of the install
tools (like easy_install) don't know how to pick a version
of setuptools that works with Python 2. Splitting it out to its
own requirements file lets us pin the version.

To make review easier, this does not change any of the versions
of the dependencies. It also leaves the stage2-requirements.txt
and compiled-requirements.txt split out in separate sections
of requirements.txt. These will later be turned into a single
alphabetical list.

Testing:
 - Tested impala-python locally
 - Ran GVO

Change-Id: I8e920e5a257f1e0613065685078624a50d59bf2e
Reviewed-on: http://gerrit.cloudera.org:8080/17226
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2021-03-31 03:17:24 +00:00
Joe McDonnell
60f8f87b09 IMPALA-10274: Initialize impala-python as part of the CMake build
Initializing the impala-python virtualenv takes a couple minutes,
so it is useful to do that in parallel to the rest of the build.
This moves the impala-python initialization to its own step
in the CMake build. It stops using impala-python for commands
invoked from buildall.sh or the CMake build to avoid premature
or concurrent initializations of impala-python. Then, it adds
a dedicated step to initialize impala-python.

Testing:
 - Ran a core job and a couple builds
 - Rebuilt and verified that impala-python is not reinitialized
   if it is already initialized

Change-Id: Ieff51263c55bd234028fed7101c94b4a928590f0
Reviewed-on: http://gerrit.cloudera.org:8080/16607
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-04 17:03:57 +00:00
Tim Armstrong
6ec6aaae8e IMPALA-3695: Remove KUDU_IS_SUPPORTED
Testing:
Ran exhaustive tests.

Change-Id: I059d7a42798c38b570f25283663c284f2fcee517
Reviewed-on: http://gerrit.cloudera.org:8080/16085
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-18 01:11:18 +00:00
Joe McDonnell
13fbe510c0 IMPALA-9838: Switch to GCC 7.5.0
This upgrades GCC and libstdc++ to version 7.5.0. There
have been ABI changes since 4.9.2, so this means that
the native-toolchain produced with the new compiler is
not interoperable with one produced by the old compiler.
To allow that transition, IMPALA_TOOLCHAIN_PACKAGES_HOME
is now a subdirectory of IMPALA_TOOLCHAIN
(toolchain-packages-gcc${IMPALA_GCC_VERSION}) to distinguish
it from the old packages.

Some Python packages in the impala-python virtualenv are
compiled using the toolchain GCC and now use the new ABI.
This leads to two changes:
1. When constructing the LD_LIBRARY_PATH for impala-python,
we include the GCC libstdc++ libraries. Otherwise, certain
Python packages that use C++ fail on older OSes like Centos 7.
This fixes IMPALA-9804.
2. Since developers work on various branches, this changes
the virtualenv's directory location to a directory with
the GCC version in the name. This allows the virtualenv
built with GCC 7 to coexist with the current virtualenv
built with GCC 4.9.2. The location for the old virtualenv is
${IMPALA_HOME}/infra/python/env. The new location is
${IMPALA_HOME}/infra/python/env-gcc${IMPALA_GCC_VERSION}. This
required updating several impala-python scripts.

There are various odds-and-ends related to the transition:
1. Due to the small string optimization, the size of std::string
changed, which means that various data structures also changed
in size. This required updating some static asserts.
2. There is a bug in clang-tidy that reports a use-after-free
for some code using std::shared_ptr. Clang is not modeling
the shared_ptr correctly, so it is a false-positive. As a workaround,
this disables the clang-analyzer-cplusplus.NewDelete diagnostic.
3. Various small compilation fixes (includes, etc).

Performance testing:
 - Ran single-node performance tests on TPC-H for the following
   configurations:
    - TPC-H Parquet scale 30 with normal configurations
    - TPC-H Parquet scale 30 with codegen disabled
    - TPC-H Kudu scale 10
   None found any significant regressions. Full results are
   posted on the JIRA.
 - Ran single-node performance tests on targeted-perf scale 10.
   No significant regressions.
 - The size of binaries (impalad, etc) is slightly smaller with the new GCC:
   GCC 4.9.2 release impalad binary: 545664
   GCC 7.5.0 release impalad binary: 539900
 - Compilation in DEBUG mode is roughly 15-25% faster

Functional testing:
 - Ran core jobs, exhaustive release jobs, UBSAN

Change-Id: Ia0beb2b618ba669c9699f8dbc0c52d1203d004e4
Reviewed-on: http://gerrit.cloudera.org:8080/16045
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-15 23:42:12 +00:00
Joe McDonnell
56ee90c598 IMPALA-9760: Add IMPALA_TOOLCHAIN_PACKAGES_HOME to prepare for GCC7
The locations for native-toolchain packages in IMPALA_TOOLCHAIN
currently do not include the compiler version. This means that
the toolchain can't distinguish between native-toolchain packages
built with gcc 4.9.2 versus gcc 7.5.0. The collisions can cause
issues when switching back and forth between branches.

This introduces the IMPALA_TOOLCHAIN_PACKAGES_HOME environment
variable, which is a location inside IMPALA_TOOLCHAIN that would
hold native-toolchain packages. Currently, it is set to the same
as IMPALA_TOOLCHAIN, so there is no difference in behavior.
This lays the groundwork to add the compiler version to this
path when switching to GCC7.

Testing:
 - The only impediment to building with
   IMPALA_TOOLCHAIN_PACKAGES_HOME=$IMPALA_TOOLCHAIN/test is
   Impala-lzo. With a custom Impala-lzo, compilation succeeds.
   Either Impala-lzo will be fixed or it will be removed.
 - Core tests

Change-Id: I1ff641e503b2161baf415355452f86b6c8bfb15b
Reviewed-on: http://gerrit.cloudera.org:8080/15991
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-05-30 16:25:37 +00:00
Laszlo Gaal
b921d982b5 IMPALA-9668: Obey SKIP_TOOLCHAIN_BOOTSTRAP during virtualenv bootstrap
IMPALA-9626 broke the use case where the toolchain binaries are not
downloaded from the native-toolchain S3 bucket, because
SKIP_TOOLCHAIN_BOOTSTRAP is set to true.

Fix this use case by checking SKIP_TOOLCHAIN_BOOTSTRAP in
bin/bootstrap_environment.py:
- if true: just check if the specified version of the Python binary is
  present at the expected toolchain location. If it is there, use it,
  otherwise throw an exception and abort the bootstrap process.
- in any other case: proceed to download the Python binary as in
  bootstrap_toolchain.py.

Test:
- simulate the custom toolchain setup by downloading the toolchain
  binaries from the S3 bucket, copying them to a separate directory,
  symlinking them into Impala/toolchain, then executing buildall.sh
  with SKIP_BOOTSTRAP_TOOLCHAIN set to "true".

Change-Id: Ic51b3c327b3cebc08edff90de931d07e35e0c319
Reviewed-on: http://gerrit.cloudera.org:8080/15759
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-22 21:56:01 +00:00
Laszlo Gaal
c97191b6a5 IMPALA-9626: Use Python from the toolchain for Impala
Historically Impala used the Python2 version that was available on
the hosting platform, as long as that version was at least v2.6.
This caused constant headache as all Python syntax had to be kept
compatible with Python 2.6 (for Centos 6). It also caused a recent problem
on Centos 8: here the system Python version was compiled with the
system's GCC version (v8.3), which was much more recent than the Impala
standard compiler version (GCC 4.9.2). When the Impala virtualenv was
built, the system Python version supplied C compiler switches for models
containing native code that were unknown for the Impala version of GCC,
thus breaking virtualenv installation.

This patch changes the Impala virtualenv to always use the Python2
version from the toolchain, which is built with the toolchain compiler.

This ensures that
- Impala always has a known Python 2.7 version for all its scripts,
- virtualenv modules based on native code will always be installable, as
  the Python environment and the modules are built with the same compiler
  version.

Additional changes:
- Add an auto-use fixture to conftest.py to check that the tests are
  being run with Python 2.7.x
- Make bootstrap_toolchain.py independent from the Impala virtualenv:
  remove the dependency on the "sh" library

Tests:
- Passed core-mode tests on CentOS 7.4
- Passed core-mode tests in Docker-based mode for centos:7
  and ubuntu:16.04

Most content in this patch was developed but not published earlier
by Tim Armstrong.

Change-Id: Ic7b40cef89cfb3b467b61b2d54a94e708642882b
Reviewed-on: http://gerrit.cloudera.org:8080/15624
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-16 01:08:00 +00:00
Akshesh
e4388740f7 Make infra/python compatible with both Python 2 & 3
Change-Id: If4285a021bb581f88425daa52ef8a3f844017d82
Reviewed-on: http://gerrit.cloudera.org:8080/13070
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Akshesh Doshi <aksheshdoshi@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-25 20:43:45 +00:00
Thomas Tauber-Marshall
85f3bb0178 IMPALA-7499: build against CDH Kudu
This patch transitions from pulling in Kudu (libkudu_client.so and the
minicluster tarballs) from the toolchain to instead pull Kudu in with
the other CDH components.

For OSes where the CDH binaries are not provided but the toolchain
binaries are (only Ubuntu 14), we set USE_CDH_KUDU to false to
continue to download the toolchain binaries. We also continue
to use the toolchain binaries to build the client stub for OSes
where KUDU_IS_SUPPORTED is false.

This patch also fixes an issue in bootstrap_toolchain.py where we were
using the wrong g++ to compile the Kudu stub.

Testing:
- Verified building and running Impala works as expected for supported
  combinations of KUDU_IS_SUPPORTED/USE_CDH_KUDU

Change-Id: If6e1048438b6d09a1b38c58371d6212bb6dcc06c
Reviewed-on: http://gerrit.cloudera.org:8080/11363
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-11 01:01:06 +00:00
David Knupp
6e5ec22b12 IMPALA-7399: Emit a junit xml report when trapping errors
This patch will cause a junitxml file to be emitted in the case of
errors in build scripts. Instead of simply echoing a message to the
console, we set up a trap function that also writes out to a
junit xml report that can be consumed by jenkins.impala.io.

Main things to pay attention to:

- New file that gets sourced by all bash scripts when trapping
  within bash scripts:

  https://gerrit.cloudera.org/c/11257/1/bin/report_build_error.sh

- Installation of the python lib into impala-python venv for use
  from within python files:

  https://gerrit.cloudera.org/c/11257/1/bin/impala-python-common.sh

- Change to the generate_junitxml.py file itself, for ease of
  https://gerrit.cloudera.org/c/11257/1/lib/python/impala_py_lib/jenkins/generate_junitxml.py

Most of the other changes are to source the new report_build_error.sh
script to set up the trap function.

Change-Id: Idd62045bb43357abc2b89a78afff499149d3c3fc
Reviewed-on: http://gerrit.cloudera.org:8080/11257
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-23 18:33:58 +00:00
Tim Armstrong
bc3e4fb376 IMPALA-6752: fix pip --no-binary usage
The key facts here are:
* --no-cache-dir is crucial because it prevents us pulling in a
  cached package compiled with the wrong compiler.
* --no-binary takes a argument specifying the set of packages it
  should apply to.

The latent bug was that we didn't provide an argument to --no-binary
and it instead it took --no-index as the argument, which was a no-op
because there are no packages of that name. IMPALA-6731 moved the
arguments, and instead --no-cache-dir became the argument to
--no-binary

Testing:
I could reliably reproduce the failure in my environment by deleting
infra/python/env then running a test with impala-py.test. This patch
is sufficient to solve it.

Change-Id: I118738347ca537b2dddfa6142c3eb5608c49c2e0
Reviewed-on: http://gerrit.cloudera.org:8080/9829
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2018-03-28 04:14:40 +00:00
Lars Volker
37565e3812 IMPALA-6731: Use private index in bootstrap_virtualenv
This change switches to using a private pypi index url when using a
private pypi mirror. This allows to run the tests without relying on the
public Python pypi mirrors.

Some packages can not detect their dependencies correctly when they get
installed together with the dependencies in the same call to pip. This
change adds a second stage of package installation to separate these
packages from their dependencies.

It also adds a few missing packages and updates some packages to newer
versions.

Testing: Ran this on a box where I blocked DNS resolution to Python's
upstream pypi.

Change-Id: I85f75f1f1a305f3043e0910ab88a880eeb30f00b
Reviewed-on: http://gerrit.cloudera.org:8080/9798
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Lars Volker <lv@cloudera.com>
2018-03-26 22:45:03 +00:00
Sailesh Mukil
4e5497995b IMPALA-5375: Builds on CentOS 6.4 failing with broken python dependencies
Builds on CentOS 6.4 fail due to dependencies not met for the new
'cryptography' python package.

The ADLS commit states that the new packages are only required for ADLS
and that ADLS on a dev environment is only supported from CentOS 6.7.

This patch moves the compiled requirements for ADLS from
compiled-requirements.txt to adls-requirements.txt and passing a
compiler to the Pip environment while installing the ADLS
requirements.

Testing: Tested it on a machine that with TARGET_FILESYSTEM='adls'
and also tested it on a CentOS 6.4 machine with the default
configuration.

Change-Id: I7d456a861a85edfcad55236aa8b0dbac2ff6fc78
Reviewed-on: http://gerrit.cloudera.org:8080/6998
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-05-26 07:52:40 +00:00
Sailesh Mukil
50bd015f2d IMPALA-5333: Add support for Impala to work with ADLS
This patch leverages the AdlFileSystem in Hadoop to allow
Impala to talk to the Azure Data Lake Store. This patch has
functional changes as well as adds test infrastructure for
testing Impala over ADLS.

We do not support ACLs on ADLS since the Hadoop ADLS
connector does not integrate ADLS ACLs with Hadoop users/groups.

For testing, we use the azure-data-lake-store-python client
from Microsoft. This client seems to have some consistency
issues. For example, a drop table through Impala will delete
the files in ADLS, however, listing that directory through
the python client immediately after the drop, will still show
the files. This behavior is unexpected since ADLS claims to be
strongly consistent. Some tests have been skipped due to this
limitation with the tag SkipIfADLS.slow_client. Tracked by
IMPALA-5335.

The azure-data-lake-store-python client also only works on CentOS 6.6
and over, so the python dependencies for Azure will not be downloaded
when the TARGET_FILESYSTEM is not "adls". While running ADLS tests,
the expectation will be that it runs on a machine that is at least
running CentOS 6.6.
Note: This is only a test limitation, not a functional one. Clusters
with older OSes like CentOS 6.4 will still work with ADLS.

Added another dependency to bootstrap_build.sh for the ADLS Python
client.

Testing: Ran core tests with and without TARGET_FILESYSTEM as
'adls' to make sure that all tests pass and that nothing breaks.

Change-Id: Ic56b9988b32a330443f24c44f9cb2c80842f7542
Reviewed-on: http://gerrit.cloudera.org:8080/6910
Tested-by: Impala Public Jenkins
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
2017-05-25 19:35:24 +00:00
Tim Armstrong
c8e15e484c IMPALA-4593,IMPALA-4635: fix some python build issues
Build C/C++ packages with toolchain GCC to avoid ABI compatibility
issues. This requires a multi-step bootstrapping process:
1. install basic non-C/C++ packages into the virtualenv
2. use Python 2.7 from the virtualenv to bootstrap the toolchain
3. use toolchain gcc to build C/C++ packages
4. build the kudu-python package with toolchain gcc and Cython

To avoid potentially pulling in cached versions of packages
built with a different compiler, this patch also disables pip's
caching. This should not have a significant effect on performance
since we've enabled ccache and cache downloaded packages in
infra/python/deps.

Improve bootstrapping time significantly by using ccache and by
parallelising the numpy build - the most expensive part of the
install process. On a system with a warmed-up ccache,
bootstrapping after deleting infra/python/env takes 1m16s. Previously
it could take over 5m.

Testing:
Tested manually on Ubuntu 16.04 to confirm that it fixes the ABI
problem mentioned in IMPALA-4593. Initially "import kudu" failed
in my dev environment. After deleting infra/python/env and
re-bootstrapping, "import kudu" succeeded.

Also ran the standard test suite on CentOS 6 and built Impala on
a range of platforms (CentOS 5,6,7; SLES 11,12; Debian 6,7;
Ubuntu12.04,14.04,16.04) to make sure nothing broke.

Change-Id: I9e807510eddeb354069e0478363f649a1c1b75cf
Reviewed-on: http://gerrit.cloudera.org:8080/6218
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-03-07 02:56:18 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Michael Ho
a07fc367ee Revert "IMPALA-1619: Support 64-bit allocations."
This reverts commit 1ffb2bd5a2a2faaa759ebdbaf49bf00aa8f86b5e.

Unbreak the packaging builds for now.

Change-Id: Id079acb83d35b51ba4dfe1c8042e1c5ec891d807
Reviewed-on: http://gerrit.cloudera.org:8080/3543
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Michael Ho <kwho@cloudera.com>
2016-07-05 13:37:26 -07:00
Michael Ho
5f3dfdf6c7 IMPALA-1619: Support 64-bit allocations.
This change extends MemPool, FreePool and StringBuffer to support
64-bit allocations, fixes a bug in decompressor and extends various
places in the code to support 64-bit allocation sizes. With this
change, the text scanner can now decompress compressed files larger
than 1GB.

Note that the UDF interfaces FunctionContext::Allocate() and
FunctionContext::Reallocate() still use 32-bit for the input
argument to avoid breaking compatibility.

In addition, the byte size of a tuple is still assumed to be
within 32-bit. If it needs to be upgraded to 64-bit, it will be
done in a separate change.

Change-Id: I7ed28083d809a86d801a9c063a0aa32c50d32b20
Reviewed-on: http://gerrit.cloudera.org:8080/2781
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-07-05 13:37:25 -07:00
Jim Apple
140220323d IMPALA-3767: bootstrap_virtualenv fails to find cython distribution
This patch avoids trying to download a Cython binary 0.23.4, which pip
has trouble finding.

Change-Id: Ic6733ccb71bcf99196075faa2fb6cf2a1d6276ce
Reviewed-on: http://gerrit.cloudera.org:8080/3427
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-06-21 19:38:19 -07:00
Michael Brown
5112e65be2 Revert "Revert "Add Kudu test helpers""
This reverts commit f8dd5413b65d30646c3745dfc738ed812d50a51f and
effectively re-adds commit 9248dcb70478b8f93f022893776a0960f45fdc28. The
difference between this patch and its original is that I fixed the
changes introduced in infra/python/bootstrap_virtualenv.py to be
python2.4-compatible:

- removed the use of str.format(), preferring a str.join() pattern
- removed the call of the exit() builtin to prefer sys.exit()

The only testing I did for this patch was to ensure
CDH Impala-packaging-on-demand works.

Change-Id: I02ed97473868eacf45b25abe89b41e6fa2fce325
Reviewed-on: http://gerrit.cloudera.org:8080/3160
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-05-24 16:40:59 -07:00
Shiraz Ali
08eff2bc09 Revert "Add Kudu test helpers"
This reverts commit 9248dcb70478b8f93f022893776a0960f45fdc28.
2016-05-20 08:46:00 -07:00
casey
36b524f68c Add Kudu test helpers
Changes:

1) Add the python Kudu module to the virtualenv. Building the virtualenv
is much slower now because Cython and numpy are required. To help with
the rebuild time --no-cache was removed. That option was added to help
when using the dev version of impyla, the version number would be the
same but the module contents were different and the cache used the old
module contents.

2) Add some py.test fixtures to help create Kudu and Impala connections.

Change-Id: I8e5e22b38d5bd09a36238e66a69aa42d1a941de7
Reviewed-on: http://gerrit.cloudera.org:8080/2855
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-05-19 19:45:48 -07:00
Casey Ching
2f62d30ccb Python: Update impyla for Hive changes
Impyla now has better support for Hive with commit c83a11. This is
useful for testing.

Change-Id: I2814cbff6f1fcfce51a0271b68daababca33dc65
Reviewed-on: http://gerrit.cloudera.org:8080/744
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-09-09 18:53:50 +00:00
Casey Ching
ca5856b8f8 Python: Bootstrap a virtualenv and add impala-python command
This adds a bootstrap script and a "impala-python" command to
$IMPALA_HOME/bin that automatically runs the bootstrap and redirects to
the virtualenv python. Existing python scripts will later be updated to
use the this new "impala-python" command.

The bootstrap script will build a virtualenv to ensure a minimum python
version (2.6) and a well known set of dependencies. The bootstrap script
can be run with python 2.4 but 2.6 must already be installed on the
system. The resulting virtualenv will use 2.6 at a minimum.

Only dependencies explicitly listed in requirements.txt will be
installed and available (no system packages will ever be used). No
packages will ever be downloaded when setting up the virtualenv. In the
future new dependencies can be added by editing the requirements.txt
file. Installation through requirements.txt is a standard pip feature.
When requirements.txt is updated, the next run of "impala-python"  will
rebuild the virtualenv.

Change-Id: I150595d7e09a45d5f2e3c30a845bc8d6a761eeed
Reviewed-on: http://gerrit.cloudera.org:8080/424
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-08-01 01:30:12 +00:00