Currently, Impala supports building and testing Docker
images on Ubuntu. This extends that same support to
Redhat-based distributions:
1. This splits out the Docker build's OS package
installation into a separate install_os_packages.sh
script. This script detects the OS and calls apt
or yum as appropriate. The script takes the argument
--install-debug-tools, which installs extra tools
like iproute2 and ping. This defaults to true for debug
images and false for release images.
2. This modifies daemon_entrypoint.sh to detect the
OS and set LD_LIBRARY_PATH appropriate to account
for different locations of Java.
3. This modifies docker/setup_build_context.py to
handle different locations of libkudu_client.so
and add extra sanity checks on various libraries
found via globs.
4. This modifies bin/jenkins/dockerized-*.sh test
infrastructure to be able to install docker on
either Ubuntu or Redhat. It also changes the exit
logic to collect the container logs.
Developers can override the base image for Redhat 7
and Redhat 8 builds via the IMPALA_REDHAT7_DOCKER_BASE
and IMPALA_REDHAT8_DOCKER_BASE environment variables.
These default to open source Redhat equivalents
(Centos 7.9 and Rocky 8.5 respectively), but they are
also known to work with Redhat UBI images.
Testing:
- Ran dockerised testing on Rocky 8.5 via the
rocky-8.5-dockerised-tests job.
- Ran GVO
- Ran a Docker build on Centos7 with UBI7 as the base image
Change-Id: Ibaff2560ef971ac2c2231a8e43921164ea1d2f4d
Reviewed-on: http://gerrit.cloudera.org:8080/19006
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
- If external_fe_port flag is >0, spins up a new HS2 compatible
service port
- Added enable_external_fe_support option to start-impala-cluster.py
- which when detected will start impala clusters with
external_fe_port on 21150-21152
- Modify impalad_coordinator Dockerfile to expose external frontend
port at 21150
- The intent of this commit is to separate external frontend
connections from normal hs2 connections
- This allows different security policy to be applied to
each type of connection. The external_fe_port should be considered
a privileged service and should only be exposed to an external
frontend that does user authentication and does authorization
checks on generated plans
Change-Id: I991b5b05e12e37d8739e18ed1086bbb0228acc40
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Reviewed-on: http://gerrit.cloudera.org:8080/17125
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Maven Changes:
Splits out all executor specific jar files into a separate pom file
under mvn-deps/executor-deps. The new pom file lists out all executor
specific jar files. fe/pom.xml has a dependency on
mvn-deps/executor-deps/pom.xml so that all executor specific jars are
still built as part of the fe/ build. mvn-deps/executor-deps/pom.xml
writes out a build-classpath.txt file that contains all dependencies in
the pom.xml file (similar to what is already done in fe/pom.xml).
Docker Build Changes:
setup_build_context.py was changed to leverage the aformentioned Maven
changes. The script still symlinks all dependencies into the lib/ folder,
but also creates an exec-lib/ and statestore-lib/ folder. The exec-lib/
folder contains all dependencies necessary to run Impala Executors, but
excludes any dependencies that are Coordinator specific. The
statestore-lib/ folder excludes all jar files entirely since it does not
run an embedded JVM.
The docker/CMakeLists.txt was modified to support the new library layout
created by setup_build_context.py. Prior to this patch only the build
for the Impala base image has access to the dependencies created by
setup_build_context.py. This patch changes the build logic so all images
have access to the dependencies. This does increase build time because
the built context has to be copied and sent to the Docker daemon for
each image build.
Docker Image Changes:
The copy command for the lib/ folder was removed from the impala_base
Dockerfile and a corresponding copy command was added to each daemon
Docker image. This allows each daemon image to only copy in the
dependencies it actually requires to run.
Other:
* Deleted the hive-3 profile since Impala 4.0 only supports hive-3 builds
* Moved shaded-deps into the mvn-deps folder
Overall, this decreases the size of the impalad_executor image by 120 MB,
and the statestored image by 700 MB.
impalad_coordinator and impalad_coordinator images are now 771 MB, and
impalad_executor images are 651MB.
Further improvements might be possible by decreasing the number of
transitive dependencies in mvn-deps/executor-deps/pom.xml. Moreover,
any new Coordinator specific jar files will not be included in the
Executor image.
Testing:
* Ran core tests
Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd
Reviewed-on: http://gerrit.cloudera.org:8080/16320
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Sahil Takiar <stakiar@cloudera.com>
This adds a flag --use_resolved_hostname, which replaces
--hostname with a resolved IP on startup. This is useful
for containerized environments where the hostname -> IP
mapping can be very dynamic.
This flag is used by default in the dockerized minicluster.
This also fixes a bug in the test code that incorrectly
identified command line flags. Specifically it only checked
the suffix, so it confused use_resolved_hostname and hostname.
Change-Id: I0d5cb9c68c60ce8dc838cde9dcf1c590017f5c9a
Reviewed-on: http://gerrit.cloudera.org:8080/16108
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Andrew Sherman <asherman@cloudera.com>
This patch enables catalog-v2 by default in all the tests.
Test fixes:
1. Modified test_observability which fails on catalog-v2 since
the profile emits different metadata load events. The test now looks for
the right events on the profile depending on whether catalogv2 is
enabled or not.
2. TableName.java constructor allows non-lowercased
table and database names. This causes problems at the local catalog
cache which expects the tablenames to be always in lowercase. More
details on this failure are available in IMPALA-8627. The patch makes
sure that the loadTable requests in local catalog do a explicit
conversion of tablename to lowercase in order to get around the issue.
3. Fixes the JdbcTest which checks for existence of table comment in the
getTables metadata jdbc call. In catalog-v2 since the columns are not
requested, LocalTable is not loaded and hence the test needs to be
modified to check if catalog-v2 is enabled.
4. Skips test_sanity which creates a Hive db and issues a invalidate
metadata to make it visible in catalog. Unfortunately, in catalog-v2
currently there is no way to see a newly created database when event
polling is disabled.
5. Similar to above (4) test_metadata_query_statements.py creates a hive
db and issues a invalidate metadata. The test runs QueryTest/describe-db
which is split into two one for checking the hive-db and other contains
rest of the queries of the original describe-db. The split makes it
possible to only execute the test partially when catalog-v2 is enabled
Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
Reviewed-on: http://gerrit.cloudera.org:8080/13933
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
* Build scripts are generalised to have different targets for release
and debug images.
* Added new targets for the debug images: docker_debug_images,
statestored_debug images. The release images still have the
same names.
* Separate build contexts are set up for the different base
images.
* The debug or release base image can be specified as the FROM
for the daemon images.
* start-impala-cluster.py picks the correct images for the build type
Future work:
We would like to generalise this to allow building from
non-ubuntu-16.04 base images. This probably requires another
layer of dockerfiles to specify a base image for impala_base
with the required packages installed.
Change-Id: I32d2e19cb671beacceebb2642aba01191bd7a244
Reviewed-on: http://gerrit.cloudera.org:8080/13905
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Adds a plain-text space-separated image list in
docker/docker-images.txt. This is generated based on the images built by
CMake, so is kept in sync with images added to or removed from the
CMake file.
Duplicated logic per image is removed - instead there is a helper
function that is called for each daemon image to be built.
Rips out the timestamp mechanism that was intended to avoid unnecessary
container rebuilds, but has turned out to be brittle. Instead the
containers are rebuilt each time the rule is invoked.
This moves some subdirectories so that the image tag matches the
subdirectory, to simplify the build scripts.
Change-Id: I4d8e215e9b07c6491faa4751969a30f0ed373fe3
Reviewed-on: http://gerrit.cloudera.org:8080/13899
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Lars Volker <lv@cloudera.com>