impala

mirror of https://github.com/apache/impala.git synced 2025-12-29 09:04:47 -05:00

Author	SHA1	Message	Date
Joe McDonnell	3962ae1972	IMPALA-8770: Support building Docker images on Redhat-based distributions Currently, Impala supports building and testing Docker images on Ubuntu. This extends that same support to Redhat-based distributions: 1. This splits out the Docker build's OS package installation into a separate install_os_packages.sh script. This script detects the OS and calls apt or yum as appropriate. The script takes the argument --install-debug-tools, which installs extra tools like iproute2 and ping. This defaults to true for debug images and false for release images. 2. This modifies daemon_entrypoint.sh to detect the OS and set LD_LIBRARY_PATH appropriately to account for different locations of Java. 3. This modifies docker/setup_build_context.py to handle different locations of libkudu_client.so and add extra sanity checks on various libraries found via globs. 4. This modifies bin/jenkins/dockerized-*.sh test infrastructure to be able to install docker on either Ubuntu or Redhat. It also changes the exit logic to collect the container logs. Developers can override the base image for Redhat 7 and Redhat 8 builds via the IMPALA_REDHAT7_DOCKER_BASE and IMPALA_REDHAT8_DOCKER_BASE environment variables. These default to open source Redhat equivalents (Centos 7.9 and Rocky 8.5 respectively), but they are also known to work with Redhat UBI images. Testing: - Ran dockerised testing on Rocky 8.5 via the rocky-8.5-dockerised-tests job. - Ran GVO - Ran a Docker build on Centos7 with UBI7 as the base image Change-Id: Ibaff2560ef971ac2c2231a8e43921164ea1d2f4d Reviewed-on: http://gerrit.cloudera.org:8080/19006 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2022-10-11 20:30:50 +00:00
John Sherman	ca17e307ab	IMPALA-10550: Add External Frontend service port - If external_fe_port flag is >0, spins up a new HS2 compatible service port - Added enable_external_fe_support option to start-impala-cluster.py - which when detected will start impala clusters with external_fe_port on 21150-21152 - Modify impalad_coordinator Dockerfile to expose external frontend port at 21150 - The intent of this commit is to separate external frontend connections from normal hs2 connections - This allows different security policy to be applied to each type of connection. The external_fe_port should be considered a privileged service and should only be exposed to an external frontend that does user authentication and does authorization checks on generated plans Change-Id: I991b5b05e12e37d8739e18ed1086bbb0228acc40 Reviewed-by: Aman Sinha <amsinha@cloudera.com> Reviewed-on: http://gerrit.cloudera.org:8080/17125 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-03-03 22:46:05 +00:00
Sahil Takiar	a2d5471cd5	IMPALA-10016: Split jars for Impala exec and coord Docker images Maven Changes: Splits out all executor specific jar files into a separate pom file under mvn-deps/executor-deps. The new pom file lists out all executor specific jar files. fe/pom.xml has a dependency on mvn-deps/executor-deps/pom.xml so that all executor specific jars are still built as part of the fe/ build. mvn-deps/executor-deps/pom.xml writes out a build-classpath.txt file that contains all dependencies in the pom.xml file (similar to what is already done in fe/pom.xml). Docker Build Changes: setup_build_context.py was changed to leverage the aforementioned Maven changes. The script still symlinks all dependencies into the lib/ folder, but also creates an exec-lib/ and statestore-lib/ folder. The exec-lib/ folder contains all dependencies necessary to run Impala Executors, but excludes any dependencies that are Coordinator specific. The statestore-lib/ folder excludes all jar files entirely since it does not run an embedded JVM. The docker/CMakeLists.txt was modified to support the new library layout created by setup_build_context.py. Prior to this patch only the build for the Impala base image has access to the dependencies created by setup_build_context.py. This patch changes the build logic so all images have access to the dependencies. This does increase build time because the build context has to be copied and sent to the Docker daemon for each image build. Docker Image Changes: The copy command for the lib/ folder was removed from the impala_base Dockerfile and a corresponding copy command was added to each daemon Docker image. This allows each daemon image to only copy in the dependencies it actually requires to run. Other: * Deleted the hive-3 profile since Impala 4.0 only supports hive-3 builds * Moved shaded-deps into the mvn-deps folder Overall, this decreases the size of the impalad_executor image by 120 MB, and the statestored image by 700 MB. 
impalad_coordinator and impalad_coordinator images are now 771 MB, and impalad_executor images are 651MB. Further improvements might be possible by decreasing the number of transitive dependencies in mvn-deps/executor-deps/pom.xml. Moreover, any new Coordinator specific jar files will not be included in the Executor image. Testing: * Ran core tests Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Reviewed-on: http://gerrit.cloudera.org:8080/16320 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Sahil Takiar <stakiar@cloudera.com>	2020-10-08 23:11:52 +00:00
Tim Armstrong	a11b8b687a	IMPALA-9790: option to use resolved hostname everywhere This adds a flag --use_resolved_hostname, which replaces --hostname with a resolved IP on startup. This is useful for containerized environments where the hostname -> IP mapping can be very dynamic. This flag is used by default in the dockerized minicluster. This also fixes a bug in the test code that incorrectly identified command line flags. Specifically it only checked the suffix, so it confused use_resolved_hostname and hostname. Change-Id: I0d5cb9c68c60ce8dc838cde9dcf1c590017f5c9a Reviewed-on: http://gerrit.cloudera.org:8080/16108 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Andrew Sherman <asherman@cloudera.com>	2020-06-26 19:46:15 +00:00
Vihang Karajgaonkar	39613c8226	IMPALA-8627: Enable catalog-v2 in tests This patch enables catalog-v2 by default in all the tests. Test fixes: 1. Modified test_observability which fails on catalog-v2 since the profile emits different metadata load events. The test now looks for the right events on the profile depending on whether catalogv2 is enabled or not. 2. TableName.java constructor allows non-lowercased table and database names. This causes problems at the local catalog cache which expects the tablenames to be always in lowercase. More details on this failure are available in IMPALA-8627. The patch makes sure that the loadTable requests in local catalog do an explicit conversion of tablename to lowercase in order to get around the issue. 3. Fixes the JdbcTest which checks for existence of table comment in the getTables metadata jdbc call. In catalog-v2 since the columns are not requested, LocalTable is not loaded and hence the test needs to be modified to check if catalog-v2 is enabled. 4. Skips test_sanity which creates a Hive db and issues an invalidate metadata to make it visible in catalog. Unfortunately, in catalog-v2 currently there is no way to see a newly created database when event polling is disabled. 5. Similar to above (4) test_metadata_query_statements.py creates a hive db and issues an invalidate metadata. The test runs QueryTest/describe-db which is split into two: one for checking the hive-db, and the other containing the rest of the queries of the original describe-db. The split makes it possible to only execute the test partially when catalog-v2 is enabled Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d Reviewed-on: http://gerrit.cloudera.org:8080/13933 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-08-07 01:41:15 +00:00
Tim Armstrong	def70c241d	IMPALA-8785: give debug docker images a different name * Build scripts are generalised to have different targets for release and debug images. * Added new targets for the debug images: docker_debug_images, statestored_debug images. The release images still have the same names. * Separate build contexts are set up for the different base images. * The debug or release base image can be specified as the FROM for the daemon images. * start-impala-cluster.py picks the correct images for the build type Future work: We would like to generalise this to allow building from non-ubuntu-16.04 base images. This probably requires another layer of dockerfiles to specify a base image for impala_base with the required packages installed. Change-Id: I32d2e19cb671beacceebb2642aba01191bd7a244 Reviewed-on: http://gerrit.cloudera.org:8080/13905 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-07-30 23:36:48 +00:00
Tim Armstrong	f689daef7f	IMPALA-8622,IMPALA-8696: fix docker dependencies, add image list Adds a plain-text space-separated image list in docker/docker-images.txt. This is generated based on the images built by CMake, so is kept in sync with images added to or removed from the CMake file. Duplicated logic per image is removed - instead there is a helper function that is called for each daemon image to be built. Rips out the timestamp mechanism that was intended to avoid unnecessary container rebuilds, but has turned out to be brittle. Instead the containers are rebuilt each time the rule is invoked. This moves some subdirectories so that the image tag matches the subdirectory, to simplify the build scripts. Change-Id: I4d8e215e9b07c6491faa4751969a30f0ed373fe3 Reviewed-on: http://gerrit.cloudera.org:8080/13899 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Lars Volker <lv@cloudera.com>	2019-07-23 23:57:43 +00:00
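
Illustration for IMPALA-9790 (a11b8b687a): the message says the test code "only checked the suffix, so it confused use_resolved_hostname and hostname". The following is a minimal sketch of that class of bug and one possible stricter check. The function names and the argument list are hypothetical and are not taken from the Impala test code.

```python
# Hypothetical helpers, not from the Impala tree. They only illustrate why a
# suffix comparison confuses --use_resolved_hostname with --hostname.

def has_flag_by_suffix(args, name):
    """Suffix check: matches any flag whose name merely ends with `name`."""
    return any(arg.split("=", 1)[0].endswith(name) for arg in args)


def has_flag_exact(args, name):
    """Stricter check: strip leading dashes and compare the whole flag name."""
    return any(arg.split("=", 1)[0].lstrip("-") == name for arg in args)


# Assumed example command line; only --use_resolved_hostname is present.
args = ["--use_resolved_hostname=true"]

suffix_says_hostname_set = has_flag_by_suffix(args, "hostname")  # false positive
exact_says_hostname_set = has_flag_exact(args, "hostname")
exact_says_resolved_set = has_flag_exact(args, "use_resolved_hostname")
```

With the suffix check, a command line that sets only --use_resolved_hostname is reported as also setting --hostname; the exact comparison keeps the two flags apart.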

7 Commits
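
Illustration for IMPALA-8770 (3962ae1972): the message says install_os_packages.sh "detects the OS and calls apt or yum as appropriate". The sketch below shows one way such a decision could be made, written in Python rather than shell and assuming the OS is identified from the ID and ID_LIKE fields of /etc/os-release. The function names and the sets of distribution IDs are assumptions for illustration; the actual script may detect the OS differently.

```python
# Hypothetical sketch, not the contents of install_os_packages.sh.

def parse_os_release(text):
    """Parse KEY=value lines (os-release style) into a dict, dropping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def pick_package_manager(os_release_text):
    """Return 'apt' for Debian-family systems and 'yum' for Redhat-family ones."""
    fields = parse_os_release(os_release_text)
    ids = {fields.get("ID", "")} | set(fields.get("ID_LIKE", "").split())
    if ids & {"ubuntu", "debian"}:          # assumed Debian-family IDs
        return "apt"
    if ids & {"rhel", "centos", "rocky", "fedora"}:  # assumed Redhat-family IDs
        return "yum"
    raise ValueError("unsupported distribution: %r" % fields.get("ID"))


# Assumed sample inputs resembling os-release files.
ROCKY = 'NAME="Rocky Linux"\nID="rocky"\nID_LIKE="rhel centos fedora"\n'
UBUNTU = 'NAME="Ubuntu"\nID=ubuntu\nID_LIKE=debian\n'

rocky_choice = pick_package_manager(ROCKY)
ubuntu_choice = pick_package_manager(UBUNTU)
```

Checking ID_LIKE as well as ID is a design choice that lets derivative distributions fall through to the right package manager without being listed individually.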