impala

mirror of https://github.com/apache/impala.git synced 2025-12-21 10:58:31 -05:00

Author	SHA1	Message	Date
gaurav1086	61e90e9e90	IMPALA-13182: Support uploading additional jars This patch enables adding custom jars from the absolute path: /opt/impala/aux-jars to the CLASSPATH. Steps: 1. Download the jars into the /opt/impala/aux-jars directory 2. Restart impala cluster. Testing: * Tested manually: Added jar files in /opt/impala/aux-jars before impala start. After starting impala, asserted that the new jars were appended to the value of CLASSPATH as printed in the impalad logs. Change-Id: Ica5fa4c0cd1a5c938f331f3a4bba85d4910db90e Reviewed-on: http://gerrit.cloudera.org:8080/21556 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2024-09-12 23:57:20 +00:00
stiga-huang	daa7f8ad88	IMPALA-13328: Fix missing krb5-config in building impala_quickstart_client docker image Building the impala_quickstart_client docker image failed because krb5-config was not found. It's installed by the libkrb5-dev package. This patch adds it to fix the build failure. Also improves docker/publish_images_to_apache.sh to skip nonexistent images (usually because they were not built). Updates the quickstart_hms image to be based on Ubuntu 18.04. Also fixes an issue where docker/CMakeLists.txt doesn't dump all the image names to docker/docker-images.txt Tests: - Verified the quickstart images on macOS. Change-Id: Ieaa9878fa9cd9902ac883866c82e224889940615 Reviewed-on: http://gerrit.cloudera.org:8080/21725 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2024-08-29 03:09:29 +00:00
Andrew Sherman	78b9b09a16	IMPALA-13076 Add pstack and jstack to Impala Redhat docker images When the Impala docker images are deployed in production environments, it can be hard to add debugging tools at runtime. Two of the most useful diagnostic tools are jstack and pstack, which can be used to print Java and native stack traces. Install these tools into Redhat images which are the most commonly used in production. To install pstack we install gdb. To install jstack we install a development jdk on top of the headless jdk. Extend the install_os_packages.sh script to add an argument to --install-debug-tools to set the level of diagnostic tools to install. The possible arguments are: none - install no extra tools basic - install pstack and jstack full - install more debugging tools. In a Centos 8.5 build, the size of an impalad_coord_exec image increased from 1.74GB to 1.85GB, as reported by ‘docker image list’. What other tools might be added? - Installing perf is tricky as in a container perf requires an installation specific to the underlying linux kernel image, which is hard to predict at build time. - Installing pprof is hard as installation seems to require compiling from sources. Clearly there are many options and we cannot install everything. TESTING Built release and debug docker images, and used jstack and pstack in a running container to print Impala's stacks. Change-Id: I25e6827b86564a9c0fc25678e4a194ee8e0be0e9 Reviewed-on: http://gerrit.cloudera.org:8080/21433 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>	2024-06-21 22:40:28 +00:00
Michael Smith	a7d7336531	IMPALA-12566: Fix RpcMgrKerberizedTest on RedHat 8 On RedHat 8, RpcMgrKerberizedTest cases fail with Jan 09 14:47:03 msmith.vpc.cloudera.com krb5kdc[609624](info): TGS_REQ (1 etypes {aes128-cts-hmac-sha1-96(17)}) 127.0.0.1: LOOKING_UP_SERVER: authtime 0, etypes {rep=UNSUPPORTED:(0)} impala-test/msmith.vpc.cloudera.com@KRBTEST.COM for impala-test/msmith@KRBTEST.COM, Server not found in Kerberos database This happens because bootstrap_system.sh adds an entry to /etc/hosts to resolve 127.0.0.1 to hostname and puts the short hostname first. During negotiation, Kudu RPC will call GetFQDN to retrieve the FQDN, which for our tests running on localhost returns the short hostname. Fixes RpcMgrKerberizedTest by swapping the order of entries added to /etc/hosts so the FQDN comes first. This is consistent with the example provided in https://man7.org/linux/man-pages/man5/hosts.5.html. Avoids 'hostname -f'; on RedHat it's identical to 'hostname', and on Ubuntu it causes this test to fail. Change-Id: I1eb24f9faec766e388d793408aedecdc92107185 Reviewed-on: http://gerrit.cloudera.org:8080/20876 Reviewed-by: Alexey Serbin <alexey@apache.org> Reviewed-by: Jason Fehr <jfehr@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>	2024-01-18 00:00:01 +00:00
Michael Smith	472dea5c3c	IMPALA-12355: Make utility_entrypoint arch-agnostic Updates utility_entrypoint.sh for the impala_profile_tool image to detect the correct JVM native library paths based on a glob, as they're architecture-specific. Follows the pattern established in daemon_entrypoint.sh, except impala_profile_tool only uses Java 8 on Ubuntu. Expected output $ docker run --entrypoint bash -i impala_profile_tool_debug /opt/impala/bin/utility_entrypoint.sh LD_LIBRARY_PATH: /opt/impala/lib:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server Change-Id: I8e6b781bef52e60072ff02f4098d5ad9405aa2be Reviewed-on: http://gerrit.cloudera.org:8080/20629 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com>	2023-11-03 16:51:00 +00:00
stiga-huang	8d0ab2b684	IMPALA-10262: RPM/DEB Packaging Support This patch is based on a previous patch contributed by Shant Hovsepian: https://gerrit.cloudera.org/c/16612/ It adds a new option, -package, to buildall.sh for building a package for the current OS type (e.g. CentOS/Ubuntu). You can also use "make/ninja package" to build the package. Scripts for launching the services and the required configuration files are also added. Tests: - Built on Ubuntu 18.04/20.04 and CentOS 7 using ./buildall.sh -noclean -skiptests -release -package - Deployed the RPM package on a CDP cluster. Verified the scripts. - Deployed the DEB package on a docker container. Verified the scripts. Change-Id: I64419fd400fe8d233dac016b6306157fe9461d82 Reviewed-on: http://gerrit.cloudera.org:8080/18939 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-07-16 11:13:23 +00:00
Michael Smith	3b0705ba63	IMPALA-11941: Support Java 17 in Impala Enables building for Java 17 - and particularly using Java 17 in containers - but won't run a minicluster fully with Java 17 as some projects (Hadoop) don't yet support it. Starting with Java 15, ehcache.sizeof encounters UnsupportedOperationException: can't get field offset on a hidden class in class members pointing to capturing lambda functions. Java 17 also introduces new modules that need to be added to add-opens. Both of these pose problems for continued use of ehcache. Adds https://github.com/jbellis/jamm as a new cache weigher for Java 15+. We build from HEAD as an external project until Java 17 support is released (https://github.com/jbellis/jamm/issues/44). Adds the 'java_weigher' option to select 'sizeof' or 'jamm'; defaults to 'auto', which uses jamm for Java 15+ and sizeof for everything else. Also adds metrics for viewing cache weight results. Adds JAVA_HOME/lib/server to LD_LIBRARY_PATH in run-jvm-binary to simplify switching between JDK versions for testing. You can now - export IMPALA_JDK_VERSION=11 - source bin/impala-config.sh - start-impala-cluster.py and have Impala running a different JDK (11) version. Retains add-opens calls that are still necessary due to dependencies' use of lambdas for jamm, and all others for ehcache. Add-opens are still required as a fallback, as noted in https://github.com/jbellis/jamm#object-graph-crawling. We catch the exceptions jamm and ehcache throw - CannotAccessFieldException, UnsupportedOperationException - to avoid crashing Impala, and add it to the list of banned log messages (as we should add-opens when we find them). 
Testing: - container test run with Java 11 and 17 (excludes custom cluster) - manual custom_cluster/test_local_catalog.py + test_banned_log_messages.py run with Java 11 and 17 (Java 8 build) - full Java 11 build (passed except IMPALA-12184) - add test catalog cache entry size metrics fit reasonable bounds - add unit test for utility to find jamm jar file in classpath Change-Id: Ic378896f572e030a3a019646a96a32a07866a737 Reviewed-on: http://gerrit.cloudera.org:8080/19863 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-06-24 10:11:54 +00:00
Joe McDonnell	234d641d7b	IMPALA-11961/IMPALA-12207: Add Redhat 9 / Ubuntu 22 support This adds support for Redhat 9 / Ubuntu 22. It updates to a newer toolchain that has those builds, and it adds supporting code in bootstrap_system.sh. Redhat 9 and Ubuntu 22 use python = python3, which requires various changes to build scripts and tests. Ubuntu 22 uses Python 3.10, which deprecates certain ssl.PROTOCOL_TLS, so this adapts test_client_ssl.py to that change until it can be fully addressed in IMPALA-12219. Various OpenSSL methods have been deprecated. As a workaround until these can be addressed properly, this specifies -Wno-deprecated-declarations. This can be removed once the code is adapted to the non-deprecated APIs in IMPALA-12226. Impala crashes with tcmalloc errors unless we update to a newer gperftools, so this moves to gperftools 2.10. gperftools changed the default for tcmalloc.aggressive_memory_decommit to off, so this adapts our code to set it for backend tests. The gperftools upgrade does not show any performance regression: +----------+-----------------------+---------+------------+------------+----------------+ \| Workload \| File Format \| Avg (s) \| Delta(Avg) \| GeoMean(s) \| Delta(GeoMean) \| +----------+-----------------------+---------+------------+------------+----------------+ \| TPCH(42) \| parquet / none / none \| 3.08 \| -0.64% \| 2.20 \| -0.37% \| +----------+-----------------------+---------+------------+------------+----------------+ With newer Python versions, the impala-virtualenv command fails to create a Python 3 virtualenv. This switches to using Python 3's builtin venv command for Python >=3.6. Kudu needed a newer version and LLVM required a couple patches. Testing: - Ran a core job on Ubuntu 22 and Redhat 9. The tests run to completion without crashing. There are test failures that will be addressed in follow-up JIRAs. - Ran dockerised tests on Ubuntu 22. - Ran dockerised tests on Ubuntu 20 and Rocky 8.5. 
Change-Id: If1fcdb2f8c635ecd6dc7a8a1db81f5f389c78b86 Reviewed-on: http://gerrit.cloudera.org:8080/20073 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-06-21 05:21:01 +00:00
Joe McDonnell	9c6df6a691	IMPALA-12179 (part 1): Remove dependency on lsb_release for docker CMake Newer operating systems like Redhat 9 do not supply lsb_release as an official package. The /etc/os-release file provides the same information in a more convenient form. CMake 3.22 added support for reading those /etc/os-release values directly via cmake_host_system_information(). This changes docker/CMakeLists.txt to use the new CMake cmake_host_system_information() APIs to get values from /etc/os-release. This removes the lsb_release code. Testing: - Ran a docker build locally and verified it detected the distribution / version correctly Change-Id: I04afd2b1c923f1331f7234d53a105a17956e3e18 Reviewed-on: http://gerrit.cloudera.org:8080/20069 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-06-15 16:22:15 +00:00
Michael Smith	3346d070ad	IMPALA-11260: (Addendum) Restrict add-opens to Java 9+ Restricts jvm_automatic_add_opens to only apply to Java 9+ where the option exists. Previously it would also include it in Java 8, which caused the JVM to ignore all options in JAVA_TOOL_OPTIONS. Tests for Java version by running $JAVA_HOME/bin/java -version (or "java" if JAVA_HOME is unset) and parsing version from the first line. All JVM implementations are expected to include the version in a quoted string, such as "1.8.0_42" and "11.0.1". Also added add-opens flags for frontend tests. test_no_inaccessible_objects detected this in a test run. Testing: - manually confirmed -agentlib options are present with both Java 8 and Java 11. - promoted test_jvm_mem_tracking to run in all strategies, as it's fast and ensures JAVA_TOOL_OPTIONS is honored. Change-Id: I85953e685f6bbbd213afd93f389066e82f193ddf Reviewed-on: http://gerrit.cloudera.org:8080/19939 Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com> Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Quanlong Huang <huangquanlong@gmail.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-06-04 00:38:22 +00:00
Michael Smith	879afbab1f	IMPALA-11260: Add add-opens to JAVA_TOOL_OPTIONS on startup During Impala startup, before starting the JVM (by calling libhdfs), adds add-opens calls to JAVA_TOOL_OPTIONS to ensure Ehcache has access to non-public members so it can accurately calculate object size. This effectively circumvents new security precautions in Java 9+. Use '--jvm_automatic_add_opens=false' to disable it. Tested with Java 11 JDBC_TEST=false EE_TEST=false FE_TEST=false BE_TEST=false \ CLUSTER_TEST_FILES=custom_cluster/test_local_catalog.py \ run-all-tests.sh Change-Id: I47a6533b2aa94593d9348e8e3606633f06a111e8 Reviewed-on: http://gerrit.cloudera.org:8080/19845 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Quanlong Huang <huangquanlong@gmail.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-05-19 22:32:00 +00:00
Michael Smith	c8a21c51ef	IMPALA-12081: Produce multiple Java docker images This changes the docker image build code so that both Java 8 and Java 11 images can be built in the same build. Specifically, it introduces new Make targets for Java 11 docker images in addition to the regular Java 8 targets. The "docker_images" and "docker_debug_images" targets continue to behave the same way and produce Java 8 images of the same name. The "docker_java11_images" and "docker_debug_java11_images" produce the daemon docker images for Java 11. Preserves IMPALA_DOCKER_USE_JAVA11 for selecting Java 11 images when starting a cluster with container images. Change-Id: Ic2b124267c607242bc2fd6c8cd6486293a938f50 Reviewed-on: http://gerrit.cloudera.org:8080/19722 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-05-19 22:19:24 +00:00
Michael Smith	7d07192e89	IMPALA-9627: Use universal_newlines for Python 3 Fixes subprocess.check_output calls for Python 3 using universal_newlines=True. Change-Id: I3dae9113635cf23ae02f1f630de311e64119c456 Reviewed-on: http://gerrit.cloudera.org:8080/19812 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-04-28 23:28:49 +00:00
Michael Smith	0a42185d17	IMPALA-9627: Update utility scripts for Python 3 (part 2) We're starting to see environments where the system Python ('python') is Python 3. Updates utility and build scripts to work with Python 3, and updates check-pylint-py3k.sh to check scripts that use system python. Fixes other issues found during a full build and test run with Python 3.8 as the default for 'python'. Fixes an impala-shell tip that was supposed to have been two tips (and had no space after the period when they were printed). Removes out-of-date deploy.py and various Python 2.6 workarounds. Testing: - Full build with /usr/bin/python pointed to python3 - run-all-tests passed with python pointed to python3 - ran push_to_asf.py Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb Reviewed-on: http://gerrit.cloudera.org:8080/19697 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-04-26 18:52:23 +00:00
Joe McDonnell	b57f56f3f8	IMPALA-12039 (addendum): Verify presence of pgrep during docker build As a followup to the fix for IMPALA-12039, this verifies the presence of pgrep at docker build time as well as at daemon startup time. Testing: - Build docker images locally - Ran Redhat 8 dockerised tests Change-Id: I67e000b64cf6c1ab2225745f6b95b7a5e7ac3d36 Reviewed-on: http://gerrit.cloudera.org:8080/19713 Reviewed-by: Andrew Sherman <asherman@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-04-11 00:44:40 +00:00
Abhishek Rawat	3e0a422c2e	IMPALA-12039: graceful shutdown doesn't work in redhat docker image 'pgrep' was missing in the redhat docker image and as a result the graceful shutdown script (bin/graceful_shutdown_backends.sh) was terminating the impalad immediately without waiting for the 'shutdown_grace_period_s' grace period. Since there wasn't a large enough time window for cluster membership changes to propagate to the coordinator, it was scheduling query fragments on already deleted executors and queries were failing. Built an ubuntu 20 image and it had the 'pgrep' utility already installed. Testing: - Built redhat 8 image and manually tested graceful shutdown in a docker container. Change-Id: I91ffc1fe3e022ce7f7507b2bd79a3e2c3851956d Reviewed-on: http://gerrit.cloudera.org:8080/19711 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-04-09 21:42:50 +00:00
Joe McDonnell	aa4050b4d9	IMPALA-11976: Fix use of deprecated functions/fields removed in Python 3 Python 3 moved several things around or removed deprecated functions / fields: - sys.maxint was removed, but sys.maxsize provides similar functionality - long was removed, but int provides the same range - file() was removed, but open() already provided the same functionality - Exception.message was removed, but str(exception) is equivalent - Some encodings (like hex) were moved to codecs.encode() - string.letters -> string.ascii_letters - string.lowercase -> string.ascii_lowercase - string.strip was removed This fixes all of those locations. Python 3 also has slightly different rounding behavior from round(), so this changes round() to use future's builtins.round() to get the Python 3 behavior. This fixes the following pylint warnings: - file-builtin - long-builtin - invalid-str-codec - round-builtin - deprecated-string-function - sys-max-int - exception-message-attribute Testing: - Ran core tests Change-Id: I094cd7fd06b0d417fc875add401d18c90d7a792f Reviewed-on: http://gerrit.cloudera.org:8080/19591 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Joe McDonnell	ba3518366a	IMPALA-11952 (part 4): Fix odds and ends: Octals, long, lambda, etc. There are a variety of small python 3 syntax differences: - Octal constants need to start with 0o rather than just 0 - Long constants are not supported (i.e. numbers ending with L) - Lambda syntax is slightly different - The 'ur' string mode is no longer supported Testing: - check-python-syntax.sh now passes Change-Id: Ie027a50ddf6a2a0db4b34ec9b49484ce86947f20 Reviewed-on: http://gerrit.cloudera.org:8080/19554 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com>	2023-02-28 17:11:50 +00:00
Joe McDonnell	c71de994b0	IMPALA-11952 (part 1): Fix except syntax Python 3 does not support this old except syntax: except Exception, e: Instead, it needs to be: except Exception as e: This uses impala-futurize to fix all locations of the old syntax. Testing: - The check-python-syntax.sh no longer shows errors for except syntax. Change-Id: I1737281a61fa159c8d91b7d4eea593177c0bd6c9 Reviewed-on: http://gerrit.cloudera.org:8080/19551 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-02-28 17:11:50 +00:00
Joe McDonnell	52956bae14	IMPALA-11741: Verify that 'hostname' is installed in Docker images Some deployments rely on having the 'hostname' utility installed in Impala's Docker image (e.g. for constructing daemon startup arguments). Most distributions include it by default, but Redhat UBI8 does not. This adds 'hostname' to the list of installed packages for both Ubuntu and the Redhat family. This also verifies that 'hostname' runs properly. Testing: - Verified that this adds hostname for UBI8 images Change-Id: I5a760680294a3ad7e74e843d3f4c06cd38819e88 Reviewed-on: http://gerrit.cloudera.org:8080/19273 Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-11-23 23:50:51 +00:00
Joe McDonnell	1899b2e34b	IMPALA-11703: Set appropriate permissions on /var/tmp in Docker build Impala will fail to start if the permissions on /var/tmp do not have the sticky bit set (i.e. +t). Some Redhat UBI images do not set the sticky bit (+t) on /tmp and /var/tmp. This sets the sticky bit on those directories during Docker build. Testing: - Verified that the sticky bit is set on one of the affected base images and that Impala can start up Change-Id: I7ff32a035f40cb41d3a8dc80a07fd9924f41b942 Reviewed-on: http://gerrit.cloudera.org:8080/19222 Reviewed-by: Abhishek Rawat <arawat@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-11-08 06:21:38 +00:00
Joe McDonnell	11e66523d6	IMPALA-11526: Install en_US.UTF-8 locale into docker images In IMPALA-11492, ExprTest.Utf8MaskTest was failing on some configurations because the en_US.UTF-8 locale was missing. Since the Docker images don't contain en_US.UTF-8, they are subject to the same bug. This was confirmed by adding test cases to the test_utf8_strings.py end-to-end test and running it in the dockerized tests. This adds the appropriate language pack to the list of packages installed for the Docker build. Testing: - This adds end-to-end tests to test_utf8_strings.py covering the same cases that were failing in ExprTest.Utf8MaskTest. They failed without the added language packs, and now succeed. Change-Id: I353f257b3cb6d45f7d0a28f7d5319fdb457e6e3d Reviewed-on: http://gerrit.cloudera.org:8080/19080 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>	2022-10-11 20:30:50 +00:00
Joe McDonnell	3d269e465e	IMPALA-11634: Provide an option to use Java 11 for docker images Currently, Docker images install Java 8 for Impala's use. This adds the IMPALA_DOCKER_USE_JAVA11 environment variable. When set to true, this installs Java 11 rather than Java 8. It defaults to false. The daemon_entrypoint.sh script is modified to detect Java 11 correctly. As a workaround for IMPALA-11260, this appends a list of "--add-opens" statements to JAVA_TOOL_OPTIONS when running with Java 11. Testing: - Ran a set of dockerized tests on Rocky 8.5 with Java 11 Change-Id: Icc1dbd3f6a2279840218dc1da2b60077e211a328 Reviewed-on: http://gerrit.cloudera.org:8080/19031 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2022-10-11 20:30:50 +00:00
Joe McDonnell	3962ae1972	IMPALA-8770: Support building Docker images on Redhat-based distributions Currently, Impala supports building and testing Docker images on Ubuntu. This extends that same support to Redhat-based distributions: 1. This splits out the Docker build's OS package installation into a separate install_os_packages.sh script. This script detects the OS and calls apt or yum as appropriate. The script takes the argument --install-debug-tools, which installs extra tools like iproute2 and ping. This defaults to true for debug images and false for release images. 2. This modifies daemon_entrypoint.sh to detect the OS and set LD_LIBRARY_PATH appropriately to account for different locations of Java. 3. This modifies docker/setup_build_context.py to handle different locations of libkudu_client.so and add extra sanity checks on various libraries found via globs. 4. This modifies bin/jenkins/dockerized-*.sh test infrastructure to be able to install docker on either Ubuntu or Redhat. It also changes the exit logic to collect the container logs. Developers can override the base image for Redhat 7 and Redhat 8 builds via the IMPALA_REDHAT7_DOCKER_BASE and IMPALA_REDHAT8_DOCKER_BASE environment variables. These default to open source Redhat equivalents (Centos 7.9 and Rocky 8.5 respectively), but they are also known to work with Redhat UBI images. Testing: - Ran dockerised testing on Rocky 8.5 via the rocky-8.5-dockerised-tests job. - Ran GVO - Ran a Docker build on Centos7 with UBI7 as the base image Change-Id: Ibaff2560ef971ac2c2231a8e43921164ea1d2f4d Reviewed-on: http://gerrit.cloudera.org:8080/19006 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2022-10-11 20:30:50 +00:00
Laszlo Gaal	68650057a1	Speed up default configuration for Docker-based tests Docker-based parallelized test runs have proven themselves to be quite a bit faster than regular core or exhaustive mode builds. While regular sequential builds have also enjoyed shorter runtimes recently, Docker-based parallel builds still enjoy a speed advantage. Scheduling the parallel build segments is currently driven from the test driver script test-with-docker.py, and the order in which the segments are considered is currently hard-coded. The ordering was originally devised experimentally, by timing several test runs, then ordering the test segments based on expected duration, from longest to shortest. The average wall-clock run times for various test segments have changed since this original ordering was committed: FE tests have gotten significantly longer, while upgrading the default worker instance type shortened the serial phase(s) of E2E tests. This patch makes two changes to achieve a shorter overall run time for the Docker-based tests: 1. Reorders the default scheduling order of the test segments, based on currently measured durations 2. Increases the default suite concurrency for execution hosts: bumps suite concurrency from 4 to 5 for machines with memory sizes between 96 and 140 GB (the currently used worker size) The latter change is also based on measurements: memory usage reports for total peak memory (RSS) and peak memory (RSS) per test segment both showed significant amounts of unused memory on the current default worker instance size (having 32 CPUs and 128 GB of RAM). Experiments showed that this machine size can reliably handle five concurrent containerized test sessions with some safety margin remaining, so the patch increases the default concurrency for this machine category. With both changes applied the duration of a core-mode test run with default settings is reduced from 2h45 to 2h25 (on average). 
Tested by running the Docker-based default test suite in core mode, with Ubuntu 16.04 and Rocky Linux 8.5 base images. Change-Id: Ifb609bcfb10e9f9b281cc6b375c36c9638db168b Reviewed-on: http://gerrit.cloudera.org:8080/19038 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-09-29 21:53:54 +00:00
Michael Smith	f6151b0aa1	IMPALA-11585: Build quickstart_client with Ubuntu 20 Ubuntu 20.04 only provides the python3-pip package. Update building quickstart_client to use python3-pip on Ubuntu 20.04. Change-Id: Ife89b7db88dd58e96ba1b3e3972ca97204332dd4 Reviewed-on: http://gerrit.cloudera.org:8080/18984 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-09-26 23:10:19 +00:00
Joe McDonnell	d0cfdd139f	IMPALA-10199: Add Ubuntu 20 toolchain configuration Ubuntu 20 has been using the toolchain from Ubuntu 18. Since Ubuntu 20 has been added to the toolchain, this switches Impala to use a toolchain with Ubuntu 20 support and uses the Ubuntu 20 bits. This is expected to help with IMPALA-10962. Testing: - Ran a core build on Ubuntu 20 Change-Id: If2394b668ef3c56b1a4c0773fd5e4ff92be4a846 Reviewed-on: http://gerrit.cloudera.org:8080/18559 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-05-24 20:42:04 +00:00
Joe McDonnell	51b537a3cd	IMPALA-11244: Run the minicluster for docker-based BE tests As an optimization, the docker-based tests didn't run the minicluster for BE tests. Some BE tests now require the minicluster (DiskIoMgrTest.WriteToRemote*), so this cannot work with the optimization. This changes the docker-based tests to start the minicluster for the BE tests. Testing: - Ran a docker-based test job Change-Id: I784a63a02886852e10ccca7c118c22ff7d38b8a3 Reviewed-on: http://gerrit.cloudera.org:8080/18414 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2022-05-10 00:19:18 +00:00
stiga-huang	35375b3287	IMPALA-2019(part-4): Add UTF-8 support for case conversion functions There are 3 builtin case conversion string functions: upper(), lower(), and initcap(). Previously they only convert English alphabetic characters. This patch adds support to deal with Unicode characters. There are many corner cases in case conversion depending on the locale and context. E.g. 1) Case conversion is locale-sensitive. Turkish has 4 letter "I"s. English has only two, a lowercase dotted i and an uppercase dotless I. Turkish has lowercase and uppercase forms of both dotted and dotless I. So simply converting "i" to "I" for upper case is wrong in Turkish: +-------+--------+---------+ \| \| Dotted \| Dotless \| +-------+--------+---------+ \| Upper \| İ \| I \| +-------+--------+---------+ \| Lower \| i \| ı \| +-------+--------+---------+ 2) Case conversion may change a string's length. The German word "grüßen" should be converted to "GRÜSSEN" in upper case: the letter "ß" should be converted to "SS". 3) Case conversion is context-sensitive. The Greek word "ὈΔΥΣΣΕΎΣ" should be converted to "ὀδυσσεύς", where the Greek letter "Σ" is converted to "σ" or to "ς", depending on its position in the word. The above cases will be the focus of follow-up JIRAs. This patch adds the initial implementation of UTF-8 aware case conversion functions. -------- Implementation: In UTF-8 mode (turned on by set UTF8_MODE=true) of these functions, the bytes in strings are converted to wide characters using std::mbrtowc(). Each wide character (wchar_t) will then be converted using std::towupper or std::towlower correspondingly. We then convert them back to multibyte characters using std::wcrtomb(). Note that these builtins are locale aware. If impalad is launched without a UTF-8 aware locale, e.g. LC_ALL="C", these builtins can't recognize non-ascii characters, which will return unexpected results. Thus we modify our docker images to set LC_ALL="C.UTF-8" instead of "C". 
This patch also logs the current locale when launching impala daemons for better debugging. We will support customized locale in IMPALA-11080. Test: - Add BE unit tests and e2e tests. Change-Id: I443e89d46f4638ce85664b021666bc4f03ee8abd Reviewed-on: http://gerrit.cloudera.org:8080/17785 Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-02-15 18:40:59 +00:00
Zoltan Garaguly	45d3eddc05	IMPALA-8680: Docker-based tests fail to archive the minicluster component logs Inside the docker container, copy the logs of the cluster components (hdfs, yarn, kudu) from the folder testdata/cluster/cdh<version-number>/node-<node-id>/var/log/ to the folder logs/cluster/ Testing: - ran docker-based tests and checked that minicluster logs are preserved and archived - tested that minicluster logs are also copied when something goes wrong during the build Change-Id: I23e25d42992cec47c593dc388bcf0bcef828c05e Reviewed-on: http://gerrit.cloudera.org:8080/15898 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-08-31 06:58:34 +00:00
Vihang Karajgaonkar	5a9dcd108d	IMPALA-8795: Turn on events processing by default This commit turns on events processing by default. The default polling interval is set to 1 second, which can be overridden by setting hms_event_polling_interval_s to a non-default value. With event polling turned on by default, this patch also moves test_event_processing.py to tests/metadata instead of the custom cluster tests. Some tests within test_event_processing.py which needed non-default configurations were moved to tests/custom_cluster/test_events_custom_configs.py. Additionally, some other tests were modified to take into account the automatic ability of Impala to detect newly added tables from hive. Testing done: 1. Ran exhaustive tests by turning on the events processing multiple times. 2. Ran exhaustive tests by disabling events processing. 3. Ran dockerized tests. Change-Id: I9a8b1871a98b913d0ad8bb26a104a296b6a06122 Reviewed-on: http://gerrit.cloudera.org:8080/17612 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>	2021-08-09 17:22:31 +00:00
John Sherman	ca17e307ab	IMPALA-10550: Add External Frontend service port - If external_fe_port flag is >0, spins up a new HS2 compatible service port - Added enable_external_fe_support option to start-impala-cluster.py - which when detected will start impala clusters with external_fe_port on 21150-21152 - Modify impalad_coordinator Dockerfile to expose external frontend port at 21150 - The intent of this commit is to separate external frontend connections from normal hs2 connections - This allows different security policy to be applied to each type of connection. The external_fe_port should be considered a privileged service and should only be exposed to an external frontend that does user authentication and does authorization checks on generated plans Change-Id: I991b5b05e12e37d8739e18ed1086bbb0228acc40 Reviewed-by: Aman Sinha <amsinha@cloudera.com> Reviewed-on: http://gerrit.cloudera.org:8080/17125 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-03-03 22:46:05 +00:00
Tim Armstrong	79bee3befb	IMPALA-10469: push quickstart to apache repo This adds a script, docker/publish_images_to_apache.sh, that allows uploading images to the apache/impala docker hub repo, prefixed with a version string. E.g. with the following commands: ninja docker_images quickstart_docker_images ./docker/publish_images_to_apache.sh -v `81d5377c2` The uploaded images can then be used for the quickstart cluster, as documented in docker/README. Updated docs for quickstart to use a prefix from apache/impala Remove IMPALA_QUICKSTART_VERSION, which doesn't interact well with the tagging since the image name and version are now encoded in the tag. Fix an incorrect image name added to docker-images.txt: impala_profile_tool_image. Testing: Ran Impala quickstart with data loading using instructions in README. export IMPALA_QUICKSTART_IMAGE_PREFIX="apache/impala:81d5377c2-" docker network create -d bridge quickstart-network export QUICKSTART_IP=$(docker network inspect quickstart-network -f '{{(index .IPAM.Config 0).Gateway}}') export QUICKSTART_LISTEN_ADDR=$QUICKSTART_IP docker-compose -f docker/quickstart.yml \ -f docker/quickstart-kudu-minimal.yml \ -f docker/quickstart-load-data.yml up -d docker run --network=quickstart-network -it \ ${IMPALA_QUICKSTART_IMAGE_PREFIX}impala_quickstart_client impala-shell Change-Id: I535d77e565b73d732ae511d7525193467086c76a Reviewed-on: http://gerrit.cloudera.org:8080/17030 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-02-10 06:56:45 +00:00
Tim Armstrong	93d4348b54	IMPALA-10389: impala-profile-tool container Add a build step for an impala-profile-tool docker image that makes it easy to run the binary on any system. This container is automatically built as part of the docker build. This sets up a new build context that doesn't pull in all of the same dependencies or depend on the Java build Testing: cat logs/cluster/profiles/* \| \ docker run -i impala_profile_tool I uploaded a build of the container to dockerhub too: timgarmstrong/impala_profile_tool Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714 Reviewed-on: http://gerrit.cloudera.org:8080/17015 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-02-05 11:22:55 +00:00
Tim Armstrong	eb85c6eeca	IMPALA-9793: Impala quickstart cluster with docker-compose What works: * A single node cluster can be started up with docker-compose * HMS data is stored in Derby database in a docker volume * Filesystem data is stored in a shared docker volume, using the localfs support in the Hadoop client. * A Kudu cluster with a single master can be optionally added on to the Impala cluster. * TPC-DS data can be loaded automatically by a data loading container. We need to set up a docker network called quickstart-network, purely because docker-compose insists on generating network names with underscores, which are part of the FQDN and end up causing problems with Java's URL parsing, which rejects these technically invalid domain names. How to run: Instructions for running the quickstart cluster are in docker/README.md. How to build containers: ./buildall.sh -release -noclean -notests -ninja ninja quickstart_hms_image quickstart_client_image docker_images How to upload containers to dockerhub: IMPALA_QUICKSTART_IMAGE_PREFIX=timgarmstrong/ for i in impalad_coord_exec impalad_coordinator statestored \ impalad_executor catalogd impala_quickstart_client \ impala_quickstart_hms do docker tag $i ${IMPALA_QUICKSTART_IMAGE_PREFIX}$i docker push ${IMPALA_QUICKSTART_IMAGE_PREFIX}$i done I pushed containers built from commit f260cce22, which was branched from `6cb7cecacf` on master. Misc other stuff: * Added more metadata to all images. TODO: * Test and instructions to run against Kudu quickstart * Upload latest version of containers before merging. Change-Id: Ifc0b862af40a368381ada7ec2a355fe4b0aa778c Reviewed-on: http://gerrit.cloudera.org:8080/15966 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-01-26 11:22:08 +00:00
Laszlo Gaal	6d4756da01	IMPALA-10448: Build impala-profile-tool early for Docker-based tests impala-profile-tool is a new dependency for end-to-end tests. The tool is built together with all the other backend tests (so the buildall.sh flag '-notests' can turn off building it), but it is actually used in the parallel phase of end-to-end tests. This is a problem for Docker-based builds for the following reasons: - Docker-based tests run BE, FE and various phases of the EE test in separate Docker containers for parallel execution - Test binaries are only built inside the container running BE tests to cut down on the build time and the size of the Docker image that all test containers are based on. - This means that the EE_TEST_PARALLEL container will miss the tool required for running the tests designed to test it. The solution is to build the tool early, at the end of the build phase running in the build container. There is already another such tool built there (parquet-reader) for a similar reason, so just add impala-profile-tool to the same 'make' command there. Tested by running BE_TEST and EE_TEST_PARALLEL phases in a Docker-based build. Change-Id: I60e78ea883f3057c59a345feca38ef08a7f6a0b8 Reviewed-on: http://gerrit.cloudera.org:8080/16965 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-01-22 20:39:59 +00:00
Thomas Tauber-Marshall	91adb33b22	IMPALA-9975 (part 2): Introduce new admission control daemon A recent patch (IMPALA-9930) introduces a new admission control rpc service, which can be configured to perform admission control for coordinators. In that patch, the admission service runs in an impalad. This patch separates the service out to run in a new daemon, called the admissiond. It also integrates this new daemon with the build infrastructure around Docker. Some notable changes: - Adds a new class, AdmissiondEnv, which performs the same function for the admissiond as ExecEnv does for impalads. - The '/admission' http endpoint is exposed on the admissiond's webui if the admission control service is in use, otherwise it is exposed on coordinator impalad's webuis. - start-impala-cluster.py takes a new flag --enable_admission_service which configures the minicluster to have an admissiond with all coordinators using it for admission control. - Coordinators are now configured to use the admission service by specifying the startup flag --admission_service_host. This is intended to mirror the configuration of the statestored/catalogd location. Testing: - Existing tests for the admission control service are modified to run with an admissiond. - Manually ran start-impala-cluster.py with --enable_admission_service and --docker_network to verify Docker integration. Change-Id: Id677814b31e9193035e8cf0d08aba0ce388a0ad9 Reviewed-on: http://gerrit.cloudera.org:8080/16891 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-01-13 06:03:37 +00:00
Bikramjeet Vig	8542924fca	IMPALA-10373: Run impala docker containers with uid/gid 1000 The convention in Linux is that anything below 1000 is reserved for system accounts, services, and other special accounts, and regular user UIDs and GIDs are 1000 and above. This will ensure that the 'impala' user that is created to run the impala executable inside the docker container is assigned uid and gid 1000. Testing: Manually tested by running the docker container and checking the user. Change-Id: I51b846ca5fb2c55ac1707b9581cee18447467b41 Reviewed-on: http://gerrit.cloudera.org:8080/16807 Reviewed-by: Andrew Sherman <asherman@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-12-03 00:59:12 +00:00
Joe McDonnell	cfa8a7a5e5	IMPALA-10278: Use full libraries for impalad_executor Docker container This backs out the piece of IMPALA-10016 that used a pared-down set of libraries for the impalad_executor. That pared-down set was missing org.apache.impala.common.JniUtil, which prevented the impalad_executor container from starting up. Testing: - Ran a docker core job with one coord_exec and two executors and it was able to startup where it wouldn't before Change-Id: Ieecca61cd3c11f446b922a04fdeb5fd0c90fc971 Reviewed-on: http://gerrit.cloudera.org:8080/16640 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-10-23 21:20:44 +00:00
Joe McDonnell	97792c4bad	IMPALA-10198 (part 2): Add support for mvn versions:set This adds support for setting the version of Java artifacts through "mvn versions:set". It changes the modules to inherit the version from the parent pom. Previously, we used a mix of 0.1-SNAPSHOT and 1.0-SNAPSHOT. This now uses 4.0.0-SNAPSHOT across the board. With each release, we can use "mvn versions:set" to update the versions. The only exception is the Hive UDF code that we build for testing. This remains at version 1.0 to avoid test changes. Testing: - Ran core job - Added build-all-flag-combinations.sh case that does "mvn versions:set" and runs a build Change-Id: I661b32e1e445169bac2ffe4f9474f14090031743 Reviewed-on: http://gerrit.cloudera.org:8080/16559 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-10-15 19:30:13 +00:00
Joe McDonnell	97856478ec	IMPALA-10198 (part 1): Unify Java in a single java/ directory This changes all existing Java code to be submodules under a single root pom. The root pom is impala-parent/pom.xml with minor changes to add submodules. This avoids most of the weird CMake/maven interactions, because there is now a single maven invocation for all the Java code. This moves all the Java projects other than fe into a top level java directory. fe is left where it is to avoid disruption (but still is compiled via the java directory's root pom). Various pieces of code that reference the old locations are updated. Based on research, there are two options for dealing with the shaded dependencies. The first is to have an entirely separate Maven project with a separate Maven invocation. In this case, the consumers of the shaded jars will see the reduced set of transitive dependencies. The second is to have the shaded dependencies as modules with a single Maven invocation. The consumer would see all of the original transitive dependencies and need to exclude them all. See MSHADE-206/MNG-5899. This chooses the second. This only moves code around and does not focus on version numbers or making "mvn versions:set" work. Testing: - Ran a core job - Verified existing maven commands from fe/ directory still work - Compared the *-classpath.txt files from fe and executor-deps and verified they are the same except for paths Change-Id: I08773f4f9d7cb269b0491080078d6e6f490d8d7a Reviewed-on: http://gerrit.cloudera.org:8080/16500 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2020-10-15 19:30:13 +00:00
Joe McDonnell	1f3160b4c0	IMPALA-8304: Generate JUnitXML if a command run by CMake fails This wraps each command executed by CMake with a wrapper that generates a JUnitXML file if the command fails. If the command succeeds, the wrapper does nothing. The wrapper applies to C++ compilation, linking, and custom shell commands (such as building the frontend via maven). It does not apply to failures coming from CMake itself. It can be disabled by setting DISABLE_CMAKE_JUNITXML. The command output can include Unicode (e.g. smart quotes for g++), so this also updates generate_junitxml.py to handle Unicode. The wrapper interacts poorly with add_custom_command/add_custom_target CMake commands that use 'cd directory && do_something', so this switches those locations (in /docker) to use CMake's WORKING_DIRECTORY. Testing: - Verified it does not impact a successful build (including with ccache and/or distcc). - Verified it generates JUnitXML for C++ and Java compilation failures. - Verified it doesn't use the wrapper when DISABLE_CMAKE_JUNITXML is set. Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Reviewed-on: http://gerrit.cloudera.org:8080/12668 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-10-09 15:52:05 +00:00
Sahil Takiar	a2d5471cd5	IMPALA-10016: Split jars for Impala exec and coord Docker images Maven Changes: Splits out all executor specific jar files into a separate pom file under mvn-deps/executor-deps. The new pom file lists out all executor specific jar files. fe/pom.xml has a dependency on mvn-deps/executor-deps/pom.xml so that all executor specific jars are still built as part of the fe/ build. mvn-deps/executor-deps/pom.xml writes out a build-classpath.txt file that contains all dependencies in the pom.xml file (similar to what is already done in fe/pom.xml). Docker Build Changes: setup_build_context.py was changed to leverage the aforementioned Maven changes. The script still symlinks all dependencies into the lib/ folder, but also creates an exec-lib/ and statestore-lib/ folder. The exec-lib/ folder contains all dependencies necessary to run Impala Executors, but excludes any dependencies that are Coordinator specific. The statestore-lib/ folder excludes all jar files entirely since it does not run an embedded JVM. The docker/CMakeLists.txt was modified to support the new library layout created by setup_build_context.py. Prior to this patch only the build for the Impala base image had access to the dependencies created by setup_build_context.py. This patch changes the build logic so all images have access to the dependencies. This does increase build time because the build context has to be copied and sent to the Docker daemon for each image build. Docker Image Changes: The copy command for the lib/ folder was removed from the impala_base Dockerfile and a corresponding copy command was added to each daemon Docker image. This allows each daemon image to only copy in the dependencies it actually requires to run. Other: * Deleted the hive-3 profile since Impala 4.0 only supports hive-3 builds * Moved shaded-deps into the mvn-deps folder Overall, this decreases the size of the impalad_executor image by 120 MB, and the statestored image by 700 MB. impalad_coordinator and impalad_coordinator images are now 771 MB, and impalad_executor images are 651MB. Further improvements might be possible by decreasing the number of transitive dependencies in mvn-deps/executor-deps/pom.xml. Moreover, any new Coordinator specific jar files will not be included in the Executor image. Testing: * Ran core tests Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Reviewed-on: http://gerrit.cloudera.org:8080/16320 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Sahil Takiar <stakiar@cloudera.com>	2020-10-08 23:11:52 +00:00
Sahil Takiar	3e77650dcf	IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries Strip debug symbols from libkudu_client.so and libstdc++.so. The same technique used to strip debug symbols from impalad binaries is used. This decreases the Docker image sizes by about 100 MB. Test: * Ran Dockerized tests Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Reviewed-on: http://gerrit.cloudera.org:8080/16263 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-08-12 20:20:34 +00:00
Joe McDonnell	19f16a0f48	Fix concurrency for docker-based tests on 140+GB memory machines A prior change increased the suite concurrency for the docker-based tests on machines with 140+GB of memory. This new rung should also bump the parallel test concurrency (i.e. for parallel EE tests). This sets the parallel test concurrency to 12 for this rung (which is what we use for the 95GB-140GB rung). Testing: - Ran test-with-docker.py on a m5.12xlarge Change-Id: Ib7299abd585da9ba1a838640dadc0bef9c72a39b Reviewed-on: http://gerrit.cloudera.org:8080/16326 Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2020-08-11 16:41:43 +00:00
Tim Armstrong	b29cb4ca82	IMPALA-10006: handle non-writable /opt/impala/logs The shutdown script should not abort if it can't write a log - it should continue to try and shut down impala. The entrypoint script should abort with an explicit error if the log directory isn't writable by the current user. Change-Id: If32d6eef75422b51f8877478bbfb1a709c02f756 Reviewed-on: http://gerrit.cloudera.org:8080/16237 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Attila Jeges <attilaj@cloudera.com> Reviewed-by: Andrew Sherman <asherman@cloudera.com>	2020-07-30 17:46:44 +00:00
Tim Armstrong	a11b8b687a	IMPALA-9790: option to use resolved hostname everywhere This adds a flag --use_resolved_hostname, which replaces --hostname with a resolved IP on startup. This is useful for containerized environments where the hostname -> IP mapping can be very dynamic. This flag is used by default in the dockerized minicluster. This also fixes a bug in the test code that incorrectly identified command line flags. Specifically it only checked the suffix, so it confused use_resolved_hostname and hostname. Change-Id: I0d5cb9c68c60ce8dc838cde9dcf1c590017f5c9a Reviewed-on: http://gerrit.cloudera.org:8080/16108 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Andrew Sherman <asherman@cloudera.com>	2020-06-26 19:46:15 +00:00
Tim Armstrong	6ec6aaae8e	IMPALA-3695: Remove KUDU_IS_SUPPORTED Testing: Ran exhaustive tests. Change-Id: I059d7a42798c38b570f25283663c284f2fcee517 Reviewed-on: http://gerrit.cloudera.org:8080/16085 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-06-18 01:11:18 +00:00
Joe McDonnell	f15a311065	IMPALA-9709: Remove Impala-lzo from the development environment This removes Impala-lzo from the Impala development environment. Impala-lzo is not built as part of the Impala build. The LZO plugin is no longer loaded. LZO tables are not loaded during dataload, and LZO is no longer tested. This removes some obsolete scan APIs that were only used by Impala-lzo. With this commit, Impala-lzo would require code changes to build against Impala. The plugin infrastructure is not removed, and this leaves some LZO support code in place. If someone were to decide to revive Impala-lzo, they would still be able to load it as a plugin and get the same functionality as before. This plugin support may be removed later. Testing: - Dryrun of GVO - Modified TestPartitionMetadataUncompressedTextOnly's test_unsupported_text_compression() to add LZO case Change-Id: I3a4f12247d8872b7e14c9feb4b2c58cfd60d4c0e Reviewed-on: http://gerrit.cloudera.org:8080/15814 Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2020-06-15 23:42:12 +00:00


109 Commits