21 Commits

Author SHA1 Message Date
Michael Smith
7d07192e89 IMPALA-9627: Use universal_newlines for Python 3
Fixes subprocess.check_output calls for Python 3 using
universal_newlines=True.
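
For illustration only (this is not the code touched by the commit), the
difference can be sketched as follows; the command being run is an arbitrary
placeholder:

```python
import subprocess
import sys

# Placeholder command: have the current interpreter print one line.
cmd = [sys.executable, "-c", "print('hello')"]

# On Python 3, check_output returns bytes by default.
raw = subprocess.check_output(cmd)

# With universal_newlines=True the output is decoded to str and line
# endings are normalized to "\n", matching what Python 2 callers expect.
text = subprocess.check_output(cmd, universal_newlines=True)
```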

Change-Id: I3dae9113635cf23ae02f1f630de311e64119c456
Reviewed-on: http://gerrit.cloudera.org:8080/19812
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-04-28 23:28:49 +00:00
Michael Smith
0a42185d17 IMPALA-9627: Update utility scripts for Python 3 (part 2)
We're starting to see environments where the system Python ('python') is
Python 3. Updates utility and build scripts to work with Python 3, and
updates check-pylint-py3k.sh to check scripts that use system python.

Fixes other issues found during a full build and test run with Python
3.8 as the default for 'python'.

Fixes an impala-shell tip that was supposed to have been two tips (and
had no space after period when they were printed).

Removes out-of-date deploy.py and various Python 2.6 workarounds.

Testing:
- Full build with /usr/bin/python pointed to python3
- run-all-tests passed with python pointed to python3
- ran push_to_asf.py

Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb
Reviewed-on: http://gerrit.cloudera.org:8080/19697
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-04-26 18:52:23 +00:00
Joe McDonnell
ba3518366a IMPALA-11952 (part 4): Fix odds and ends: Octals, long, lambda, etc.
There are a variety of small python 3 syntax differences:
 - Octal constants need to start with 0o rather than just 0
 - Long constants are not supported (i.e. numbers ending with L)
 - Lambda syntax is slightly different
 - The 'ur' string mode is no longer supported
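
A short sketch of the Python 3 spellings, with values chosen only for
illustration. The lambda case assumes the difference meant here is the removal
of tuple parameters, which the message does not spell out:

```python
import re

# Octal: Python 2 accepted 0755; Python 3 requires the 0o prefix.
mode = 0o755

# Long: Python 3 has a single int type, so a trailing L (e.g. 10L) is an error.
big = 2 ** 64

# Lambda: Python 2 allowed `lambda (k, v): k + v`; Python 3 takes one
# argument and unpacks it explicitly.
add_pair = lambda kv: kv[0] + kv[1]

# ur'' is gone; a plain raw string is already Unicode text in Python 3.
digits = re.compile(r"\d+")
```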

Testing:
 - check-python-syntax.sh now passes

Change-Id: Ie027a50ddf6a2a0db4b34ec9b49484ce86947f20
Reviewed-on: http://gerrit.cloudera.org:8080/19554
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Laszlo Gaal
68650057a1 Speed up default configuration for Docker-based tests
Docker-based parallelized test runs have proven themselves to be quite a
bit faster than regular core or exhaustive mode builds. While regular
sequential builds have also enjoyed shorter runtimes recently,
Docker-based parallel builds still enjoy a speed advantage.

Scheduling the parallel build segments is currently driven from the
test driver script test-with-docker.py, and the order in which the
segments are considered is currently hard-coded. The ordering was
originally devised experimentally, by timing several test runs, then
ordering the test segments based on expected duration, from longest
to shortest.
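
The longest-first ordering can be sketched as below. The segment names and
durations are hypothetical, and the real script hard-codes its order rather
than sorting at run time:

```python
def order_longest_first(expected_minutes):
    """Return segment names sorted from longest to shortest expected duration."""
    return sorted(expected_minutes, key=expected_minutes.get, reverse=True)


# Hypothetical durations in minutes, for illustration only.
segments = {"FE": 70, "BE": 40, "EE_SERIAL": 90, "JDBC": 10}
order = order_longest_first(segments)
```

Starting the longest segments first keeps a long segment from being the last
one to start and stretching the overall wall-clock time.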

The average wall-clock run times for various test segments have changed
since this original ordering was committed: FE tests have gotten
significantly longer, while upgrading the default worker instance
type shortened the serial phase(s) of E2E tests.

This patch makes two changes to achieve a shorter overall run time for
the Docker-based tests:
1. Reorders the default scheduling order of the test segments, based
   on currently measured durations
2. Increases the default suite concurrency for execution hosts:
   bumps suite concurrency from 4 to 5 for machines with memory sizes
   between 96 and 140 GBs (the currently used worker size)

The latter change is also based on measurements: memory usage reports for
total peak memory (RSS) and peak memory (RSS) per test segment both
showed significant amounts of unused memory on the current default
worker instance size (having 32 CPUs and 128 GB of RAM).
Experiments showed that this machine size can reliably handle five
concurrent containerized test sessions with some safety margin remaining,
so the patch increases the default concurrency for this machine
category.
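
A minimal sketch of memory-based concurrency selection. Only the value 5 for
the 96-140 GB range comes from this message; the function name, the other
rungs, and the fallback are assumptions:

```python
def suite_concurrency(mem_gb, rungs):
    """Return the concurrency of the first rung whose memory floor is met.

    `rungs` is a list of (min_mem_gb, concurrency) pairs, highest floor first.
    """
    for min_mem_gb, concurrency in rungs:
        if mem_gb >= min_mem_gb:
            return concurrency
    return 1  # assumed fallback for very small machines


# Only the (96, 5) rung reflects the text above; the rest are placeholders.
RUNGS = [(140, 6), (96, 5), (0, 1)]
```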

With both changes applied, the duration of a core-mode test run with
default settings is reduced from 2h45 to 2h25 (on average).

Tested by running the Docker-based default test suite in core mode,
with Ubuntu 16.04 and Rocky Linux 8.5 base images.

Change-Id: Ifb609bcfb10e9f9b281cc6b375c36c9638db168b
Reviewed-on: http://gerrit.cloudera.org:8080/19038
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-29 21:53:54 +00:00
stiga-huang
35375b3287 IMPALA-2019(part-4): Add UTF-8 support for case conversion functions
There are 3 builtin case conversion string functions: upper(), lower(),
and initcap(). Previously they only convert English alphabetic
characters. This patch adds support to deal with Unicode characters.

There are many corner cases in case conversion depending on the locale
and context. E.g.
1) Case conversion is locale-sensitive.
Turkish has 4 letter "I"s. English has only two, a lowercase dotted i
and an uppercase dotless I. Turkish has lowercase and uppercase forms of
both dotted and dotless I. So simply converting "i" to "I" for upper
case is wrong in Turkish:
    +-------+--------+---------+
    |       | Dotted | Dotless |
    +-------+--------+---------+
    | Upper | İ      | I       |
    +-------+--------+---------+
    | Lower | i      | ı       |
    +-------+--------+---------+

2) Case conversion may change a string's length.
The German word "grüßen" should be converted to "GRÜSSEN" in upper case:
the letter "ß" should be converted to "SS".

3) Case conversion is context-sensitive.
The Greek word "ὈΔΥΣΣΕΎΣ" should be converted to "ὀδυσσεύς", where the
Greek letter "Σ" is converted to "σ" or to "ς", depending on its
position in the word.

The above cases will be the focus of follow-up JIRAs. This patch adds the
initial implementation of UTF-8 aware case conversion functions.
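
The corner cases can be observed with Python's built-in string methods, which
apply locale-independent Unicode rules. This shows the behaviour of Python
only, not of the Impala builtins:

```python
# Length change: "ß" expands to "SS".
german = "grüßen".upper()

# Context sensitivity: a word-final capital sigma lowercases to "ς",
# while the other two become "σ".
greek = "ὈΔΥΣΣΕΎΣ".lower()

# Locale sensitivity: Python ignores the locale, so "i" always becomes "I",
# which is the wrong answer for Turkish text.
turkish = "i".upper()
```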

--------
Implementation:
In the UTF-8 mode of these functions (turned on by set UTF8_MODE=true), the
bytes in strings are converted to wide characters using std::mbrtowc().
Each wide character (wchar_t) will then be converted using std::towupper
or std::towlower correspondingly. We then convert them back to multi
bytes using std::wcrtomb().
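
A rough Python analogue of this decode, map, encode pipeline is sketched
below. It is not the C++ implementation: the function name is made up, and
unlike the C functions it does not depend on the process locale. It does
mimic the one-character-in, one-character-out limitation of towupper:

```python
def upper_one_to_one(data: bytes) -> bytes:
    """Uppercase UTF-8 bytes one code point at a time, keeping only 1:1 mappings."""
    out = []
    for ch in data.decode("utf-8"):      # bytes -> code points (cf. mbrtowc)
        up = ch.upper()
        # A per-character mapping cannot grow the string (cf. towupper),
        # so expansions such as "ß" -> "SS" are left unconverted.
        out.append(up if len(up) == 1 else ch)
    return "".join(out).encode("utf-8")  # code points -> bytes (cf. wcrtomb)
```

Under this scheme "grüßen" becomes "GRÜßEN" rather than "GRÜSSEN", which is
why the length-changing case is left to follow-up work.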

Note that these builtins are locale aware. If impalad is launched
without a UTF-8 aware locale, e.g. LC_ALL="C", these builtins can't
recognize non-ascii characters, which will return unexpected results.
Thus we modify our docker images to set LC_ALL="C.UTF-8" instead of "C".
This patch also logs the current locale when launching impala daemons
for better debugging. We will support customized locale in IMPALA-11080.

Test:
 - Add BE unit tests and e2e tests.

Change-Id: I443e89d46f4638ce85664b021666bc4f03ee8abd
Reviewed-on: http://gerrit.cloudera.org:8080/17785
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-15 18:40:59 +00:00
Joe McDonnell
19f16a0f48 Fix concurrency for docker-based tests on 140+GB memory machines
A prior change increased the suite concurrency for the
docker-based tests on machines with 140+GB of memory.
This new rung should also bump the parallel test
concurrency (i.e. for parallel EE tests). This sets
the parallel test concurrency to 12 for this rung
(which is what we use for the 95GB-140GB rung).

Testing:
 - Ran test-with-docker.py on a m5.12xlarge

Change-Id: Ib7299abd585da9ba1a838640dadc0bef9c72a39b
Reviewed-on: http://gerrit.cloudera.org:8080/16326
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2020-08-11 16:41:43 +00:00
Joe McDonnell
f15a311065 IMPALA-9709: Remove Impala-lzo from the development environment
This removes Impala-lzo from the Impala development environment.
Impala-lzo is not built as part of the Impala build. The LZO plugin
is no longer loaded. LZO tables are not loaded during dataload,
and LZO is no longer tested.

This removes some obsolete scan APIs that were only used by Impala-lzo.
With this commit, Impala-lzo would require code changes to build
against Impala.

The plugin infrastructure is not removed, and this leaves some
LZO support code in place. If someone were to decide to revive
Impala-lzo, they would still be able to load it as a plugin
and get the same functionality as before. This plugin support
may be removed later.

Testing:
 - Dryrun of GVO
 - Modified TestPartitionMetadataUncompressedTextOnly's
   test_unsupported_text_compression() to add LZO case

Change-Id: I3a4f12247d8872b7e14c9feb4b2c58cfd60d4c0e
Reviewed-on: http://gerrit.cloudera.org:8080/15814
Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2020-06-15 23:42:12 +00:00
Laszlo Gaal
88c2f9a526 Bump test-with-docker test concurrency for large instances
The concurrency limits (i.e. how many concurrent Docker containers are
running test shards at the same time) were conservative at the high end:
the largest memory configuration they considered was under 100 GBs.
Bump these limits for the usual m5.12xlarge test worker that has 192 GBs
of RAM, of which about 186 GBs are available.
Also, swap the order of FE and BE tests, as FE tests have now grown
pretty long with the long delay in AuthorizationStmtTest.

Test: ran test-with-docker.py with all default parameters. Verified that
default concurrency was 6 on an m5.12xlarge and core-mode tests passed
in an Ubuntu 16.04 container.

Change-Id: I5c03a78ee65d09212d9bfa007e87fd069cdaabb6
Reviewed-on: http://gerrit.cloudera.org:8080/15834
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
2020-05-14 08:20:17 +00:00
Philip Zeyliger
365e35a36f Using 'master' branch of Impala-lzo and allowing test-with-docker to configure it.
This updates bootstrap_system.sh to check out the 'master' branch of
Impala-lzo. (I've separately updated the 'master' branch to
be identical to today's cdh5-trunk branch; it had grown a few
years stale.) I've also added support for passing the configuration
through test-with-docker.

This allows for Impala 2.x and 3.x to diverge here, and it allows
for testing changes to Impala-lzo.

Change-Id: Ieba45fc18d9e490f75d16c477cdc1cce26f41ce9
Reviewed-on: http://gerrit.cloudera.org:8080/12259
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-01-23 23:28:10 +00:00
Laszlo Gaal
12ce20b09d IMPALA-7913: Separate ccache TEMP directories by Docker container
ccache v3.1 (the default version for CentOS 6) has a problem when
multiple copies are run inside concurrent Docker containers: it
can get confused when creating/using temporary files. Version 3.2
and later are free of this problem, see:
https://ccache.samba.narkive.com/o4BSOjxG/shared-ccache-directory-between-docker-containers

This patch points each copy of ccache to a separate, private temporary
directory by passing an explicit CCACHE_TEMPDIR environment variable
to each launched container.
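
A sketch of how such per-container arguments could be built. The helper name
and the base path are hypothetical; "-e NAME=VALUE" is the standard Docker
option for setting an environment variable:

```python
def ccache_env_args(container_name, base="/tmp/ccache-tmp"):
    """Build docker arguments giving one container a private ccache temp dir.

    `base` is a hypothetical location, not the path used by the patch.
    """
    return ["-e", "CCACHE_TEMPDIR={}/{}".format(base, container_name)]
```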

Verified by looking into each running container using
"docker exec -it .... /bin/bash", checking the value of CCACHE_TEMPDIR
and observing tempfile traffic within the directory.

Change-Id: I8e6f1e31ca9419224a2a73a3e5ff46b004bb10c6
Reviewed-on: http://gerrit.cloudera.org:8080/12030
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-12-10 17:35:03 +00:00
Philip Zeyliger
de0c6bd6bd test-with-docker: allow built images to be used with "docker run" easily.
Configures the built container to enter into a script that
starts the minicluster. As a result, "docker run -ti <container>" will
launch the user into a shell with the Impala minicluster and the
impala development cluster running.

To handle cases where users don't specify --privileged, we skip
Kudu if NTP seems unavailable.

Change-Id: Ib8d6a28d4cb4ab019cd72415024b23374a6d9e2f
Reviewed-on: http://gerrit.cloudera.org:8080/11781
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-26 18:44:58 +00:00
Philip Zeyliger
c1701074d6 IMPALA-7698: Add centos support to bootstrap_system.
Largely, the changes involve conditionalizing some invocations to
account for differences between RH and Ubuntu. The trickiest bits were
timezone-related test errors (see below), postgresql permissions (need
to accept md5 passwords from localhost) and default ulimits (1024 user
processes/threads is not enough).

To test this, I built using test-with-docker. In addition to the
ulimit issue, I ran into the fact that /tmp needed 1777 permissions for
the postgresql socket, and entrypoint.sh had a few places that needed
special cases. At the moment, the data load ran fine, as did most of the
tests. I observed a test that relied on a python2.7-ism fail, which is
part of the point of this.

In the course of development, I encountered a handful of tests failing with
"Encounter parse error: failed to open /usr/share/zoneinfo/GMT-08:00 -
No such file or directory.", which was reproduced as follows:

    [localhost:21000] default> use functional_orc_def; select * from alltypes;
    ...
    WARNINGS: Encounter parse error: failed to open /usr/share/zoneinfo/GMT-08:00 - No such file or directory.

With Quanlong's help, I learned what was happening. test-with-docker was
translating my time zone (America/Los_Angeles) to US/Pacific-New,
because realpath(/etc/localtime) = US/Pacific-New. This timezone exists
in centos:6, so that wasn't a problem. However, this timezone does not
exist in the package "tzdata-java", which is the copy of the timezone
information used by Java. (There are bugs here that may have been fixed
in centos:7.) As a result, when ORC asks (by using
TimeZone.getDefault().getID()) the JDK
(src/solaris/native/java/util/TimeZone_md.c) for the default timezone,
it can't find the same name as /etc/localtime points to in its
repository and defaults to "GMT-08:00". This string then gets written
into the ORC files generated by Hive as part of data load, and then the
C++ library can't read them. This is fixed by changing "realpath"
to "readlink" in test-with-docker.py.
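
The difference between the two calls can be reproduced with a simulated
zoneinfo tree. The directory layout below is an assumption made for the demo,
not a copy of any real system:

```python
import os
import tempfile


def readlink_vs_realpath():
    """Return (readlink result, realpath result) relative to a fake zoneinfo dir."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)
        zoneinfo = os.path.join(tmp, "zoneinfo")
        os.makedirs(os.path.join(zoneinfo, "US"))
        os.makedirs(os.path.join(zoneinfo, "America"))

        # The real file, and a well-known name that is itself a symlink to it.
        target = os.path.join(zoneinfo, "US", "Pacific-New")
        open(target, "w").close()
        alias = os.path.join(zoneinfo, "America", "Los_Angeles")
        os.symlink(target, alias)

        # Stand-in for /etc/localtime.
        localtime = os.path.join(tmp, "localtime")
        os.symlink(alias, localtime)

        one_hop = os.readlink(localtime)           # follows a single link
        fully_resolved = os.path.realpath(localtime)  # follows every link
        return (os.path.relpath(one_hop, zoneinfo),
                os.path.relpath(fully_resolved, zoneinfo))
```

readlink keeps the name the administrator chose, while realpath collapses it
to whatever file the chain ends at.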

centos:7 is not addressed by this change. The move to systemd makes
"service sshd start" (and the same for postgresql) not work, and
additional care needs to be done to work around that.

This change is a joint effort with Laszlo Gaal.

Change-Id: Id54294d7607f51de87a9de373dcfc4a33f4bedf5
Reviewed-on: http://gerrit.cloudera.org:8080/11731
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-26 08:43:22 +00:00
Philip Zeyliger
1dbb3c1be6 test-with-docker: add --env option to pass through env variables
Adds simple mechanism to pass environment variables into
test-with-docker, which is occasionally useful, especially for
development and tinkering with tests. It's typically the right thing to
codify the environment variables into tests, but a pass through can be
handy.

The implementation is simple enough, passing the
variables into the docker containers.
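
One way such a pass-through could look is sketched below. Only the --env flag
name comes from this commit; the NAME=VALUE form, the repeatable flag, and the
helper name are assumptions:

```python
import argparse

parser = argparse.ArgumentParser()
# Assumed semantics: the flag may be repeated, each value being NAME=VALUE.
parser.add_argument("--env", action="append", default=[], metavar="NAME=VALUE")


def docker_env_args(env_values):
    """Translate collected --env values into docker "-e" arguments."""
    args = []
    for value in env_values:
        args += ["-e", value]
    return args


opts = parser.parse_args(["--env", "FOO=1", "--env", "BAR=baz"])
docker_args = docker_env_args(opts.env)
```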

Change-Id: I03c2feda8edc2983e423f862ed210fabb845714f
Reviewed-on: http://gerrit.cloudera.org:8080/11730
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-25 23:52:58 +00:00
Laszlo Gaal
ee3da43709 Prettify the timeline produced by test-with-docker.py
The change tweaks the HTML template for the timeline summary
to make it slightly more readable:
- Adds legend strings to the CPU graphs
- Inserts the test run name into the CPU chart title to clarify
  which chart shows which build/test phase
- Stretches the CPU charts a bit wider
- Identifies the common prefix of the phase/container names (the build
  name) and deletes it from the chart labels. This increases legibility
  by cutting down on noise and growing the chart real estate.

  To support this change the Python drivers are also changed:
  the build name parameter, which is the common prefix, is passed
  to monitor.py and written to the JSON output

- The name of the build and data load phase container is suffixed with
  "-build" so that it shares the naming convention for the other
  containers.
- The timeline graph section is sized explicitly by computing the height
  from the number of distinct tasks. This avoids having a second scrollbar
  for the timeline, which is annoying.
  The formula is pretty crude: it uses empirical constants, but produces
  an OK layout for the default font sizes in Chrome (both on Linux
  and the Mac).
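
The prefix trimming described above can be sketched in Python as follows. The
container names are hypothetical, and the template itself may do this
differently:

```python
import os


def strip_common_prefix(names):
    """Return (prefix, labels) with the shared leading characters removed."""
    # commonprefix compares character by character, not by path component.
    prefix = os.path.commonprefix(names)
    return prefix, [name[len(prefix):] for name in names]


# Hypothetical container names sharing the build name "mybuild-".
prefix, labels = strip_common_prefix(
    ["mybuild-build", "mybuild-fe-test", "mybuild-be-test"])
```

Because the comparison is character-wise, names that happen to share extra
leading letters would lose them too, which is one reason to pass the build
name explicitly instead of guessing it.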

Tested so far by tweaking the HTML template and an HTML result file
from an earlier build.

Change-Id: I7a41bea762b0e33f3d71b0be57eedbacb19c680c
Reviewed-on: http://gerrit.cloudera.org:8080/11578
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-09 19:12:50 +00:00
Philip Zeyliger
91673fee60 IMPALA-7624: Workaround docker/kernel bug causing test-with-docker to sometimes hang.
I've observed that builds of test-with-docker that have "suite
parallelism" sometimes hang when the Docker containers are
being created. (The implementation had multiple threads calling
"docker create" simultaneously.) Trolling the mailing lists,
it's maybe a bug in Docker or the kernel. I've never caught
it live enough to strace it.

A hopeful workaround is to serialize the docker create calls, which is
easy and harmless, given that "docker create" is usually pretty quick
(subsecond) and the overall run time here is hours+.
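
Serializing the calls can be sketched with a single lock. The names here are
illustrative, and create_fn stands in for whatever wraps "docker create":

```python
import threading

_create_lock = threading.Lock()


def create_serialized(create_fn, *args):
    """Run create_fn one call at a time, even when invoked from many threads."""
    with _create_lock:
        return create_fn(*args)
```

Only the creation step is serialized; the containers themselves still run in
parallel once created.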

With this change, I was able to run test-with-docker with
--suite-concurrency=6 on a c5.9xlarge in AWS, with a total runtime of
1h35m.

The hangs are intermittent and cause, in the typical case, inconsistency
in runtimes because less parallelism happens when one of the "docker
create" calls hangs. (I've seen them resume after one of the other
containers finishes.) We'll find out with time whether this stabilizes
it or has no effect.

Change-Id: I3e44db7a6ce08a42d6fe574d7348332578cd9e51
Reviewed-on: http://gerrit.cloudera.org:8080/11481
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-26 02:20:45 +00:00
Philip Zeyliger
cf5de09761 IMPALA-7385: Fix test-with-docker errors having to do with time zones.
ExprTest.TimestampFunctions,
query_test.test_scanners.TestOrc.test_type_conversions, and
query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node were all
failing when using test-with-docker with mismatched dates.

As it turns out, there is code that calls readlink(/etc/localtime)
and parses the output to identify the current timezone name.
This is described in localtime(5) on Ubuntu16:

  It should be an absolute or relative symbolic link pointing to
  /usr/share/zoneinfo/, followed by a timezone identifier such as
  "Europe/Berlin" or "Etc/UTC". ...  Because the timezone identifier is
  extracted from the symlink target name of /etc/localtime, this file
  may not be a normal file or hardlink.

To honor this requirement, and to make the tests pass, I re-jiggered
how I pass the time zone information from the host into the container.
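
The parsing convention quoted above can be sketched as follows. The function
name is hypothetical and this is not the code under test:

```python
def timezone_from_link(target):
    """Extract the identifier that follows "zoneinfo/" in a symlink target."""
    marker = "zoneinfo/"
    index = target.rfind(marker)
    if index == -1:
        raise ValueError("not a zoneinfo link: %r" % target)
    return target[index + len(marker):]
```

On a host it would be called as timezone_from_link(os.readlink("/etc/localtime")),
which fails when /etc/localtime is a regular file, matching the man page's
warning.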

The previously failing tests now pass.

Change-Id: Ia9facfd9741806e7dbb868d8d06d9296bf86e52f
Reviewed-on: http://gerrit.cloudera.org:8080/11106
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-06 22:41:02 +00:00
Philip Zeyliger
abf6f8f465 Fix TestKuduOperations tests in test-with-docker by using consistent hostname.
TestKuduOperations, when run using test-with-docker, failed with errors
like:

  Remote error: Service unavailable: Timed out: could not wait for desired
  snapshot timestamp to be consistent: Tablet is lagging too much to be able to
  serve snapshot scan. Lagging by: 1985348 ms, (max is 30000 ms):

The underlying issue, as discovered by Thomas Tauber-Marshall, is that Kudu
serializes the hostnames of Kudu tablet servers, and, different containers in
test-with-docker use different hostnames.  This was exposed after "IMPALA-6812:
Fix flaky Kudu scan tests" switched to using READ_AT_SNAPSHOT for Kudu reads.

Using the same hostname for all the containers is easy and harmless;
this change does just that.

Change-Id: Iea8c5096b515a79601be2e919d32585fb2796b3d
Reviewed-on: http://gerrit.cloudera.org:8080/11082
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-01 01:25:38 +00:00
Philip Zeyliger
85ed7ae88b IMPALA-6070: Adding ASAN, --tail to test-with-docker.
* Adds -ASAN suites to test-with-docker.
* Adds --tail flag, which starts a tail subprocess. This
  isn't pretty (there's potential for overlap), but it's a dead simple
  way to keep an eye on what's going on.
* Fixes a bug wherein I could call "docker rm <container>" twice
  simultaneously, which would make Docker fail the second call,
  and then fail the related "docker rmi". It's better to serialize,
  and I did that with a simple lock.

Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Reviewed-on: http://gerrit.cloudera.org:8080/10319
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-05-19 00:37:50 +00:00
Philip Zeyliger
6454b74d2e test-with-docker: work with git worktree
This commit adds a little of git-wrangling to allow test-with-docker to
work when invoked from git directories managed by "git worktree". These
are different in that they reference another git directory elsewhere on
the file system, which also needs to be mounted into the container.
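
In a linked worktree, ".git" is a small file containing a "gitdir: <path>"
line rather than a directory. A sketch of detecting that case follows; the
helper name is made up, and the commit may locate the directory differently:

```python
import os


def worktree_gitdir(checkout):
    """Return the external git directory a linked worktree points at.

    Returns None when `.git` is an ordinary directory.
    """
    dot_git = os.path.join(checkout, ".git")
    if os.path.isdir(dot_git):
        return None
    with open(dot_git) as f:
        line = f.readline().strip()
    prefix = "gitdir: "
    if not line.startswith(prefix):
        raise ValueError("unexpected .git file contents: %r" % line)
    return line[len(prefix):]
```

The returned path lies outside the checkout, so it has to be mounted into the
container as well.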

Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Reviewed-on: http://gerrit.cloudera.org:8080/10335
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-05-18 02:18:38 +00:00
Philip Zeyliger
2e6a63e31e IMPALA-6070: Further improvements to test-with-docker.
This commit tackles a few additions and improvements to
test-with-docker. In general, I'm adding workloads (e.g., exhaustive,
rat-check), tuning memory setting and parallelism, and trying to speed
things up.

Bug fixes:

* Embarrassingly, I was still skipping thrift-server-test in the backend
  tests. This was a mistake in handling feedback from my last review.

* I made the timeline a little bit taller to clip less.

Adding workloads:

* I added the RAT licensing check.

* I added exhaustive runs. This led me to model the suites a little
  bit more in Python, with a class representing a suite with a
  bunch of data about the suite. It's not perfect and still
  coupled with the entrypoint.sh shell script, but it feels
  workable. As part of adding exhaustive tests, I had
  to re-work the timeout handling, since now different
  suites meaningfully have different timeouts.

Speed ups:

* To speed up test runs, I added a mechanism to split py.test suites into
  multiple shards with a py.test argument. This involved a little bit of work in
  conftest.py, and exposing $RUN_CUSTOM_CLUSTER_TESTS_ARGS in run-all-tests.sh.

  Furthermore, I moved a bit more logic about managing the
  list of suites into Python.

* Doing the full build with "-notests" and only building
  the backend tests in the relevant target that needs them. This speeds
  up "docker commit" significantly by removing about 20GB from the
  container.  I had to indicate that expr-codegen-test depends on
  expr-codegen-test-ir, which was missing.

* I sped up copying the Kudu data: previously I did
  both a move and a copy; now I'm doing a move followed by a move. One
  of the moves is cross-filesystem so is slow, but this does half the
  amount of copying.
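
The sharding idea from the first speed-up can be sketched as a deterministic
round-robin split. This illustrates the technique only; the function name is
hypothetical and the logic in conftest.py may differ:

```python
def select_shard(test_ids, shard_index, num_shards):
    """Return the tests belonging to one shard.

    Sorting first makes the split deterministic, so every container computes
    the same partition independently.
    """
    ordered = sorted(test_ids)
    return [t for i, t in enumerate(ordered) if i % num_shards == shard_index]
```

Each test lands in exactly one shard, so running all shards covers the suite
once.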

Memory usage:

* I tweaked the memlimit_gb settings to have a higher default. I've been
  fighting empirically to have the tests run well on c4.8xlarge and
  m4.10xlarge.

The more memory a minicluster and test suite run uses, the fewer parallel
suites we can run. By observing the peak processes at the tail of a run (with a
new "memory_usage" function that uses a ps/sort/awk trick) and by observing
peak container total_rss, I found that we had several JVMs that
didn't have Xmx settings set. I added Xms/Xmx settings in a few
places:

 * The non-first Impalad does very little JVM work, so having
   an Xmx keeps it small, even in the parallel tests.
 * Datanodes do work, but they essentially were never garbage
   collecting, because JVM defaults let them use up to 1/4th
   the machine memory. (I observed this based on RSS at the
   end of the run; nothing fancier.) Adding Xms/Xmx settings
   helped.
 * Similarly, I piped the settings through to HBase.

A few daemons still run without resource limitations, but they don't
seem to be a problem.

Change-Id: I43fe124f00340afa21ad1eeb6432d6d50151ca7c
Reviewed-on: http://gerrit.cloudera.org:8080/10123
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-04-26 20:47:29 +00:00
Philip Zeyliger
2896b8d127 IMPALA-6070: Expose using Docker to run tests faster.
Allows running the tests that make up the "core" suite in about 2 hours.
By comparison, https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/buildTimeTrend
tends to run in about 3.5 hours.

This commit:
* Adds "echo" statements in a few places, to facilitate timing.
* Adds --skip-parallel/--skip-serial flags to run-tests.py,
  and exposes them in run-all-tests.sh.
* Marks TestRuntimeFilters as a serial test. This test runs
  queries that need > 1GB of memory, and, combined with
  other tests running in parallel, can kill the parallel test
  suite.
* Adds "test-with-docker.py", which runs a full build, data load,
  and executes tests inside of Docker containers, generating
  a timeline at the end. In short, one container is used
  to do the build and data load, and then this container is
  re-used to run various tests in parallel. All logs are
  left on the host system.

Besides the obvious win of getting test results more quickly, this
commit serves as an example of how to get various bits of Impala
development working inside of Docker containers. For example, Kudu
relies on atomic rename of directories, which isn't available in most
Docker filesystems, and entrypoint.sh works around it.

In addition, the timeline generated by the build suggests where further
optimizations can be made. Most obviously, dataload eats up a precious
~30-50 minutes, on a largely idle machine.

This work is significantly CPU and memory hungry. It was developed on a
32-core, 120GB RAM Google Compute Engine machine. I've worked out
parallelism configurations such that it runs nicely on 60GB of RAM
(c4.8xlarge) and over 100GB (e.g., m4.10xlarge, which has 160GB). There is
some simple logic to guess at some knobs, and there are knobs.  By and
large, EC2 and GCE price machines linearly, so, if CPU usage can be kept
up, it's not wasteful to run on bigger machines.

Change-Id: I82052ef31979564968effef13a3c6af0d5c62767
Reviewed-on: http://gerrit.cloudera.org:8080/9085
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-04-06 06:40:07 +00:00