This patch adds the ability to create a new log file for each spawn of
the Sentry service. This enables better troubleshooting for the
custom cluster tests that restart the Sentry service.
Testing:
- Ran all custom cluster tests.
Change-Id: I6e538af7fd6e6ea21dc3f4442bdebf3b31558516
Reviewed-on: http://gerrit.cloudera.org:8080/11624
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes the SHOW GRANT USER statement to show all privileges
granted to a user, either directly via object ownership or granted
through a role via a group the user belongs to. The output of SHOW
GRANT USER has two additional columns, principal type and principal
name, so the user can tell where each privilege comes from.
Truncated sample output showing the two columns that differ from
SHOW GRANT ROLE:
+----------------+----------------+--------+----------+-...
| principal_type | principal_name | scope  | database | ...
+----------------+----------------+--------+----------+-...
| USER           | foo            | table  | foo_db   | ...
| ROLE           | foo_role       | server |          | ...
+----------------+----------------+--------+----------+-...
Testing:
- Created a new custom cluster test with custom group mapping.
- Ran FE and custom cluster tests.
Change-Id: Ie9f6c88f5569e1c414ceb8a86e7b013eaa3ecde1
Reviewed-on: http://gerrit.cloudera.org:8080/11531
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The problem was caused by an update in Hive that changed notifications.
HIVE-15180 was added but was incomplete, and resulted in the breakage.
HIVE-17747 fixed the issue by properly creating the messages.
Change-Id: I4b9276c36bf96afccd7b8ff48803a30b47062c3d
Reviewed-on: http://gerrit.cloudera.org:8080/11466
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds calls to automatically create or remove owner
privileges in the catalog based on the statement. This is similar to
the existing pattern where after privileges are granted in Sentry,
they are created directly in the catalog instead of being pulled from
Sentry.
When object ownership is enabled:
CREATE DATABASE will grant the user OWNER privileges to that database.
ALTER DATABASE SET OWNER will transfer the OWNER privileges to the
new owner.
DROP DATABASE will revoke the OWNER privileges from the owner.
This will apply to DATABASE, TABLE, and VIEW.
Example:
If ownership is enabled, when a table is created, the creator is the
owner, and Sentry will create owner privileges for the created table so
the user can continue working with it without waiting for Sentry
refresh. Inserts will be available immediately.
Testing:
- Created new custom cluster tests for object ownership
Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Reviewed-on: http://gerrit.cloudera.org:8080/11314
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
gcovr is a python library that uses gcov to generate
code coverage reports. This adds gcovr to the python
dependencies and adds bin/impala-gcovr to provide
easy access to gcovr's command line. gcovr 3.4
supports python 2.6+.
This also adds bin/coverage_helper.sh to provide a
simplified interface to generate reports and zero
coverage counters.
Code coverage data is written out when a program
exits, so it is important to avoid hard kills
to shut down the impalads when generating coverage.
This modifies testdata/bin/kill-all.sh to call
start-impala-cluster.py --kill when shutting down
the minicluster to try to avoid doing a hard kill.
It will still do a hard kill if impala is still
running after the softer kill.
Change-Id: I5b2e0b794c64f9343ec976de7a3f235e54d2badd
Reviewed-on: http://gerrit.cloudera.org:8080/10791
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch removes the use of IMPALA_MINICLUSTER_PROFILE. The code that
uses IMPALA_MINICLUSTER_PROFILE=2 is removed, and the code from
IMPALA_MINICLUSTER_PROFILE=3 becomes the default. To limit the number of
code changes in this patch, the shims themselves are not changed; the
shims for IMPALA_MINICLUSTER_PROFILE=3 simply become the default
implementation.
Testing:
- Ran core and exhaustive tests
Change-Id: Iba4a81165b3d2012dc04d4115454372c41e39f08
Reviewed-on: http://gerrit.cloudera.org:8080/10940
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change ensures that the planner computes Parquet conjuncts
only for scans containing Parquet files. It also handles the
PARQUET_DICTIONARY_FILTERING and PARQUET_READ_STATISTICS query
options in the planner.
Testing was carried out independently on parquet and non-parquet
scans:
1. Parquet scans were tested via the existing parquet-filtering
planner test. Additionally, a new test
[parquet-filtering-disabled] was added to ensure that the
explain plan generated skips parquet predicates based on the
query options.
2. Non-parquet scans were tested manually to ensure that the
functions to compute parquet conjuncts were not invoked.
Additional test cases were added to the parquet-filtering
planner test to scan non parquet tables and ensure that the
plans do not contain conjuncts based on parquet statistics.
3. A parquet partition was added to the alltypesmixedformat
table in the functional database. Planner tests were added
to ensure that Parquet conjuncts are constructed only when
the Parquet partition is included in the query.
Change-Id: I9d6c26d42db090c8a15c602f6419ad6399c329e7
Reviewed-on: http://gerrit.cloudera.org:8080/10704
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In this patch we add a query option, ALLOW_ERASURE_CODED_FILES, that
allows us to enable or disable support for erasure-coded files. Even
though Impala should already be able to handle HDFS erasure-coded files,
this feature hasn't been tested thoroughly yet. Impala also lacks
metrics, observability, and DDL commands related to erasure coding. This
is a query option rather than a startup flag because we want to make it
possible for advanced users to enable the feature.
We may also need a follow-on patch to disable the write path with
this flag.
Cherry-picks: not for 2.x
Change-Id: Icd3b1754541262467a6e67068b0b447882a40fb3
Reviewed-on: http://gerrit.cloudera.org:8080/10646
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
As part of IMPALA-3307, we copy a time-zone database
into HDFS. This command is failing on local filesystem
due to a missing FILESYSTEM_PREFIX.
This adds FILESYSTEM_PREFIX for this command.
Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d
Reviewed-on: http://gerrit.cloudera.org:8080/10803
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Impala currently uses two different libraries for timestamp
manipulations: boost and glibc.
Issues with boost:
- Time-zone database is currently hard coded in timezone_db.cc.
Impala admins cannot update it without upgrading Impala.
- Time-zone database is flat, therefore can’t track year-to-year
changes.
- Time-zone database is not updated on a regular basis.
Issues with glibc:
- Uses /usr/share/zoneinfo/ database which could be out of sync on
some of the nodes in the Impala cluster.
- Uses the host system’s local time-zone. Different nodes in the
Impala cluster might use a different local time-zone.
- Conversion functions take a global lock, which causes severe
performance degradation.
In addition to the issues above, the fact that /usr/share/zoneinfo/
and the hard-coded boost time-zone database are both in use is a
source of inconsistency in itself.
This patch makes the following changes:
- Instead of boost and glibc, impalad uses Google's CCTZ to implement
time-zone conversions.
- Introduces a new startup flag (--hdfs_zone_info_zip) to impalad to
specify an HDFS/S3/ADLS path to a zip archive that contains the
shared compiled IANA time-zone database. If the startup flag is set,
impalad will use the specified time-zone database. Otherwise,
impalad will use the default /usr/share/zoneinfo time-zone database.
- Introduces a new startup flag (--hdfs_zone_alias_conf) to impalad to
specify an HDFS/S3/ADLS path to a shared config file that contains
definitions for non-standard time-zone aliases.
- impalad reads the entire time-zone database into an in-memory
map on startup for fast lookups.
- The name of the coordinator node’s local time-zone is saved to the
query context when preparing query execution. This time-zone is used
whenever the current time-zone is referred afterwards in an
execution node.
- Adds a new ZipUtil class to extract files from a zip archive. The
implementation is not vulnerable to Zip Slip (a sketch of the check
follows below).
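A minimal Python sketch of the Zip Slip check (illustrative only; the
actual ZipUtil is C++ and its interface differs):

    import os
    import zipfile

    def extract_zip_safely(archive_path, dest_dir):
        # Illustrative sketch: reject any entry whose resolved path would
        # escape dest_dir (the "Zip Slip" attack), then extract.
        dest_dir = os.path.realpath(dest_dir)
        with zipfile.ZipFile(archive_path) as archive:
            for entry in archive.namelist():
                target = os.path.realpath(os.path.join(dest_dir, entry))
                if not (target + os.sep).startswith(dest_dir + os.sep):
                    raise ValueError("Zip Slip entry rejected: %s" % entry)
            archive.extractall(dest_dir)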
Cherry-picks: not for 2.x.
Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Reviewed-on: http://gerrit.cloudera.org:8080/9986
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Attila Jeges <attilaj@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The minicluster args for dataload changed to
a bash array in IMPALA-7119, and this requires
a special syntax to dereference and get the whole
array.
This fixes the invocation to use the right
syntax ("${BASH_VAR[@]}" rather than "$BASH_VAR").
Change-Id: Ie9a24c0e9fa34e43697b16b48cf219f47f30c0cc
Reviewed-on: http://gerrit.cloudera.org:8080/10782
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
After loading data, we wait for HDFS to replicate
all of the blocks appropriately. If this takes too long,
we restart HDFS. However, HBase can fail if HDFS is
restarted and HBase is unable to write its logs.
In general, there is no real reason to keep HBase
and the other minicluster components running while
restarting HDFS.
This changes the HDFS health check to restart the
whole minicluster and Impala rather than just HDFS.
Testing:
- Tested with a modified version that always does
the restart in the HDFS health check and verified
that the tests pass
Change-Id: I58ffe301708c78c26ee61aa754a06f46c224c6e2
Reviewed-on: http://gerrit.cloudera.org:8080/10665
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Some frontend PlannerTests rely on HBase tables being
arranged in a deterministic way. Specifically, the
HBase tables need to be split with specific region
boundaries and those regions need to be assigned to
specific HBase region servers.
Currently, the tables are created without splits and
testdata/bin/split-hbase.sh runs Java code in
HBaseTestDataRegionAssignment to split and assign
the tables. This runs during dataload via
testdata/bin/create-load-data.sh and during tests
with bin/run-all-tests.sh. There are problems with
both parts of this process. The table splitting is
flaky. Since significant time can pass between the
assignments and the tests, rebalancing means the
assignments are not always stable.
This changes the process so that the HBase tables are
created with the splits already specified via the
HBase shell. The splits remain stable over time.
PlannerTestBase runs the assignment code in
HBaseTestDataRegionAssignment at the start of
the PlannerTests. This makes the assignments
deterministic. No other test depends on the
exact assignments, so this does not regress anything.
Testing:
- Local testing
- Ran gerrit-verify-dryrun-external
- Verified minicluster profile 2 compiles
Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897
Reviewed-on: http://gerrit.cloudera.org:8080/10447
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/
The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.
I updated Pooja's python tests as well.
Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Reviewed-on: http://gerrit.cloudera.org:8080/9693
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
HBase splitting can fail due to changes in HBase code. It
is useful to still run tests even if HBase splitting fails.
As it is today, buildall.sh will abort if
create-load-data.sh's invocation of split-hbase.sh fails.
No tests run, even though the HBase splitting affects only
a small portion of our tests.
This changes create-load-data.sh to keep going with
dataload if HBase splitting fails. It outputs the same
errors to the log as it would before this change.
It adds a message to explain that it is ignoring
the failure and there may be related test failures.
Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Reviewed-on: http://gerrit.cloudera.org:8080/10437
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Erasure coding data loading is flaky in two ways:
1. HBase sometimes doesn't work because of HBASE-19369
2. Nested data loading sometimes fails because the HDFS namenode cannot
find enough good datanodes.
For problem 1, this patch enables erasure coding only on /test-warehouse
directory. For problem 2, this patch sets
dfs.namenode.redundancy.considerLoad to false, preventing namenode from
excluding heavily-loaded datanodes.
Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
Reviewed-on: http://gerrit.cloudera.org:8080/10362
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
There is a Hive bug in Hive 1.1.0 that can result
in a NullPointerException when doing parallel Hive
operations (see IMPALA-6532). Since dataload goes
parallel on Hive loads starting with IMPALA-6372,
dataload can hit this error on Hive 1.1.0 (i.e.
IMPALA_MINICLUSTER_PROFILE=2). This is impacting
builds on the 2.x branch.
This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2.
IMPALA_MINICLUSTER_PROFILE=3 uses a newer version
of Hive that has a fix for this, so this continues
to use parallel dataload for that case.
Parallelism can be reenabled when Hive 1.1.0 gets the
fix from Hive 2.1.1.
Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Reviewed-on: http://gerrit.cloudera.org:8080/10306
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In this patch we add the "ERASURE_CODING" environment variable. If we
enable it, a cluster with 5 data nodes will be created during data
loading and HDFS will be started with erasure coding enabled.
Testing:
I ran the core build, and verified that erasure coding gets enabled in
HDFS. Many of our EE tests failed however.
Cherry-picks: not for 2.x
Change-Id: I397aed491354be21b0a8441ca671232dca25146c
Reviewed-on: http://gerrit.cloudera.org:8080/10275
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Running Hadoop 3 with Java 7 can result in some obscure error messages.
This change adds a warning to impala-config.sh when using Hadoop 3 with
Java 7.
Your development environment is configured for Hadoop 3 and Java 7.
Hadoop 3 requires at least Java 8. Your JAVA binary currently points
to /usr/lib/jvm/java-7-oracle-amd64/bin/java and reports the
following version:
java version "1.7.0_75"
Java(TM) SE Runtime Environment (build 1.7.0_75-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.75-b04, mixed mode)
It also catches failure of the minicluster start and prints an
additional warning when running with Hadoop 3 and Java 7.
Cherry-picks: not for 2.x
Change-Id: I4d8b505cf045eeb562d16ce4ce09da0712dc03eb
Reviewed-on: http://gerrit.cloudera.org:8080/10244
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This has two related changes.
IMPALA-6679: defer scanner reservation increases
------------------------------------------------
When starting each scan range, check to see how big the initial scan
range is (the full thing for row-based formats, the footer for
Parquet) and determine whether more reservation would be useful.
For Parquet, base the ideal reservation on the actual column layout
of each file. This avoids reserving memory that we won't use for
the actual files that we're scanning. This also avoids the need to
estimate the ideal reservation in the planner.
We also release scanner thread reservations above the minimum as
soon as threads complete, so that resources can be released slightly
earlier.
IMPALA-6678: estimate Parquet column size for reservation
---------------------------------------------------------
This change also reduces reservation computed by the planner in certain
cases by estimating the on-disk size of column data based on stats. It
also reduces the default per-column reservation to 4MB since it appears
that < 8MB columns are generally common in practice and the method for
estimating column size is biased towards over-estimating. There are two
main cases to consider for the performance implications:
* Memory is available to improve query perf - if we underestimate, we
can increase the reservation so we can do "efficient" 8MB I/Os for
large columns.
* The ideal reservation is not available - query performance is affected
because we can't overlap I/O and compute as much and may do smaller
(probably 4MB) I/Os. However, we should avoid pathological behaviour
like tiny I/Os.
When stats are not available, we just default to reserving 4MB per
column, which typically is more memory than required. When stats are
available, the memory required can be reduced below that when the
heuristics tell us with high confidence that the column data for most
or all files is smaller than 4MB.
The stats-based heuristic could reduce scan performance if both the
conservative heuristics significantly underestimate the column size
and memory is constrained such that we can't increase the scan
reservation at runtime (in which case the memory might be used by
a different operator or scanner thread).
Observability:
Added counters to track when threads were not spawned due to reservation
and to track when reservation increases are requested and denied. These
allow determining if performance may have been affected by memory
availability.
Testing:
Updated test_mem_usage_scaling.py memory requirements and added steps
to regenerate the requirements. Looped the test for a while to flush
out flakiness.
Added targeted planner and query tests for reservation calculations and
increases.
Change-Id: Ifc80e05118a9eef72cac8e2308418122e3ee0842
Reviewed-on: http://gerrit.cloudera.org:8080/9757
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit tackles a few additions and improvements to
test-with-docker. In general, I'm adding workloads (e.g., exhaustive,
rat-check), tuning memory setting and parallelism, and trying to speed
things up.
Bug fixes:
* Embarrassingly, I was still skipping thrift-server-test in the backend
tests. This was a mistake in handling feedback from my last review.
* I made the timeline a little bit taller to clip less.
Adding workloads:
* I added the RAT licensing check.
* I added exhaustive runs. This led me to model the suites a little
bit more in Python, with a class representing a suite with a
bunch of data about the suite. It's not perfect and still
coupled with the entrypoint.sh shell script, but it feels
workable. As part of adding exhaustive tests, I had
to re-work the timeout handling, since now different
suites meaningfully have different timeouts.
Speed ups:
* To speed up test runs, I added a mechanism to split py.test suites into
multiple shards with a py.test argument (a conftest.py sketch follows
after this list). This involved a little bit of work in conftest.py, and
exposing $RUN_CUSTOM_CLUSTER_TESTS_ARGS in run-all-tests.sh.
Furthermore, I moved a bit more logic about managing the
list of suites into Python.
* Doing the full build with "-notests" and only building
the backend tests in the relevant target that needs them. This speeds
up "docker commit" significantly by removing about 20GB from the
container. I had to indicate that expr-codegen-test depends on
expr-codegen-test-ir, which was missing.
* I sped up copying the Kudu data: previously I did
both a move and a copy; now I'm doing a move followed by a move. One
of the moves is cross-filesystem so is slow, but this does half the
amount of copying.
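A conftest.py sketch of the sharding idea (the option names here are
hypothetical, not necessarily the ones the patch added):

    # conftest.py (sketch)
    def pytest_addoption(parser):
        parser.addoption("--shard", type=int, default=0,
                         help="index of this shard")
        parser.addoption("--num-shards", type=int, default=1,
                         help="total number of shards")

    def pytest_collection_modifyitems(config, items):
        num_shards = config.getoption("--num-shards")
        if num_shards <= 1:
            return
        shard = config.getoption("--shard")
        # Keep only the tests whose collection index falls into this shard.
        items[:] = [item for i, item in enumerate(items)
                    if i % num_shards == shard]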
Memory usage:
* I tweaked the memlimit_gb settings to have a higher default. I've been
fighting empirically to have the tests run well on c4.8xlarge and
m4.10xlarge.
The more memory a minicluster and test suite run uses, the fewer parallel
suites we can run. By observing the peak processes at the tail of a run (with a
new "memory_usage" function that uses a ps/sort/awk trick) and by observing
peak container total_rss, I found that we had several JVMs that
didn't have Xmx settings set. I added Xms/Xmx settings in a few
places:
* The non-first Impalad does very little JVM work, so having
an Xmx keeps it small, even in the parallel tests.
* Datanodes do work, but they essentially were never garbage
collecting, because JVM defaults let them use up to 1/4th
the machine memory. (I observed this based on RSS at the
end of the run; nothing fancier.) Adding Xms/Xmx settings
helped.
* Similarly, I piped the settings through to HBase.
A few daemons still run without resource limitations, but they don't
seem to be a problem.
Change-Id: I43fe124f00340afa21ad1eeb6432d6d50151ca7c
Reviewed-on: http://gerrit.cloudera.org:8080/10123
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
HDFS commandline calls can be expensive due to JVM
startup and other costs. Since most HDFS commandline
calls can take multiple paths, one way to reduce
execution time is to consolidate multiple HDFS
commands into a single HDFS call. Since HDFS put
commands will follow symbolic links and can copy
recursively, this can allow for further consolidation
by creating the full directory structure and
copying it in a single HDFS call.
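For illustration, a Python sketch of the consolidation idea (the helper
and paths are hypothetical; the actual changes live in the dataload
shell scripts):

    import subprocess

    def put_all_at_once(local_paths, hdfs_dest_dir):
        # One 'hdfs dfs -put' call (and one JVM startup) for the whole
        # batch instead of one call per path; put accepts multiple
        # local sources.
        subprocess.check_call(
            ["hdfs", "dfs", "-put"] + list(local_paths) + [hdfs_dest_dir])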
This does several of these optimizations throughout
the dataload codepath. It saves a few seconds here
and there:
Loading Hive Builtins: 1:10 -> 0:30
Loading custom schemas: 0:35 -> 0:20
Loading Hive UDFs: 0:45 -> 0:25
Change-Id: I0934353329dc7312394fc4457ab8db2a272c6282
Reviewed-on: http://gerrit.cloudera.org:8080/10120
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
testdata/bin/create-load-data.sh does bin/load-data.py for
functional/exhaustive, tpch/core, and tpcds/core in a
first phase, then it loads functional and tpch for Kudu
in a second phase. For a full dataload, this second phase
is not necessary. functional/exhaustive and tpch/core
already include Kudu.
This avoids the second phase when doing a full dataload.
The second phase is still necessary when loading from
a snapshot, and this does not change that behavior.
This saves a couple minutes off of full dataload.
Change-Id: Ic023d230f99126ed37795106c38faae5f0cb608e
Reviewed-on: http://gerrit.cloudera.org:8080/10128
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch writes the output of Sentry to
$IMPALA_CLUSTER_LOGS_DIR/sentry/sentry.out to follow the
same convention as other service output logs.
Testing:
- Injected some failure in run-sentry-service.sh script to see if the
error message was captured
Change-Id: I76627bb5b986a548ec6e4f12b555bd6fc8c4dab8
Reviewed-on: http://gerrit.cloudera.org:8080/10064
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This changes generate-schema-statements.py to produce
separate SQL files for different file formats for Hive.
This changes load-data.py to go parallel on these
separate Hive SQL files. For correctness, the text
version of all tables must be loaded before any
of the other file formats.
load-data.py runs DDLs to create the tables in Impala
and goes parallel. Currently, there are some minor
dependencies so that text tables must be created
prior to creating the other table formats. This
changes the definitions of some tables in
testdata/datasets/functional/functional_schema_template.sql
to remove these dependencies. Now, the DDLs for the
text tables can run in parallel to the other file formats.
To unify the parallelism for Impala and Hive, load-data.py
now uses a single fixed-size pool of processes to run all
SQL files rather than spawning a thread per SQL file.
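A rough Python sketch of the fixed-size pool approach (illustrative
only; the real load-data.py drives both Impala and Hive clients and
differs in detail):

    import subprocess
    from multiprocessing import Pool

    def run_sql_file(sql_file):
        # Run one generated SQL file, logging to its own file;
        # impala-shell here stands in for either client.
        log_path = sql_file + ".log"
        with open(log_path, "w") as log:
            returncode = subprocess.call(["impala-shell", "-f", sql_file],
                                         stdout=log, stderr=log)
        return sql_file, log_path, returncode

    def load_in_parallel(sql_files, pool_size=4):
        # A single fixed-size pool bounds the total parallelism instead
        # of spawning one thread per SQL file.
        with Pool(pool_size) as pool:
            for sql_file, log_path, rc in pool.imap_unordered(run_sql_file,
                                                              sql_files):
                if rc != 0:
                    print("Failed: %s (see %s)" % (sql_file, log_path))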
This also modifies the locations that do invalidate to
use refresh where possible and eliminate global
invalidates.
For debuggability, different SQL executions output to
different log files rather than to standard out. If an
error occurs, this will point out the relevant log
file.
This saves about 10-15 minutes on dataload (including
for GVO).
Change-Id: I34b71e6df3c8f23a5a31451280e35f4dc015a2fd
Reviewed-on: http://gerrit.cloudera.org:8080/8894
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch integrates the orc library into Impala and implements
HdfsOrcScanner as a middle layer between them. The HdfsOrcScanner
supplies input needed from the orc-reader, tracks memory consumption of
the reader and transfers the reader's output (orc::ColumnVectorBatch)
into impala::RowBatch. The ORC version we used is release-1.4.3.
A startup option --enable_orc_scanner is added for this feature. It's
set to true by default. Setting it to false will fail queries on ORC
tables.
Currently, we only support reading primitive types. Writing to ORC
tables is not yet supported.
Tests
- Most of the end-to-end tests can run on ORC format.
- Add tpcds, tpch tests for ORC.
- Add some ORC specific tests.
- Haven't enabled test_scanner_fuzz for ORC yet, since the ORC library
is not robust for corrupt files (ORC-315).
Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Reviewed-on: http://gerrit.cloudera.org:8080/9134
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Allows running the tests that make up the "core" suite in about 2 hours.
By comparison, https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/buildTimeTrend
tends to run in about 3.5 hours.
This commit:
* Adds "echo" statements in a few places, to facilitate timing.
* Adds --skip-parallel/--skip-serial flags to run-tests.py,
and exposes them in run-all-tests.sh.
* Marks TestRuntimeFilters as a serial test. This test runs
queries that need > 1GB of memory, and, combined with
other tests running in parallel, can kill the parallel test
suite.
* Adds "test-with-docker.py", which runs a full build, data load,
and executes tests inside of Docker containers, generating
a timeline at the end. In short, one container is used
to do the build and data load, and then this container is
re-used to run various tests in parallel. All logs are
left on the host system.
Besides the obvious win of getting test results more quickly, this
commit serves as an example of how to get various bits of Impala
development working inside of Docker containers. For example, Kudu
relies on atomic rename of directories, which isn't available in most
Docker filesystems, and entrypoint.sh works around it.
In addition, the timeline generated by the build suggests where further
optimizations can be made. Most obviously, dataload eats up a precious
~30-50 minutes, on a largely idle machine.
This work is significantly CPU and memory hungry. It was developed on a
32-core, 120GB RAM Google Compute Engine machine. I've worked out
parallelism configurations such that it runs nicely on 60GB of RAM
(c4.8xlarge) and over 100GB (eg., m4.10xlarge, which has 160GB). There is
some simple logic to guess at some knobs, and there are knobs. By and
large, EC2 and GCE price machines linearly, so, if CPU usage can be kept
up, it's not wasteful to run on bigger machines.
Change-Id: I82052ef31979564968effef13a3c6af0d5c62767
Reviewed-on: http://gerrit.cloudera.org:8080/9085
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit slightly loosens the coupling between IMPALA_HIVE_VERSION
and "hive.version" in the Maven sense.
Cherry-picks: not for 2.x
Change-Id: Ifbe6f5208b4ad0ffc9cbfe4e93d712ce698beb23
Reviewed-on: http://gerrit.cloudera.org:8080/9925
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
When an impalad is in executor-only mode, it receives no
catalog updates. As a result, lib-cache entries are never
refreshed. A consequence is that UDF queries can return
incorrect results or may not run due to resolution issues.
Both cases are caused by the executor using a stale copy
of the lib file. For incorrect results, an old version of
the method may be used. Resolution issues can come up if
a method is added to a lib file.
The solution in this change is to capture the coordinator's
view of the lib file's last modified time when planning.
This last modified time is then shipped with the plan to
executors. Executors must then use both the lib file path
and the last modified time as a key for the lib-cache.
If the coordinator's last modified time is more recent than
the executor's lib-cache entry, then the entry is refreshed.
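A small Python sketch of the refresh rule (the real lib-cache is C++;
the names and structure here are only illustrative):

    class LibCache(object):
        def __init__(self):
            self._entries = {}  # lib file path -> (last_modified, local copy)

        def get(self, path, coord_last_modified, fetch_fn):
            # Reuse the cached copy only if it is at least as new as the
            # coordinator's view of the lib file shipped with the plan.
            cached = self._entries.get(path)
            if cached is not None and cached[0] >= coord_last_modified:
                return cached[1]
            local_copy = fetch_fn(path)  # re-fetch the lib file
            self._entries[path] = (coord_last_modified, local_copy)
            return local_copy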
Brief discussion of alternatives:
- lib-cache always checks last modified time
  + easy/local change to lib-cache
  - adds a filesystem lookup on every access; rejected for this reason
- keep the last modified time in the catalog
  - the bound on staleness is too loose. Consider the case where
    fns f1, f2, f3 are created with last modified times of
    t1, t2, t3, and the fn's last modified time is treated as a
    low-watermark (if the cache entry has a more recent time, use it).
    Such a scheme would allow the version at t2 to persist, and an old
    fn may keep the state from converging to the latest. This could
    end up with strange cases where different versions of the lib are
    used across executors for a single query.
In contrast, the change in this patch relies on the statestore to
push versions forward at all coordinators, so it will push all
versions at all caches forward as well.
Testing:
- added an e2e custom cluster test
Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6
Reviewed-on: http://gerrit.cloudera.org:8080/9697
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Adds support for building against two sets of Hadoop ecosystem
components. The control variable is IMPALA_MINICLUSTER_PROFILE_OVERRIDE,
which can either be set to 2 (for Hadoop 2, Hive 1, and so on) or 3 (for
Hadoop 3, Hive 2, and so on).
We intend (in a trivial follow-on change soon) to make 3 the new default
and to explicitly deprecate 2, but this change does not switch the
default yet. We support both to facilitate a smoother transition, but
support for profile 2 will be removed soon in the Impala 3.x line.
The switch is done at build time, following the pattern from IMPALA-5184
(build fe against both Hive 1 & 2 APIs). Switching back and forth
requires running 'cmake' again. Doing this at build-time avoids
complicating the Java code with classloader configuration.
There are relatively few incompatible APIs. This implementation
encapsulates that by extracting some Java code into
fe/src/compat-minicluster-profile-{2,3}. (This follows the
pattern established by IMPALA-5184 of building the fe against both
Hive 1 & 2 APIs, but, to avoid a proliferation of directories, I've
moved the Hive files into the same tree and consolidated the Hive
changes into the same directory structure.)
For Maven, I introduced Maven "profiles" to handle the two cases where
the dependencies (and exclusions) differ. These are driven by the
$IMPALA_MINICLUSTER_PROFILE environment variable.
For Sentry, exception class names changed. We work around this by adding
"isSentry...(Exception)" methods with two different implementations.
Sentry is also doing some odd shading, whereby some exceptions are
"sentry.org.apache.sentry..."; we handle both. Similarly, the mechanism
to create a SentryAuthProvider is slightly different. The easiest way to
see the differences is to run:
diff -u fe/src/compat-minicluster-profile-{2,3}/java/org/apache/impala/util/SentryUtil.java
diff -u fe/src/compat-minicluster-profile-{2,3}/java/org/apache/impala/authorization/SentryAuthProvider.java
The Sentry work is based on a change by Zach Amsden.
In addition, we recently added an explicit "refresh" permission. In
Sentry 2, this required creating an ImpalaPrivilegeModel to capture
that. It's a slight customization of Hive's equivalent class.
For Parquet, the difference is even more mechanical. The package names
went from "parquet" to "org.apache.parquet". The affected code
was extracted into ParquetHelper, but only one copy exists. The second
copy is generated at build-time using sed.
In the rare cases where we need to behave differently at runtime,
MiniclusterProfile.MINICLUSTER_PROFILE is a class which encapsulates
what version we were built against. One of the cases is the results
expected by various frontend tests. I avoided the issue by translating
one error string into another, which handled the divergence in one place,
rather than complicating the several locations which look for "No
FileSystem for scheme..." errors.
The HBase APIs we use for splitting regions at test time changed.
This patch includes a re-write of that code for the new APIs. This
piece was contributed by Zach Amsden.
To work with newer versions of dependencies, I updated the version of
httpcomponents.core we use to 4.4.9.
We (Thomas Tauber-Marshall and I) uploaded new Hadoop/Hive/Sentry/HBase
binaries to s3://native-toolchain, and amended the shell scripts to
launch the right things. There are minor mechanical differences. Some
of this was based on earlier work by Joe McDonnell and Zach Amsden.
Hive's logging is changed in Hive 2, necessitating creating a
log4j2.properties template and using it appropriately. Furthermore,
Hadoop3's new shell script re-writes do a certain amount of classpath
de-duplication, causing some issues with locating the relevant logging
configurations. Accommodations exist in the code to deal with that.
parquet-filtering.test was updated to turn off stats filtering. Older
Hive didn't write Parquet statistics, but newer Hive does. By turning
off stats filtering, we test what the test had intended to test.
For views-compatibility.test, it seems that Hive 2 has fixed certain
bugs that we were testing for in Hive. I've added a
HIVE=SUCCESS_PROFILE_3_ONLY mechanism to capture that.
For AuthorizationTest, different Hive versions show slightly different
things for extended output.
To facilitate easier reviewing, the following files are 100% renames as identified by git; nothing
to see here.
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/hive/service/rpc/thrift/TGetCatalogsReq.java (100%)
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/hive/service/rpc/thrift/TGetColumnsReq.java (100%)
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/hive/service/rpc/thrift/TGetFunctionsReq.java (100%)
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/hive/service/rpc/thrift/TGetInfoReq.java (100%)
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/hive/service/rpc/thrift/TGetSchemasReq.java (100%)
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/hive/service/rpc/thrift/TGetTablesReq.java (100%)
rename fe/src/{compat-hive-1 => compat-minicluster-profile-2}/java/org/apache/impala/compat/MetastoreShim.java (100%)
rename fe/src/{compat-hive-2 => compat-minicluster-profile-3}/java/org/apache/impala/compat/MetastoreShim.java (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/hadoop/conf/kms-acls.xml.tmpl (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/hadoop/conf/kms-site.xml.tmpl (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/hadoop/conf/yarn-site.xml.tmpl (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/init.d/kudu-common (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/init.d/kudu-master (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/init.d/kudu-tserver (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/kudu/master.conf.tmpl (100%)
rename testdata/cluster/node_templates/{cdh5 => common}/etc/kudu/tserver.conf.tmpl (100%)
CreateTableLikeFileStmt had a chunk of code moved to ParquetHelper.java. This
was done manually, but without changing anything except what Java required in
terms of accessibility and boilerplate.
rewrite fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java (80%)
copy fe/src/{main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java => compat-minicluster-profile-3/java/org/apache/impala/analysis/ParquetHelper.java} (77%)
Testing: Ran core & exhaustive tests with both profiles.
Cherry-picks: not for 2.x.
Change-Id: I7a2ab50331986c7394c2bbfd6c865232bca975f7
Reviewed-on: http://gerrit.cloudera.org:8080/9716
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
In wait-hdfs-replication, the frequent and eager restart might slow the
HDFS replication down. HDFS should be restarted only if no progress is
made in a certain amount of time, and we should wait longer before
failing the data loading.
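A Python sketch of the intended policy (the actual logic is in the
dataload shell scripts; the fsck summary wording parsed here is an
assumption):

    import re
    import subprocess
    import time

    def under_replicated_blocks():
        # Assumed wording of the 'hdfs fsck /' summary line.
        out = subprocess.check_output(["hdfs", "fsck", "/"]).decode()
        match = re.search(r"Under-replicated blocks:\s+(\d+)", out)
        return int(match.group(1)) if match else 0

    def wait_for_replication(max_stall_secs=600, poll_secs=30):
        # Only give up (and let the caller restart HDFS) when the count
        # has made no progress for max_stall_secs.
        best = under_replicated_blocks()
        last_progress = time.time()
        while best > 0:
            time.sleep(poll_secs)
            current = under_replicated_blocks()
            if current < best:
                best, last_progress = current, time.time()
            elif time.time() - last_progress > max_stall_secs:
                return False
        return True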
Testing: It's tested with a fake HDFS fsck script.
Change-Id: Ib059480254643dc032731b4b3c55204a93b61e77
Reviewed-on: http://gerrit.cloudera.org:8080/9698
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
The retries in split-hbase.sh don't work in the common case,
because $MINIKDC_PRINC_HIVE is not set in non-kerberized (common)
environments. The regular data load scripts (create-load-data.sh)
have code to manage that, but split-hbase.sh blindly forges ahead,
leading to errors like:
/home/impdev/Impala/testdata/bin/split-hbase.sh: line 49: MINIKDC_PRINC_HIVE: unbound variable
Error in /home/impdev/Impala/testdata/bin/create-load-data.sh at line 48: LOAD_DATA_ARGS=""
Since this hasn't been working, I opted to remove it entirely, as a failure on
the line where HBase splitting actually failed would be significantly more
useful than the error here. A search of mailing lists suggested that I was at
least the second person to have run into this. (In my case, I did break HBase
splitting, but it took me a second to identify the error, since the log was
spammed with unrelated information relating to the cluster restart.)
Testing: core tests.
Change-Id: I715891c9e744f21002330c3ae3ebc14095d94ffd
Reviewed-on: http://gerrit.cloudera.org:8080/9588
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins
HDFS sometimes fails to fully replicate all the blocks in 30 seconds
and no progress is made. This patch tries to restart HDFS several times
before aborting the data loading.
Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915
Reviewed-on: http://gerrit.cloudera.org:8080/9469
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
When loading from an up-to-date snapshot, dataload will
load all of the metadata and load data into HDFS. Then,
it will skip load-data.py for functional/exhaustive,
tpch/core, and tpcds/core. It will invoke a special
round of load-data.py calls to populate Kudu tables,
and it always runs these with a force reload.
However, when loading from an old snapshot, dataload will
still load all of the metadata and load the data into
HDFS, but then it will still invoke load-data.py for
functional/exhaustive, tpch/core, and tpcds/core.
These invocations mostly do DDLs with very few load
statements. However, these invocations are a problem
for Kudu. The metadata of Impala tables referencing
Kudu entities has been imported along with all the other
metadata, but the Kudu entities have not been created, as
they are separate from HDFS. This means that Kudu tables
are not really valid in this circumstance.
Since Kudu has been added to the list of data formats
for tpch/core (see IMPALA-6475), load-data.py with
tpch/core will attempt to insert into these invalid
Kudu tables.
To avoid this, always force reload any Kudu tables.
generate-schema-statements.py will always generate a
drop table statement before any create of a Kudu table.
This guarantees that the create will also create the
corresponding Kudu entity.
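Sketched in Python, the generated DDL for a Kudu table then looks like
this (hypothetical helper; generate-schema-statements.py is structured
differently):

    def kudu_table_ddl(table_name, create_stmt):
        # Dropping first guarantees the CREATE also recreates the
        # underlying Kudu entity, even when Impala metadata came from
        # a snapshot.
        return "DROP TABLE IF EXISTS %s;\n%s" % (table_name, create_stmt)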
Change-Id: I2d07f3513c543e2590f2f62b96b37472316868ee
Reviewed-on: http://gerrit.cloudera.org:8080/9445
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins
Dataload currently executes bin/load-data.py for TPC-H,
TPC-DS, and functional-query concurrently. One of the final
steps for bin/load-data.py is to run a global "invalidate
metadata". Global "invalidate metadata" commands are known
to cause problems on concurrent systems. See IMPALA-5087.
For dataload, if TPC-H executes "invalidate metadata" while
TPC-DS is still creating tables and adding partitions,
the TPC-DS executor might erroneously believe that a table
does not exist.
This changes dataload to invalidate metadata at an
individual table level rather than globally. This
prevents the concurrency issue.
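A minimal Python sketch of the per-table approach (illustrative; the
table names and impala-shell invocation are stand-ins for what
load-data.py actually does):

    import subprocess

    def invalidate_tables(tables):
        # Refresh catalog state one table at a time; a global
        # INVALIDATE METADATA is unsafe while other workloads still load.
        for table in tables:
            subprocess.check_call(
                ["impala-shell", "-q", "INVALIDATE METADATA %s" % table])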
This also changes the names of some of the intermediate
SQL files generated by generate-schema-statements.py
and consumed by load-data.py to make them less confusing.
Change-Id: Ibc3a6d8a674a0bf6b02069bfe8a5e12034335b1f
Reviewed-on: http://gerrit.cloudera.org:8080/9009
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins
When the data loading finishes, it is possible for some HDFS blocks to
be under replicated. If impala gets the metadata before the replication
is done, some tests may fail. This patch adds a replication waiting step
in the data loading script.
Resubmitted with filesystem type check.
Change-Id: I64d9a8ea1d0a32b40047321b50a7139a8f48eac8
Reviewed-on: http://gerrit.cloudera.org:8080/8916
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Using fsck breaks non-HDFS builds: local, S3, and Isilon.
This reverts commit 5a7c10ec3d.
Change-Id: I0b12a42049543ca0b267b5146a0bbcdd2316abfc
Reviewed-on: http://gerrit.cloudera.org:8080/8880
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
When the data loading finishes, it is possible for some HDFS blocks to
be under replicated. If impala gets the metadata before the replication
is done, some tests may fail. This patch adds a replication waiting step
in the data loading script.
Change-Id: I88dfb7165b7515b3e96111436be490f2068ec322
Reviewed-on: http://gerrit.cloudera.org:8080/8846
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
The two Kudu loads and Hive UDFs can all run in parallel. This
should shave about 4 minutes off of the data load. (Current
timings are 3.5, 4, and 0.6 minutes, see below.)
I've run dataload with this change many times.
Loading Kudu functional (logging to /home/ubuntu/Impala/logs/data_loading/load-kudu.log)...
Loading workload 'functional-query' using exploration strategy 'core' in table formats 'kudu/none/none' OK (Took: 3 min 29 sec)
Loading Kudu TPCH (logging to /home/ubuntu/Impala/logs/data_loading/load-kudu-tpch.log)...
Loading workload 'tpch' using exploration strategy 'core' in table formats 'kudu/none/none' OK (Took: 4 min 0 sec)
Loading Hive UDFs (logging to /home/ubuntu/Impala/logs/data_loading/build-and-copy-hive-udfs.log)...
Loading Hive UDFs OK (Took: 0 min 41 sec)
Change-Id: I7e93ee5a77ec9271b980b88bef7ad512ecbe0407
Reviewed-on: http://gerrit.cloudera.org:8080/8822
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
I re-created the original patch for IMPALA-6068, but performed only
what I believe to be the limited, safe transformation
of data load: DEPENDENT_LOAD -> DEPENDENT_LOAD_HIVE.
Any place that directly uploads via hadoop or hdfs commands
was left alone, as changing it cannot be proven to be correct.
Change-Id: I6c242cca209a7138b10ad517076707709b5cd204
Testing: Doing a full data load. I mistakenly changed a variable
name causing the first two dry-runs to fail.
Reviewed-on: http://gerrit.cloudera.org:8080/8690
Reviewed-by: Zach Amsden <zamsden@cloudera.com>
Tested-by: Zach Amsden <zamsden@cloudera.com>
This reverts commit e4f585240a.
Among other things, that commit replaced hdfs command line calls
with "LOAD DATA LOCAL INPATH" using Hive. However, doing so
presumes that the minicluster is the only test environment.
Sometimes, though, the data load script is run against a remote cluster,
and in those cases the data load process is now broken.
Change-Id: I6dc419934d2953eb950b14d090d7895ec57aa9f2
Reviewed-on: http://gerrit.cloudera.org:8080/8653
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Reviewed-by: Zach Amsden <zamsden@cloudera.com>
Tested-by: Impala Public Jenkins
With this commit, $IMPALA_MAVEN_OPTIONS is used by bin/mvn-quiet.sh
to configure Maven slightly. The default is no extra options.
This is handy for giving Maven a settings file with the "-s" flag, to
control, for example, repositories and their mirrors. In fact, I
considered exposing IMPALA_MAVEN_SETTINGS_FILE explicitly, but decided
that the generic option would be as good.
It's useful to customize how Maven works, especially
to provide a settings file with repository mirrors.
Change-Id: I2c62185476fd2388c7cda8884276b79a77370127
Reviewed-on: http://gerrit.cloudera.org:8080/8496
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins
This is a revert of a revert, re-enabling parallel data load. It avoids
the race condition by explicitly configuring the temporary directory in
question in load-data.py.
When the parallel data load change went in, we discovered
a race with a signature of:
java.io.FileNotFoundException: File
/tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist
The number in this path is milliseconds since the epoch, and the race
occurs when two queries submitted to HiveServer2, running with the local
runner, hit the same millisecond time stamp. The upstream bug is
https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the
symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which
is now marked as a dupe).
I've tested this by running data load 5 times on the same machines
where it failed before. I also ran data load manually and inspected
the system to make sure that the temporary directories are getting
created as expected in /tmp/impala-data-load-*.
Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Reviewed-on: http://gerrit.cloudera.org:8080/8405
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
We may be seeing a race with errors like "java.io.FileNotFoundException:
File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist".
This reverts commit e020c37106.
Change-Id: I46da93f4315a5a4bdaa96fa464cb51922bd6c419
Reviewed-on: http://gerrit.cloudera.org:8080/8386
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Dataload typically follows a pattern of loading data into
a text version of a table, and then using an insert
overwrite from the text table to populate the table for
other file formats. This insert is always done in Impala
for Parquet and Kudu. Otherwise it runs in Hive.
Since Impala doesn't support writing nested data, the
population of complextypes_fileformat tries to hack
the insert to run in Hive by including it in the ALTER
part of the table definition. ALTER runs immediately
after CREATE and always runs in Hive. The problem is
that ALTER also runs before the base table
(functional.complextypes_fileformat) is populated.
The insert succeeds, but it is inserting zero rows.
This code change introduces a way to force the Parquet
load to run using Hive. This lets complextypes_fileformat
specify that the insert should happen in Hive and fixes
the ordering so that the table is populated correctly.
This is also useful for loading custom Parquet files
into Parquet tables. Hive supports the LOAD DATA LOCAL
syntax, which can read a file from the local filesystem.
This means that several locations that currently use
the hdfs commandline can be modified to use this SQL.
This change speeds up dataload by a few minutes, as it
avoids the overhead of the hdfs commandline.
Any other location that could use LOAD DATA LOCAL is
also switched over to use it. This includes the
testescape* tables, which now print the appropriate
LOAD DATA commands as a result of text_delims_table.py.
Any location that already uses LOAD DATA LOCAL is also
switched to indicate that it must run in Hive. Any
location that was doing an HDFS command in the LOAD
section is moved to the DEPENDENT_LOAD_HIVE section.
Testing: Ran dataload and core tests. Also verified that
functional_parquet.complextypes_fileformat has rows.
Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Reviewed-on: http://gerrit.cloudera.org:8080/8350
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins
This commit loads functional-query, TPC-H data, and TPC-DS data in
parallel. In parallel, these take about 37 minutes, dominated by
functional-query. Serially, these take about 30 minutes more, namely the
13 minutes of tpch and 16 minutes of tpcds. This works out nicely
because CPU usage during data load is very low in aggregate. (We don't
sustain more than 1 CPU of load, whereas build machines are likely to
have many CPUs.)
To do this, I added support to run-step.sh to have a notion of a
backgroundable task, and support waiting for all tasks.
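The backgrounding idea, sketched in Python for illustration (run-step.sh
implements it in bash; the helper names here are made up):

    import subprocess

    def start_in_background(name, cmd, log_path):
        # Launch one workload's load in the background, logging to its
        # own file.
        log = open(log_path, "w")
        proc = subprocess.Popen(cmd, stdout=log, stderr=log)
        print("Started %s in background; pid %d." % (name, proc.pid))
        return proc

    def wait_for_all(procs):
        # Block until every backgrounded load finishes; fail if any failed.
        failed = [p for p in procs if p.wait() != 0]
        if failed:
            raise RuntimeError("%d background task(s) failed" % len(failed))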
I also increased the heapsize of our HiveServer2 server. When datasets
were being loaded in parallel, we ran out of memory at 256MB of heap.
The resulting log output is currently like so (but without the
timestamps):
15:58:04 Started Loading functional-query data in background; pid 8105.
15:58:04 Started Loading TPC-H data in background; pid 8106.
15:58:04 Loading functional-query data (logging to /home/impdev/Impala/logs/data_loading/load-functional-query.log)...
15:58:04 Started Loading TPC-DS data in background; pid 8107.
15:58:04 Loading TPC-H data (logging to /home/impdev/Impala/logs/data_loading/load-tpch.log)...
15:58:04 Loading TPC-DS data (logging to /home/impdev/Impala/logs/data_loading/load-tpcds.log)...
16:11:31 Loading workload 'tpch' using exploration strategy 'core' OK (Took: 13 min 27 sec)
16:14:33 Loading workload 'tpcds' using exploration strategy 'core' OK (Took: 16 min 29 sec)
16:35:08 Loading workload 'functional-query' using exploration strategy 'exhaustive' OK (Took: 37 min 4 sec)
I tested dataloading with the following command on an 8-core, 32GB
machine. I saw 19GB of available memory during my run:
./buildall.sh -testdata -build_shared_libs -start_minicluster -start_impala_cluster -format
Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac
Reviewed-on: http://gerrit.cloudera.org:8080/8320
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Uses a thread pool to issue many compute stats commands in parallel to
Impala, rather than doing it serially. Where it was obvious, I combined
multiple stats commands into fewer, to reduce the number
of "show databases" and serialized "show tables" commands.
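For illustration, a minimal Python sketch of the thread-pool approach
(not the actual compute-table-stats.sh code; the impala-shell invocation
is a stand-in):

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def compute_stats(table):
        # Issue one COMPUTE STATS statement through impala-shell.
        return subprocess.call(
            ["impala-shell", "-q", "COMPUTE STATS %s" % table])

    def compute_stats_parallel(tables, num_workers=4):
        # Keep several COMPUTE STATS statements in flight at once instead
        # of issuing them serially.
        with ThreadPoolExecutor(max_workers=num_workers) as executor:
            return list(executor.map(compute_stats, tables))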
This speeds up the compute stats step in data loading significantly. My
measurements for testdata/bin/compute-table-stats.sh running before and
after this change, with the Impala daemons restarted (cold) or not
restarted (warm) on an 8-core, 32GB RAM machine were:
old, cold: 7m44s
new, cold: 1m42s
old, warm: 1m23s
new, warm: 48s
The data load in the full test build behaves in a cold fashion. It's
typical for https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/ to
run this compute stats step for 9 or 10 minutes. With this change, this
will come down to about 2 minutes.
Change-Id: Ifb080f2552b9dbe304ecadd6e52429214094237d
Reviewed-on: http://gerrit.cloudera.org:8080/8354
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins