94 Commits

Author SHA1 Message Date
Tim Armstrong
3f0989a4fc IMPALA-7811: optionally count JVM heap towards process mem limit
Adds a flag --mem_limit_includes_jvm that alters memory accounting to
include the amount of memory we think that the JVM is likely to use.
By default this flag is false, so behaviour is unchanged.

We're not ready to change the default but I want to check this in to
enable experimentation.

Two metrics are counted towards the process limit:
* The maximum JVM heap size. We count this because the JVM memory
  consumption can expand up to this threshold at any time.
* JVM non-heap committed memory. This can be a non-trivial amount of
  memory (e.g. I saw 150MB on one production cluster). There isn't a
  hard upper bound on this memory that I know of, but it should not
  grow rapidly.

This requires adjustments in a couple of other places:
* Admission control previously assumed that all of the process memory
  limit was available to queries (an assumption that is not strictly
  true because of untracked memory, etc, but close enough). However,
  the JVM heap makes a large part of the process limit unusable to
  queries, so we should only admit up to "process limit - max JVM heap
  size" per node.
* The buffer pool is now a percentage of the remaining process limit
  after the JVM heap, instead of the total process limit.
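
The accounting described above can be sketched roughly as follows. This is
an illustration only, not Impala's implementation: the function name, the
parameter names, and the 80% buffer pool share are assumptions made for
the example.

```python
def memory_budget(process_limit, jvm_max_heap, jvm_non_heap_committed,
                  buffer_pool_percent=80, includes_jvm=True):
    """Hypothetical sketch of the accounting; all names are illustrative."""
    if not includes_jvm:
        # Flag off: the JVM is not counted, so nothing changes.
        jvm_counted = 0
        admittable = process_limit
    else:
        # Max heap size plus non-heap committed memory count towards the limit.
        jvm_counted = jvm_max_heap + jvm_non_heap_committed
        # Admit only up to "process limit - max JVM heap size" per node.
        admittable = max(process_limit - jvm_max_heap, 0)
    # The buffer pool is a percentage of what remains after the JVM heap.
    buffer_pool = admittable * buffer_pool_percent // 100
    return {
        "jvm_counted": jvm_counted,
        "admittable": admittable,
        "buffer_pool": buffer_pool,
    }


GB = 1024 ** 3
MB = 1024 ** 2
budget = memory_budget(8 * GB, 1 * GB, 150 * MB)
print(budget)
```

With an 8GB process limit and -Xmx1g, the sketch leaves 7GB admittable to
queries, and the buffer pool is sized from that 7GB rather than from 8GB.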

Currently, end-to-end tests fail if run with this flag for two reasons:
* The default JVM heap size is 1/4 of physical memory, which means that
  essentially all of the process memory limit is consumed by the JVM
  heaps when we are running 3 impala daemons per host, unless -Xmx is
  explicitly set.
* If the heap size is limited to 1-2GB like below, then most tests pass
  but TestInsert.test_insert_large_string fails because IMPALA-4865
  lets it create giant strings that eat up all the JVM heap.

  start-impala-cluster.py \
      --impalad_args=--mem_limit_includes_jvm=true --jvm_args="-Xmx1g"

Testing:
Add a custom cluster test that uses the new option and validates the
memory consumption values.

Change-Id: I39dd715882a32fc986755d573bd46f0fd9eefbfc
Reviewed-on: http://gerrit.cloudera.org:8080/10928
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-12-04 08:20:34 +00:00
Fredy Wijaya
76842acc34 IMPALA-7824: INVALIDATE METADATA should not hang when Sentry is unavailable
Before this patch, running INVALIDATE METADATA when Sentry is
unavailable could cause an Impala query to hang. The PolicyReader thread
in SentryProxy serves two use cases: a background thread that
periodically refreshes the Sentry policy, and a synchronous operation
for INVALIDATE METADATA. For the background
thread, we need to swallow any exception thrown while refreshing the
Sentry policy in order to not kill the background thread. For a
synchronous reset operation, such as INVALIDATE METADATA, swallowing
an exception causes the Impala catalog to wait indefinitely for
authorization catalog objects that never get processed due to Sentry
being unavailable. The patch stops swallowing exceptions in
INVALIDATE METADATA and instead returns the exception to the
caller.
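
The distinction between the two call paths can be sketched as below. This
is a simplified Python illustration, not the Java code in SentryProxy; the
class shape, method names, and the fetch_policy callable are hypothetical.

```python
import logging


class PolicyReader:
    """Hypothetical stand-in for the refresh logic described above."""

    def __init__(self, fetch_policy):
        self._fetch_policy = fetch_policy  # callable that may raise
        self.policy = None

    def _refresh(self, swallow_errors):
        try:
            self.policy = self._fetch_policy()
        except Exception:
            if not swallow_errors:
                # Synchronous path: let the caller see the failure.
                raise
            # Background path: log and keep the thread alive.
            logging.exception("policy refresh failed; will retry later")

    def background_tick(self):
        """Periodic refresh: must never kill the background thread."""
        self._refresh(swallow_errors=True)

    def reset(self):
        """Synchronous refresh, as used by INVALIDATE METADATA."""
        self._refresh(swallow_errors=False)
```

In this sketch a failing fetch leaves background_tick() silent apart from
a log entry, while reset() raises, so the caller fails fast rather than
waiting on objects that will never arrive.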

Testing:
- Ran all FE tests
- Added a new E2E test
- Ran all E2E authorization tests

Change-Id: Icff987a6184f62a338faadfdc1a0d349d912fc37
Reviewed-on: http://gerrit.cloudera.org:8080/11897
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-11-08 07:05:09 +00:00
Vuk Ercegovac
97f028299c IMPALA-7622: adds profile metrics for incremental stats
Reapplies change after fixing where frontend profile is placed in runtime
profile.

When computing incremental statistics by fetching the stats directly
from catalogd, a potentially expensive RPC is made from the impalad
coordinator to catalogd. This change adds metrics to the frontend
section of the profile to track how long the request takes, the size
of the compressed bytes received, and the number of partitions received.

The profile for a 'compute incremental ...' command on a table with
no statistics looks like this:

Frontend:
     - StatsFetch.CompressedBytes: 0
     - StatsFetch.TotalPartitions: 24
     - StatsFetch.NumPartitionsWithStats: 0
     - StatsFetch.Time: 26ms

And the profile looks as follows when the table has stats, so the stats
are fetched:

Frontend:
     - StatsFetch.CompressedBytes: 24622
     - StatsFetch.TotalPartitions: 23
     - StatsFetch.NumPartitionsWithStats: 23
     - StatsFetch.Time: 14ms
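
A test or script could pull these counters out of the profile text with
something like the sketch below. The helper name is hypothetical and the
regular expression only assumes the "- Name: value" layout shown above.

```python
import re

# Matches lines such as "     - StatsFetch.Time: 14ms".
COUNTER_RE = re.compile(r"^\s*-\s*([\w.]+):\s*(\S+)\s*$")


def parse_counters(profile_text):
    """Return a dict of counter name -> raw string value."""
    counters = {}
    for line in profile_text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = match.group(2)
    return counters


SAMPLE = """Frontend:
     - StatsFetch.CompressedBytes: 24622
     - StatsFetch.TotalPartitions: 23
     - StatsFetch.NumPartitionsWithStats: 23
     - StatsFetch.Time: 14ms
"""

print(parse_counters(SAMPLE))
```

Values are kept as strings because some counters carry units (for
example "14ms").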

Testing:
- manual inspection
- e2e test to check the profile

Change-Id: I94559a749500d44aa6aad564134d55c39e1d5273
Reviewed-on: http://gerrit.cloudera.org:8080/11670
Reviewed-by: Tianyi Wang <twang@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-12 23:44:42 +00:00
Adam Holley
21f521a7c2 IMPALA-7554: Update custom cluster tests to have new logs for sentry
This patch adds the ability to create a new log for each spawn of the
sentry service. This will enable better troubleshooting for the
custom cluster tests that restart the sentry service.

Testing:
- Ran all custom cluster tests.

Change-Id: I6e538af7fd6e6ea21dc3f4442bdebf3b31558516
Reviewed-on: http://gerrit.cloudera.org:8080/11624
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-12 01:00:56 +00:00
Vuk Ercegovac
d918b2aeb5 Revert "IMPALA-7622: adds profile metrics when fetching incremental stats"
Breaks downstream dependence on profile (1/2 of changes).

This reverts commit 235748316c.

Change-Id: I80b4c0e4b8487572285ac788ab0195896f221842
Reviewed-on: http://gerrit.cloudera.org:8080/11551
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-01 21:33:43 +00:00
Vuk Ercegovac
235748316c IMPALA-7622: adds profile metrics when fetching incremental stats
When computing incremental statistics by fetching the stats directly
from catalogd, a potentially expensive RPC is made from the impalad
coordinator to catalogd. This change adds metrics to the frontend
section of the profile to track how long the request takes, the size
of the compressed bytes received, and the number of partitions received.

The profile for a 'compute incremental ...' command on a table with
no statistics looks like this:

Frontend:
     - StatsFetch.CompressedBytes: 0
     - StatsFetch.TotalPartitions: 24
     - StatsFetch.NumPartitionsWithStats: 0
     - StatsFetch.Time: 26ms

And the profile looks as follows when the table has stats, so the stats
are fetched:

Frontend:
     - StatsFetch.CompressedBytes: 24622
     - StatsFetch.TotalPartitions: 23
     - StatsFetch.NumPartitionsWithStats: 23
     - StatsFetch.Time: 14ms

Testing:
- manual inspection
- e2e test to check the profile

Change-Id: Ic9b268548c7a98c751eb99855ee08313d1d5a903
Reviewed-on: http://gerrit.cloudera.org:8080/11534
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-28 11:22:53 +00:00
Tim Armstrong
e83fe23a5f IMPALA-7632: fix erasure coding build for custom cluster tests
Fix tests to always pass query options via the query_options
parameter.

Modified the infrastructure to fail on non-erasure-coding builds if
tests pass in default query options in the wrong way.

Skip a restart test that makes assumptions about scheduling that EC
seems to break.

Testing:
Ran core tests with erasure coding enabled.

Change-Id: I4d809faedc0c45417519f13c73559efb6c54154e
Reviewed-on: http://gerrit.cloudera.org:8080/11536
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-28 01:23:41 +00:00
Adam Holley
48640b5dfa IMPALA-7456: Deprecate file-based authorization
This patch simply adds a warning message to the log when the
authorization_policy_file run-time flag is used.  Sentry has
deprecated the use of policy files and they do not support
user level privileges which are required for object ownership.
Removal of policy files is tracked in SENTRY-1922.

Test:
- Added custom cluster test to validate logs
- Ran all custom cluster tests

Change-Id: Ibbb13f3ef1c3a00812c180ecef022ea638c2ebc7
Reviewed-on: http://gerrit.cloudera.org:8080/11502
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-25 23:03:27 +00:00
Adam Holley
c5dc6ded68 IMPALA-7537: REVOKE GRANT OPTION regression
This patch fixes several issues around granting and revoking of
privileges.  This includes:
- REVOKE ALL ON SERVER where the privilege has the grant option was
  removing the privilege from the cache but not from Sentry.
- With the addition of the grantoption to the name in the catalog
  object, refactoring was required to make grants and revokes work
  correctly.

Assertions with regard to granting and revoking:
- If there is a privilege that has the grant option, that privilege
  can be revoked simply with "REVOKE privilege..." or the grant option
  can be removed with "REVOKE GRANT OPTION ON..."
- We should not limit the privilege being revoked simply because it
  has the grant option.
- If a privilege already exists without the grant option, granting the
  privilege with the grant option should add the grant option to it.
- If a privilege already exists with the grant option, granting the
  privilege without the grant option will not change anything as the
  expectation is if you want to remove the grant option, you should
  explicitly use the "REVOKE GRANT OPTION ON...".
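
The four assertions above can be modelled with a toy in-memory store.
This is a sketch for illustration only; the class and method names are
hypothetical and do not correspond to Impala or Sentry code.

```python
class PrivilegeStore:
    """Toy model: privilege name -> whether it carries the grant option."""

    def __init__(self):
        self._privs = {}

    def grant(self, name, with_grant_option=False):
        # Granting with the option upgrades an existing privilege; granting
        # without it never strips the option from an existing one.
        self._privs[name] = self._privs.get(name, False) or with_grant_option

    def revoke(self, name):
        # A plain REVOKE removes the privilege whether or not it has the
        # grant option.
        self._privs.pop(name, None)

    def revoke_grant_option(self, name):
        # REVOKE GRANT OPTION keeps the privilege but drops the option.
        if name in self._privs:
            self._privs[name] = False

    def has(self, name):
        return name in self._privs

    def has_grant_option(self, name):
        return self._privs.get(name, False)
```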

Testing:
- Added new grant/revoke tests that validate cache and Sentry refresh
- Ran all FE, E2E, and custom-cluster tests.

Change-Id: I3be5c8f15e9bc53e9661347578832bf446abaedc
Reviewed-on: http://gerrit.cloudera.org:8080/11483
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-25 22:21:57 +00:00
Tim Armstrong
16f9437b4b IMPALA-7589: default query options for custom cluster
The bug that caused the erasure coding test failure was that the
default query options specified by the test overrode the allow_erasure_coded_files
option that was added by the custom cluster test infrastructure when running
erasure coded tests.
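
One way to avoid this class of bug is to merge the two sets of options
instead of letting one replace the other. The sketch below is not the
actual fix in this patch; the function name and the "k=v,k=v" output
format are assumptions made for the example.

```python
def build_default_query_options(test_options, infra_options):
    """Merge option dicts; infrastructure options are applied last."""
    merged = dict(test_options)
    # Options added by the test infrastructure must survive whatever the
    # individual test specifies.
    merged.update(infra_options)
    return ",".join(f"{key}={value}" for key, value in sorted(merged.items()))


print(build_default_query_options(
    {"mem_limit": "1g"}, {"allow_erasure_coded_files": "true"}))
```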

Testing:
Manually ran a custom cluster test with and without ERASURE_CODING=true
and with --capture=no and confirmed the right arguments were passed
to start-impala-cluster.py.

Change-Id: I14f60ea8746657a731e48850b0e48300a2b7c66d
Reviewed-on: http://gerrit.cloudera.org:8080/11463
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-24 20:02:40 +00:00
Adam Holley
23f5338bf6 Revert "Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER""
The problem was caused by an update in Hive that changed notifications.
HIVE-15180 was added but was incomplete and resulted in the break.
HIVE-17747 fixed the issue by properly creating the messages.

Change-Id: I4b9276c36bf96afccd7b8ff48803a30b47062c3d
Reviewed-on: http://gerrit.cloudera.org:8080/11466
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-20 00:51:28 +00:00
Thomas Tauber-Marshall
23da624113 Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
This patch has been causing a large number of build failures. Revert
it until we figure out why.

Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442
Reviewed-on: http://gerrit.cloudera.org:8080/11456
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-18 02:11:48 +00:00
Adam Holley
e5b424ba4e IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
This patch adds calls to automatically create or remove owner
privileges in the catalog based on the statement.  This is similar to
the existing pattern where after privileges are granted in Sentry,
they are created in the catalog directly instead of pulled from
Sentry.

When object ownership is enabled:
CREATE DATABASE will grant the user OWNER privileges to that database.
ALTER DATABASE SET OWNER will transfer the OWNER privileges to the
new owner.
DROP DATABASE will revoke the OWNER privileges from the owner.
This will apply to DATABASE, TABLE, and VIEW.

Example:
If ownership is enabled, when a table is created, the creator is the
owner, and Sentry will create owner privileges for the created table so
the user can continue working with it without waiting for Sentry
refresh.  Inserts will be available immediately.

Testing:
- Created new custom cluster tests for object ownership

Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Reviewed-on: http://gerrit.cloudera.org:8080/11314
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-14 06:03:44 +00:00
Todd Lipcon
b986f2a8bb IMPALA-7510. Support principals/privileges with LocalCatalog
This enables support for Sentry authorization when LocalCatalog is
enabled. The design is detailed in a change to the comment on
CatalogdMetaProvider, but to recap it briefly here:

At a high level, this patch takes the approach of duplicating the "v1"
catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog
daemon publishes complete objects into the statestore topic, and the
impalad fully replicates them locally.

I took this approach rather than trying to do fine-grained caching and
invalidation for the following reasons:

- The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude
  smaller than table metadata. So, the benefit of fine-grained caching
  and eviction is not as great.

- The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined
  with relationships between them and backwards mappings maintained from
  groups back to principals. This logic is implemented by the
  AuthorizationPolicy class. Implementing similar mapping in a
  fine-grained caching approach would be a reasonable amount of work.

- This bit of code is under some current flux as others are working on
  implementing more fine grained permissioning. Thus, trying to
  duplicate the logic in a "fetch-on-demand" implementation might turn
  out to be chasing somewhat of a moving target.

In order to take this approach, the patch is organized as follows:

- refactored some of the role/principal removal logic from ImpaladCatalog
  into AuthorizationPolicy. This makes it easier to perform the similar
  "subscribe" with less duplicate code.

- changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2
  catalogs in addition to v1.

- passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and
  added an AuthorizationPolicy member there. This member is maintained
  when we see PRINCIPAL and PRIVILEGE objects come via the catalog
  updates.

- had to implement LocalCatalog.isReady() to ensure that we don't allow
  user access until the first topic update has been consumed.

- additionally had to copy some other code from ImpaladCatalog to
  protect against various races -- we need a CatalogDeltaLog as well as
  careful sequencing of the order in which the objects apply.

With this patch and the following one to enable UDF support, I was able
to run the tests in tests/authorization successfully with LocalCatalog
enabled.

Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Reviewed-on: http://gerrit.cloudera.org:8080/11358
Reviewed-by: Todd Lipcon <todd@apache.org>
Tested-by: Todd Lipcon <todd@apache.org>
2018-09-06 02:39:08 +00:00
Todd Lipcon
8dcf54aee2 IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:

- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal - they only include the object's name, but no
  metadata. This new behavior is conditional on a new flag
  --catalog_topic_mode. The default mode is to keep the old style, but
  it can be configured to mixed (support both v1 and v2) or v2-only.

- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad will subscribe
  to one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the
  new entries and vice versa.

- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate
  cached metadata as appropriate.
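
The prefix scheme can be illustrated with a small filter. This is a
sketch only: the function name and the key layout after the prefix
(for example "TABLE:db.t") are assumptions, and only the '1:' and '2:'
prefixes come from the description above.

```python
def entries_for_subscriber(topic_entries, use_local_catalog):
    """Keep only the topic entries whose key prefix matches the mode."""
    # v1 (old-style, full objects) uses '1:'; v2 (minimal) uses '2:'.
    prefix = "2:" if use_local_catalog else "1:"
    return {
        key: value
        for key, value in topic_entries.items()
        if key.startswith(prefix)
    }


topic = {
    "1:TABLE:db.t": "full metadata",
    "2:TABLE:db.t": "name only",
    "1:DATABASE:db": "full metadata",
}
print(entries_for_subscriber(topic, use_local_catalog=True))
```

Because each impalad filters on a single prefix, entries published for
the other catalog version are simply never seen.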

This patch includes some basic unit tests. I also did some manual
testing by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2
within a short period of time (the statestore topic update frequency).

Existing end-to-end tests cover these code paths pretty thoroughly:

- if we didn't automatically invalidate the cache on a coordinator
  in response to DDL operations, then any test which expects to
  "read its own writes" (eg access a table after creating one)
  would fail
- if we didn't propagate invalidations via the statestore, then
  all of the tests that use sync_ddl would fail.

I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation
code paths in the new code and verified that tests failed until I
re-introduced them. Along the way I also improved test_ddl so that, when
this code is broken, it properly fails with a timeout. It also has a bit
of expanded coverage for both the SYNC_DDL and non-SYNC cases.

I also wrote a new custom-cluster test for LocalCatalog that verifies
a few of the specific edge cases like detecting catalogd restart,
SYNC_DDL behavior in mixed mode, etc.

One notable exception here is the implementation of INVALIDATE METADATA.
This turned out to be complex to implement, so I left a lengthy TODO
describing the issue and filed a JIRA.

Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Reviewed-on: http://gerrit.cloudera.org:8080/11280
Reviewed-by: Todd Lipcon <todd@apache.org>
Tested-by: Todd Lipcon <todd@apache.org>
2018-09-05 22:51:15 +00:00
Vuk Ercegovac
72ee4a4275 IMPALA-7425: Change incremental stats to pull from catalogd.
Currently, incremental stats can consume a substantial
amount of metadata memory (per table, partition, column).
This metadata is transmitted from catalogd to all coordinators.
As a result, memory is used for all loaded tables that use
incremental stats all the time at all coordinators. A consequence
is that coordinators and catalogd die from OOM more often
when incremental stats are used and more network bandwidth is used.

This change removes incremental stats from impalads. These stats
are only needed when computing incremental statistics and merging
new results with the existing results. They are not used by queries.
As a result, the change requires that coordinators fetch
incremental stats directly from catalogd when computing incremental stats.
In addition, catalogd no longer sends incremental stats to coordinators
via the statestore.

The option is enabled by setting a new flag, --pull_incremental_statistics,
on the catalogd and all impalad coordinators.

Testing:
  - manual testing
  - added end-to-end tests with --pull_incremental_statistics enabled
    for the compute-stats-incremental.test
  - added fe CatalogTest for new catalogd service method
  - passes exhaustive tests when --pull_incremental_statistics is enabled
    and disabled

Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Reviewed-on: http://gerrit.cloudera.org:8080/11193
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-05 20:49:54 +00:00
Vuk Ercegovac
c692e5cc9e IMPALA-7408: add a debugging flag to disable reading fs data from catalogd
Add the flag: --disable_catalog_data_ops_debug_only that skips loading
files from the file-system from catalogd. The flag is false by default
and is hidden. Its intent is to avoid time-consuming accesses to
the file-system when debugging metadata issues and the file-system
contents are not available. For example, a recent ~18 GB catalog
takes 10 hours to load without the flag set vs. 1 hour to load with
the flag. The extra time comes from accessing the file-system, failing,
and logging exceptions.

This flag specifically disables copying jars from the fs when loading
Java functions and it skips loading avro schema files. Additional cases
can be added under this flag if more are needed.

Testing:
- manually confirmed that jars and avro schema files are skipped.
- added a test to check the same behavior in a custom cluster test.
- ran core tests.

Change-Id: I15789fb489b285e2a6565025eb17c63cdc726354
Reviewed-on: http://gerrit.cloudera.org:8080/11191
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-15 01:58:18 +00:00
Todd Lipcon
4aec50484a IMPALA-7308. Support Avro tables in LocalCatalog
This adds support for loading Avro-formatted tables in LocalCatalog. In
the case that the table properties indicate a table is Avro-formatted,
the semantics are identical to the existing catalog implementation:

- if an explicit avro schema is specified, it overrides the schema
  provided by the HMS
- if no explicit avro schema is specified, one is inferred, and then the
  inferred schema takes the place of the one provided by the HMS (thus
  promoting columns like TINYINT to INT)
- on COMPUTE STATS, if any discrepancy is discovered between the HMS
  schema and the inferred schema, an error is emitted.

The semantics for LocalCatalog are slightly different in the case of
tables which have not been configured as Avro format on the table level:

The existing implementation has the behavior that, when a table is
loaded, all partitions are inspected, and, if any partition is
discovered with Avro format, the above rules are applied. This has some
very unexpected results, described in an earlier email to
dev@impala.apache.org [1]. To summarize that email thread, the existing
behavior was decided to be unintuitive and inconsistent with Hive.
Additionally, this behavior requires loading all partitions up-front,
which gets in the way of the goal of lazy/granular metadata loading in
LocalCatalog.

Thus, the LocalCatalog implementation differs as follows:

- the "schema override" behavior ONLY occurs if the Avro file format has
  been selected at a table level.

- if an Avro partition is added to a non-Avro table, and that partition
  has a schema that isn't compatible with the table's schema, an error
  will occur on read.

The thread additionally discusses adding an error message on "alter" to
prevent users from adding an Avro partition to a table with an
incompatible schema. To keep the scope of this patch minimal, that is
not yet implemented here. I filed IMPALA-7309 to change the behavior of
the existing catalog implementation to match.

A new test verifies the behavior, set to 'xfail' when running on the
existing catalog implementation.

[1] https://lists.apache.org/thread.html/fb68c54bd66a40982ee17f9f16f87a4112220a5df035a311bda310f1@%3Cdev.impala.apache.org%3E

Change-Id: Ie4b86c8203271b773a711ed77558ec3e3070cb69
Reviewed-on: http://gerrit.cloudera.org:8080/10970
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
2018-08-07 17:38:04 +00:00
Michael Ho
8d7f638654 IMPALA-7212: Removes --use_krpc flag and remove old DataStream services
This change removes the flag --use_krpc which allows users
to fall back to using Thrift based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.

As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementation of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift based implementation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
is disabled.

Testing done: core debug build.

Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Reviewed-on: http://gerrit.cloudera.org:8080/10835
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-07-24 02:36:50 +00:00
Tim Armstrong
e07fbc1b63 IMPALA-7185: low statestore custom cluster interval
This changes the default statestore interval for the custom cluster
tests. This can reduce the time taken for the cluster to start and
metadata to load. On some tests this resulted in saving 5+ seconds
per test. Overall it shaved around a minute off the custom cluster
tests.

Testing:
Ran 10 iterations of the tests.

Change-Id: Ia5d1612283ff420d95b0dd0ca5a2a67f56765f79
Reviewed-on: http://gerrit.cloudera.org:8080/10845
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-07-02 22:06:38 +00:00
Taras Bobrovytsky
8060f4d50e IMPALA-7102 (Part 1): Disable reading of erasure coding by default
In this patch we add a query option ALLOW_ERASURE_CODED_FILES, that
allows us to enable or disable the support of erasure coded files. Even
though Impala should be able to handle HDFS erasure coded files already,
this feature hasn't been tested thoroughly yet. Also, Impala lacks
metrics, observability and DDL commands related to erasure coding. This
is a query option instead of a startup flag because we want to make it
possible for advanced users to enable the feature.

We may also need a follow-on patch to disable the write path with
this flag.

Cherry-picks: not for 2.x

Change-Id: Icd3b1754541262467a6e67068b0b447882a40fb3
Reviewed-on: http://gerrit.cloudera.org:8080/10646
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-06-29 23:26:35 +00:00
Michael Ho
3b72a6c0da IMPALA-2567: Enable KRPC by default
This change enables the switch to use KRPC by default.
This change also fixes a bug in KrpcDataStreamMgr to
check if maintenance thread was started before calling
Join() on it. This shows up in BE tests as the maintenance
thread isn't started in them.

Testing done: exhaustive build.

Change-Id: Iae736c1c1351758969b4d84e34fc5b2d048660a0
Reviewed-on: http://gerrit.cloudera.org:8080/9461
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Impala Public Jenkins
2018-03-05 08:57:40 +00:00
Lars Volker
a8fc9f0fc7 IMPALA-6508: add KRPC test flag
This change adds a flag "--use_krpc" to start-impala-cluster.py. The
flag is currently passed as an argument to the impalad daemon. In the
future it will also enable KRPC for the catalogd and statestored
daemons.

This change also adds a flag "--test_krpc" to pytest. When running tests
using "impala-py.test --test_krpc", the test cluster will be started
by passing "--use_krpc" to start-impala-cluster.py (see above).

This change also adds a SkipIf to skip tests based on whether the
cluster was started with KRPC support or not.

- SkipIf.not_krpc can be used to mark a test that depends on KRPC.
- SkipIf.not_thrift can be used to mark a test that depends on Thrift
  RPC.

This change adds a meta test to make sure that the new SkipIf decorators
work correctly. The test should be removed as soon as real tests have
been added with the new decorators.

Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Reviewed-on: http://gerrit.cloudera.org:8080/9291
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins
2018-02-16 09:26:01 +00:00
Vuk Ercegovac
6a2b7a64fb IMPALA-4704: Turns on client connections when local catalog initialized.
Currently, impalad starts beeswax and hs2 servers even if the
catalog has not yet been initialized. As a result, client
connections see an error message stating that the impalad
is not yet ready.

This patch changes the impalad startup sequence to wait
until the catalog is received before opening beeswax and hs2 ports
and starting their servers.

Testing:
- python e2e tests that start a cluster without a catalog
  and check that client connections are rejected as expected.

Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Reviewed-on: http://gerrit.cloudera.org:8080/8202
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-13 21:14:14 +00:00
Matthew Jacobs
7a1ff1e5e9 IMPALA-5539: Fix Kudu timestamp with -use_local_tz_for_unix_ts
The -use_local_tz_for_unix_timestamp_conversion flag exists
to specify if TIMESTAMPs should be interpreted as localtime
or UTC when converting to/from Unix time via builtins:
  from_unixtime(bigint unixtime)
  unix_timestamp(string datetime[, ...])
  unix_timestamp(timestamp datetime)

However, the KuduScanner was calling into code that, when
the gflag above was set, interpreted Unix times as local
time.  Unfortunately the write path (KuduTableSink) and some
FE TIMESTAMP code (see KuduUtil.java) did not have this
behavior, i.e. we were handling the gflag inconsistently.
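
The effect of the flag on a single conversion can be shown with a small
Python sketch. This is not Impala code: the function name is
hypothetical, and a fixed UTC-8 offset stands in for the host's local
time zone.

```python
from datetime import datetime, timedelta, timezone


def from_unixtime_sketch(unix_seconds, local_tz, use_local_tz):
    """Convert Unix time to a naive wall-clock timestamp."""
    # With the flag set, the result is expressed in local time;
    # otherwise it is expressed in UTC.
    tz = local_tz if use_local_tz else timezone.utc
    return datetime.fromtimestamp(unix_seconds, tz).replace(tzinfo=None)


LOCAL = timezone(timedelta(hours=-8))  # assumed local zone for the example
print(from_unixtime_sketch(0, LOCAL, use_local_tz=False))
print(from_unixtime_sketch(0, LOCAL, use_local_tz=True))
```

If one code path applies the flag and another does not, the same stored
value yields timestamps that differ by the local UTC offset, which is the
kind of inconsistency described above.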

Tests:
* Adds a custom cluster test to run Kudu test cases with
  -use_local_tz_for_unix_timestamp_conversion.
* Adds tests for the new builtin
  unix_micros_to_utc_timestamp() which run in a custom
  cluster test (added test_local_tz_conversion.py) as well
  as in the regular tests (added to test_exprs.py).

Change-Id: I423a810427353be76aa64442044133a9a22cdc9b
Reviewed-on: http://gerrit.cloudera.org:8080/7311
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-07-19 22:17:13 +00:00
Dimitris Tsirogiannis
e2c53a8bdf IMPALA-5147: Add the ability to exclude hosts from query execution
This commit introduces a new startup option, termed 'is_executor',
that determines whether an impalad process can execute query fragments.
The 'is_executor' option determines if a specific host will be included
in the scheduler's backend configuration and hence included in
scheduling decisions.
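
Conceptually, the scheduler's view of the cluster becomes a filtered list
of hosts. The sketch below only illustrates that idea; the Backend class
and the executor_backends function are hypothetical names, not the
scheduler's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    host: str
    is_executor: bool


def executor_backends(backends):
    """Return only the hosts eligible to run query fragments."""
    return [backend for backend in backends if backend.is_executor]


cluster = [
    Backend("host-1", is_executor=False),  # excluded from scheduling
    Backend("host-2", is_executor=True),
    Backend("host-3", is_executor=True),
]
print([backend.host for backend in executor_backends(cluster)])
```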

Testing:
- Added a custom cluster test.
- Added a new scheduler test.

Change-Id: I5d2ff7f341c9d2b0649e4d14561077e166ad7c4d
Reviewed-on: http://gerrit.cloudera.org:8080/6628
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-26 01:45:40 +00:00
Dimitris Tsirogiannis
296df3c826 IMPALA-4041: Limit catalog and admission control updates to coordinators
With this commit we add the ability to limit catalog updates to a
limited set of coordinator nodes. A new startup option, termed
'is_coordinator' is added to indicate if a node is a coordinator.
Coordinators accept connections through HS2 and Beeswax interfaces
and can also participate in query execution. Non-coordinator nodes
do not receive catalog updates from the statestore, do not initialize
a query scheduler and cannot accept Beeswax and HS2 client connections.

Testing:
- Added a custom cluster test that launches a cluster in which the
number of coordinators is less than the cluster size and runs a number
of smoke queries.
- Successfully ran exhaustive tests.

Change-Id: I5f2c74abdbcd60ac050efa323616bd41182ceff3
Reviewed-on: http://gerrit.cloudera.org:8080/6344
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
2017-03-28 22:27:25 +00:00
David Knupp
f590bc0da6 IMPALA-4750: Rename test infra classes so they don't mimic test classes.
This patch addresses warning messages from pytest re: the imported
TestMatrix, TestVector, and TestDimension classes, which were being
collected as potential test classes. The fix was to simply prepend
the class names with "Impala":

git grep -l 'TestDimension' | xargs \
    sed -i 's/TestDimension/ImpalaTestDimension/g'

git grep -l 'TestMatrix' | xargs \
    sed -i 's/TestMatrix/ImpalaTestMatrix/g'

git grep -l 'TestVector' | xargs \
    sed -i 's/TestVector/ImpalaTestVector/g'

The tests all passed in an exhaustive run on the upstream jenkins
server:

http://jenkins.impala.io:8080/view/Utility/job/pre-review-test/8/

Change-Id: I06b7bc6fd99fbb637a47ba376bf9830705c1fce1
Reviewed-on: http://gerrit.cloudera.org:8080/5794
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2017-01-26 23:40:22 +00:00
Lars Volker
8b7f876649 IMPALA-4722: Disable log caching in test_scratch_disk
test_scratch_disk fails sporadically when trying to assert the presence
of log messages. This is probably caused by log caching, since after
such failures the log files do contain the lines in question.

I manually tested this by running the tests repeatedly for 2 days (10k
runs).

To make future diagnosis of similar problems easier, this change also
adds more output to assert_impalad_log_contains().
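
As a rough illustration of the extra diagnostic output described above,
a log assertion might report what it actually found when it fails. This
is a sketch only: assert_log_contains(), its parameters and its message
format are hypothetical and are not the real
assert_impalad_log_contains() implementation.

```python
import re


def assert_log_contains(log_path, pattern, expected_count=1):
    """Assert that `pattern` matches exactly `expected_count` log lines.

    Hypothetical helper: on failure it reports the match count and the
    tail of the log so that sporadic failures are easier to diagnose.
    """
    regex = re.compile(pattern)
    with open(log_path) as log_file:
        lines = log_file.read().splitlines()
    matches = [line for line in lines if regex.search(line)]
    if len(matches) != expected_count:
        tail = "\n".join(lines[-5:])
        raise AssertionError(
            "Expected %d line(s) matching %r in %s, found %d.\n"
            "Last lines of the log:\n%s"
            % (expected_count, pattern, log_path, len(matches), tail))
    return matches
```

Returning the matching lines lets a caller make further assertions on
them without re-reading the file.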

Change-Id: I9f21284338ee7b4374aca249b6556282b0148389
Reviewed-on: http://gerrit.cloudera.org:8080/5669
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-01-12 18:58:48 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
   website, or add that text where it is missing.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Taras Bobrovytsky
609b80410e Clean up Python test import statements
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.

Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-07-15 23:26:18 +00:00
Michael Brown
067af1957c IMPALA-3614: work around pytest bugs causing custom cluster test skips
All versions of pytest contain various bugs regarding test marking
(including skips) when tests are both:

1. class-level marked
2. inherited

More info is available in IMPALA-3614 and IMPALA-2943, but the gist is
that it's possible for some tests to be skipped when they shouldn't be.
This is happening pretty badly with the custom cluster tests, because
CustomClusterTestSuite has a class level skipif mark.

The easiest workaround for now is to remove the pytest skipif mark in
CustomClusterTestSuite and skip using explicit pytest.skip() in the
setup_class() method. Some CustomClusterTestSuite children implemented
their own setup_* methods, and I made some adjustments to them both to
clean them up and implement proper parent method calling via super().
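
The shape of the workaround can be sketched with the standard library.
Note the substitution: unittest's setUpClass() and SkipTest stand in
here for pytest's setup_class() and pytest.skip(), and all class names
and the SKIP_REASON attribute are hypothetical. The point is only that
the skip decision is made inside class setup and that children reach it
through super().

```python
import os
import unittest


class ClusterSuiteBase(unittest.TestCase):
    """Hypothetical base suite; SKIP_REASON stands in for an env check."""

    SKIP_REASON = None  # set to a string to skip every test in the class

    @classmethod
    def setUpClass(cls):
        # Skip explicitly during class setup instead of relying on a
        # class-level skip mark that subclasses would inherit.
        if cls.SKIP_REASON:
            raise unittest.SkipTest(cls.SKIP_REASON)
        super().setUpClass()
        cls.cluster_started = True


class ChildSuite(ClusterSuiteBase):
    @classmethod
    def setUpClass(cls):
        # Call the parent setup first so its skip logic still applies.
        super().setUpClass()
        cls.extra_state = "ready"

    def test_cluster_is_up(self):
        self.assertTrue(self.cluster_started)
        self.assertEqual(self.extra_state, "ready")


class SkippedChildSuite(ChildSuite):
    SKIP_REASON = "custom cluster tests are not supported here"


def run_suite(test_class):
    """Run one test class quietly and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_class)
    with open(os.devnull, "w") as sink:
        return unittest.TextTestRunner(stream=sink).run(suite)
```

A child that overrides setup without calling super() would bypass the
skip check entirely, which is why the parent call matters.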

Testing:

I ran the following combinations of all the custom cluster tests:

DEBUG   / HDFS  / core
RELEASE / HDFS  / exhaustive
DEBUG   / LOCAL / core
DEBUG   / S3    / core

Before, we'd get situations in which most of the tests were skipped.
Consider the RELEASE/HDFS/exhaustive situation:

  custom_cluster/test_admission_controller.py .....
  custom_cluster/test_alloc_fail.py ss
  custom_cluster/test_breakpad.py sssss
  custom_cluster/test_delegation.py sss
  custom_cluster/test_exchange_delays.py ss
  custom_cluster/test_hdfs_fd_caching.py s
  custom_cluster/test_hive_parquet_timestamp_conversion.py ss
  custom_cluster/test_insert_behaviour.py ss
  custom_cluster/test_legacy_joins_aggs.py s
  custom_cluster/test_parquet_max_page_header.py s
  custom_cluster/test_permanent_udfs.py sss
  custom_cluster/test_query_expiration.py sss
  custom_cluster/test_redaction.py ssss
  custom_cluster/test_s3a_access.py s
  custom_cluster/test_scratch_disk.py ssss
  custom_cluster/test_session_expiration.py s
  custom_cluster/test_spilling.py ssss
  authorization/test_authorization.py ss
  authorization/test_grant_revoke.py s

Now, more tests run appropriately:

  custom_cluster/test_admission_controller.py .....
  custom_cluster/test_alloc_fail.py ss
  custom_cluster/test_breakpad.py sssss
  custom_cluster/test_delegation.py ...
  custom_cluster/test_exchange_delays.py ss
  custom_cluster/test_hdfs_fd_caching.py .
  custom_cluster/test_hive_parquet_timestamp_conversion.py ..
  custom_cluster/test_insert_behaviour.py ..
  custom_cluster/test_kudu_not_available.py .
  custom_cluster/test_legacy_joins_aggs.py .
  custom_cluster/test_parquet_max_page_header.py .
  custom_cluster/test_permanent_udfs.py ...
  custom_cluster/test_query_expiration.py ...
  custom_cluster/test_redaction.py ....
  custom_cluster/test_s3a_access.py s
  custom_cluster/test_scratch_disk.py ....
  custom_cluster/test_session_expiration.py .
  custom_cluster/test_spilling.py ....
  authorization/test_authorization.py ..
  authorization/test_grant_revoke.py .

Change-Id: Ie301b69718f8690322cc3b4130fb1c715344779c
Reviewed-on: http://gerrit.cloudera.org:8080/3265
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Michael Brown <mikeb@cloudera.com>
2016-06-06 17:34:07 -07:00
Lars Volker
c9df348c38 IMPALA-2686: Add breakpad crash handler to all daemons
This change adds breakpad crash handling support to catalogd, impalad,
and statestored. The destination folder for minidump files can be
configured via the 'minidump_path' command line flag. Leaving it empty
will disable minidump generation. The daemons will rotate minidump
files. The number of files to keep can be configured with the
'max_minidumps' command line flag.
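
The rotation policy can be illustrated in a few lines of Python. This
is a sketch under assumptions, not the daemons' actual C++ logic: the
function name, the ".dmp" suffix filter and the treatment of a
non-positive limit as "keep everything" are all hypothetical.

```python
import os


def rotate_minidumps(minidump_dir, max_minidumps):
    """Delete the oldest *.dmp files so at most `max_minidumps` remain.

    Returns the paths that were removed. Treating a non-positive limit
    as "keep everything" is an assumption of this sketch.
    """
    if max_minidumps <= 0:
        return []
    dumps = [
        os.path.join(minidump_dir, name)
        for name in os.listdir(minidump_dir)
        if name.endswith(".dmp")
    ]
    # Newest first, so everything past the limit is the oldest surplus.
    dumps.sort(key=os.path.getmtime, reverse=True)
    surplus = dumps[max_minidumps:]
    for path in surplus:
        os.remove(path)
    return surplus
```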

Change-Id: I7a37a38488716ffe34296f3490ae291bbb7228d6
Reviewed-on: http://gerrit.cloudera.org:8080/2028
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:52 -07:00
Skye Wanderman-Milne
9c4eb9fc61 IMPALA-2605: prevent long-running child processes from keeping TCP connection open
The problem: By default, all file descriptors opened by a process,
including sockets, are inherited by any forked child processes. This
includes the connection socket created at the beginning of each test
in ImpalaTestSuite.setup_class(). In
TestHiveMetaStoreFailure.test_hms_service_dies(), the Hive Metastore
is stopped and restarted, meaning the metastore is now a child process
of the test process. This causes the client connection not to be
closed when the parent process (the test) exits, meaning that one of a
finite number of connections (64) to Impala is left permanently in
use.

This would be barely noticeable except run-tests.py runs the mini
stress test with 4 * <num CPUs> concurrent clients by default. On our
build machines, this is 64 clients, which is also the default max
number of connections for an impalad. When a test process tries to
make the 65th connection (since the leaked connection is still there),
it blocks until a connection is freed up. Due to a quirk of the xdist
py.test plugin that I don't fully understand, the test framework will
not clean up test classes (and close the connections) until a number
of tests complete, causing the test process to deadlock.

The solution: use the close_fds argument to make sure the TCP socket
is closed in the spawned child process. This is also done in
CustomClusterTestSuite._start_impala_cluster() when it starts the new
cluster.

This patch also switches test_hms_failure.py to use check_call()
instead of call(), and explicitly caps the number of stress clients at
64.
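
The effect of close_fds can be reproduced in isolation. The sketch
below assumes a POSIX system and Python 3, which marks descriptors
non-inheritable by default; it therefore re-enables inheritance on the
socket to mimic the behaviour described above. The helper names are
hypothetical.

```python
import os
import socket
import subprocess
import sys

# Child program: exit 0 if the given descriptor is open, 1 otherwise.
CHILD_SRC = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
    "sys.exit(0)\n"
)


def child_sees_fd(fd, close_fds):
    """Return True if a spawned child process still has `fd` open."""
    cmd = [sys.executable, "-c", CHILD_SRC, str(fd)]
    return subprocess.call(cmd, close_fds=close_fds) == 0


def demo_close_fds():
    """Return (leaked_without_close_fds, closed_with_close_fds)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Make the socket inheritable, as it would have been by default
        # in the scenario from the commit message.
        os.set_inheritable(sock.fileno(), True)
        leaked = child_sees_fd(sock.fileno(), close_fds=False)
        closed = not child_sees_fd(sock.fileno(), close_fds=True)
        return leaked, closed
    finally:
        sock.close()
```

With close_fds=False the child holds its own copy of the socket, so
closing it in the parent does not release the connection until the
child exits as well.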

Change-Id: I03feae922883a0624df1422ffb6ba5f1d83fb869
Reviewed-on: http://gerrit.cloudera.org:8080/1853
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
2016-01-22 22:59:22 +00:00
Tim Armstrong
7e92a5b8c9 Improve error handling when starting test impala cluster
Check return code of start-impala-cluster.py and check that statestored
was found in test_custom_cluster. This avoids various strange scenarios
where the cluster wasn't created correctly.
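
A minimal sketch of the return-code check, assuming the cluster is
started through a subprocess call. The start_cluster() wrapper and its
error message are hypothetical; only subprocess.check_call() and
CalledProcessError are real library names.

```python
import subprocess


def start_cluster(cmd):
    """Run a cluster start command; fail loudly on a non-zero exit."""
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as e:
        raise RuntimeError(
            "Cluster start command failed with exit code %d: %s"
            % (e.returncode, " ".join(cmd)))
```

Failing at this point keeps a half-started cluster from surfacing later
as unrelated test failures.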

Change-Id: Iebaf325d085b85ad156f2bf8a39dddcf6319fb09
Reviewed-on: http://gerrit.cloudera.org:8080/1765
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-01-20 23:42:08 +00:00
Vlad Berindei
b6c20b2a40 Allow Impala to run against local filesystem.
Allow Impala to start only with a running HMS (and no additional services like HDFS,
HBase, Hive, YARN) and use the local file system.

Skip all tests that need these services, use HDFS caching or assume that multiple impalads
are running.

To run Impala with the local filesystem, set TARGET_FILESYSTEM to 'local' and
WAREHOUSE_LOCATION_PREFIX to a location on the local filesystem where the current user has
permissions since this is the location where the test data will be extracted.

Test coverage (with core strategy) in comparison with HDFS and S3:
HDFS             1348 tests passed
S3               1157 tests passed
Local Filesystem 1161 tests passed

Change-Id: Ic9718c7e0307273382b1cc6baf203ff2fb2acd03
Reviewed-on: http://gerrit.cloudera.org:8080/1352
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Readability: Alex Behm <alex.behm@cloudera.com>
2015-12-05 06:48:32 +00:00
Tim Armstrong
1d2afcfec2 IMPALA-2079: Part 1: report non-writable scratch dirs at startup
Previously Impala could erroneously decide to use non-writable scratch
directories, e.g. if /tmp/impala-scratch already exists and is not
writable by the current user.

With this change, if we cannot remove and recreate a fresh scratch directory,
it is not used.  If we have no valid scratch directories, we log an
error and continue startup.
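
The policy above can be sketched as follows. This is a Python
illustration of the described behaviour, not the C++ code in Impala;
the function and logger names are hypothetical.

```python
import logging
import os
import shutil

LOG = logging.getLogger("scratch_dirs")


def init_scratch_dirs(candidates):
    """Return the candidate dirs that could be removed and recreated."""
    usable = []
    for path in candidates:
        try:
            # Start from a fresh, empty directory each time.
            if os.path.lexists(path):
                shutil.rmtree(path)
            os.makedirs(path)
        except OSError as e:
            LOG.warning("Not using scratch directory %s: %s", path, e)
            continue
        usable.append(path)
    if not usable:
        # Startup continues; the missing scratch space is only reported.
        LOG.error("No valid scratch directories; continuing without one")
    return usable
```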

Add unit test for CreateDirectory to test behavior for success and
failure cases.

Add system tests to check logging and query execution in various
scenarios where we do not have scratch available.

Modify FilesystemUtil to use non-exception-throwing Boost functions to
avoid unhandled exceptions escaping into the rest of the Impala
codebase, which does not expect the use of exceptions.

Change-Id: Icaa8429051942424e1d811c54bde10102ac7f7b3
Reviewed-on: http://gerrit.cloudera.org:8080/565
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2015-08-14 00:38:22 +00:00
Casey Ching
074e5b4349 Remove hashbang from non-script python files
Many python files had a hashbang and the executable bit set though
they were not intended to be run as a standalone script. That makes
determining which python files are actually scripts very difficult.
A future patch will update the hashbang in real python scripts so they
use $IMPALA_HOME/bin/impala-python.

Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba
Reviewed-on: http://gerrit.cloudera.org:8080/599
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
2015-08-04 05:26:07 +00:00
casey
99cb338b11 Redaction: Add end-to-end test
Add a few python custom cluster tests to check:

1) The server fails to start if redaction rules are bad, and the error
   message appears in the log.
2) Without redaction rules set, Impala functions as before redaction was
   introduced.
3) With redaction rules set, redacted values appear in the logs and web
   ui instead of the "sensitive" raw values.

Change-Id: I70e6876d6df8e8afbf2c845f6c922c72d564cadb
Reviewed-on: http://gerrit.cloudera.org:8080/172
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-03-06 16:47:37 -08:00
Nong Li
8a661d0787 [CDH5] cherry pick conflicts.
Change-Id: Ic11237b7ead4a810b523d6b6095781efbc5bb66b
2014-09-20 19:41:42 -07:00
ishaan
c4b4e010ff Buffered Tuple Stream fixes.
This patch fixes two issues:
  - Add API to buffered block mgr to allow an atomic Unpin and GetNewBlock. This has
    the semantics of unpinning a block and giving the buffer to the new block. This
    is necessary for the tuple stream to make sure another thread does not grab the
    unpinned block in between.
  - Buffer management when reading an unpinned stream. Before moving onto a new block (and
    unpinning the current), we need to make sure all the tuples returned from the
    current block are returned up the operator tree.

Change-Id: I95ee58d1019dd971f6a7dc19ecafdfa54cdbf942
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4333
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-09-20 16:05:11 -07:00
Lenni Kuff
ffe9e4b74e [CDH5] Add support for GRANT/REVOKE to Impala
This change adds support for GRANT/REVOKE to Impala via the Sentry Service. This includes
support for creating and dropping roles, granting and revoking roles to/from groups,
granting/revoking privileges to/from roles, and commands to view role metadata.

The specific statements that are added in this patch are:
CREATE/DROP ROLE <roleName>
SHOW ROLES
SHOW ROLE GRANT GROUP <groupName>
GRANT/REVOKE ROLE <roleName> TO/FROM GROUP <groupName>
GRANT/REVOKE <privilegeSpec> TO/FROM <roleName>

It does not include some of the fancier bulk-op syntax like support for granting multiple
roles to multiple groups in one statement.

This patch does not add support for the WITH GRANT OPTION to delegate GRANT/REVOKE
privileges to other users.

TODO:
* Authorize these statements on the client side. The current Sentry Service design makes
  it difficult to authorize any GRANT/REVOKE statement on the client (Impala) side.
  Privilege checks are done within the Sentry Service itself. There are a few different
  options available to let Impala "fail fast" and those changes will come in a follow
  on patch.

Change-Id: Ic6bd19f5939d3290255222dcc1a42ce95bd345e2
2014-09-13 21:21:10 -07:00
Matthew Jacobs
9156cb94ca Admission controller functional tests
The test works by submitting a number of queries (parameterized) with
some delay between submissions (parameterized) and the ability to
submit to one impalad or many. The queries are set with the WAIT debug
action so that we have more control over the state that the admission
controller uses to make decisions.  Each query is submitted on a
separate thread. Depending on the test parameters a varying number of
queries will be admitted, queued, and rejected. Once queries are
admitted, the query execution blocks and we can cancel the query in
order to allow another queued query to be admitted. The test tracks
the state of the admission controller using metric counters on each
impalad.
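
The structure of such a test can be sketched with a toy model. All of
the following is hypothetical: ToyAdmissionController is not Impala's
admission controller, and its limits and counter names are invented for
illustration. Admitted queries stay "running" until cancelled, which
plays the role of the WAIT debug action.

```python
import threading
import time
from collections import Counter, deque


class ToyAdmissionController:
    """Runs up to max_running queries and queues up to max_queued."""

    def __init__(self, max_running, max_queued):
        self.max_running = max_running
        self.max_queued = max_queued
        self.lock = threading.Lock()
        self.running = set()
        self.queue = deque()
        self.metrics = Counter()  # stands in for per-impalad counters

    def submit(self, query_id):
        with self.lock:
            if len(self.running) < self.max_running:
                self.running.add(query_id)
                self.metrics["admitted"] += 1
            elif len(self.queue) < self.max_queued:
                self.queue.append(query_id)
                self.metrics["queued"] += 1
            else:
                self.metrics["rejected"] += 1

    def cancel(self, query_id):
        """Cancel a running query, freeing a slot for a queued one."""
        with self.lock:
            self.running.discard(query_id)
            if self.queue and len(self.running) < self.max_running:
                self.running.add(self.queue.popleft())
                self.metrics["dequeued"] += 1


def submit_queries(controller, num_queries, delay_s):
    """Submit each query on its own thread, pausing between them."""
    threads = []
    for query_id in range(num_queries):
        thread = threading.Thread(target=controller.submit,
                                  args=(query_id,))
        thread.start()
        threads.append(thread)
        time.sleep(delay_s)
    for thread in threads:
        thread.join()
```

Because the final counts depend only on the limits and the number of
submissions, a test can assert on the counters regardless of thread
interleaving.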

Change-Id: I455484a7f899032890b22c38592fcea1875f5399
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1413
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
(cherry picked from commit bc2a74d6da622de877422f926ff1892bed867bb1)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1624
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
2014-02-20 14:48:30 -08:00
Henry Robinson
9bc840dc85 Support for custom cluster configurations in some tests
Test suites that derive from common.CustomClusterTestSuite have a brand
new cluster for every test case, which they can configure as they wish
with custom arguments using the @with_args() decorator.
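
One way such a decorator could work is sketched below. This is an
assumption about the mechanism, not the code in this commit: the
cluster_args attribute, FakeCluster, the suite classes and the
"--example_flag" argument are all hypothetical, and no real cluster is
started.

```python
def with_args(*cluster_args):
    """Attach cluster start-up arguments to a test method (sketch)."""
    def decorate(func):
        func.cluster_args = list(cluster_args)
        return func
    return decorate


class FakeCluster:
    """Stand-in for a real cluster; it only records its arguments."""

    def __init__(self, args):
        self.args = args


class CustomClusterSuite:
    def setup_method(self, method):
        # A fresh cluster per test case, configured from the decorator
        # when one is present on the test method.
        self.cluster = FakeCluster(getattr(method, "cluster_args", []))


class ExampleSuite(CustomClusterSuite):
    @with_args("--example_flag=true")
    def test_with_flag(self):
        return self.cluster.args

    def test_defaults(self):
        return self.cluster.args


def run_test(suite_class, name):
    """Mimic a test runner: call setup_method, then the test itself."""
    suite = suite_class()
    method = getattr(suite, name)
    suite.setup_method(method)
    return method()
```

Storing the arguments on the function keeps each test's configuration
next to the test body, and the per-test setup hook can read them
without any registry.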

A future improvement is to optionally only have one cluster per test
suite, to allow multiple tests to run more quickly if they share
configuration options.

Change-Id: I6abd5740e644996d7ca2800edf4ff11b839d1bc4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/882
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:57 -08:00