For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
website, or add that text where it is missing.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
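For readers who do not use perl, the copyright-stripping step above can be
sketched in Python. This is only an illustration of the same line filter;
the function name strip_cloudera_copyright is hypothetical and is not part
of fix_apache_license.py.

```python
import re

# Same pattern as the perl one-liner: any line containing
# "Copyright ... Cloudera", matched case-insensitively.
COPYRIGHT_RE = re.compile(r"Copyright.*Cloudera", re.IGNORECASE)


def strip_cloudera_copyright(text):
    """Return text with every matching copyright line removed."""
    kept = [line for line in text.splitlines(True)
            if not COPYRIGHT_RE.search(line)]
    return "".join(kept)
```

Applying this to each file listed in modified_files.txt would mirror the
"print unless" filter; license text still has to be fixed separately.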
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.
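As a small illustration of the practice (the helper log_path is made up
for this example and does not come from the test scripts), an explicit
import names exactly what the module uses:

```python
# Instead of: from os.path import *
from os.path import basename, join


def log_path(log_dir, test_file):
    """Build a log file path for a test file using only the named imports."""
    return join(log_dir, basename(test_file) + ".log")
```

With the explicit form, a reader can see where basename and join come
from, and an unused name is easy to spot and remove.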
Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
Previously, we assumed that the S3AFileSystem object was created only
when we opened a new connection to S3 via the S3A connector. However,
the object can already exist before we create that connection, in
which case the configuration values we pass through to the connection
are ignored.
This is fixed by forcing the FS builder to return a new instance of
the filesystem object, which will read the configuration changes we
make.
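The effect can be pictured with a toy model in Python. This is not the
Hadoop or libhdfs API: ToyFileSystem, get_filesystem and the
force_new_instance parameter are all hypothetical names used only to show
why a cached object ignores later configuration.

```python
class ToyFileSystem(object):
    """Stand-in for a filesystem object that reads its config once."""

    def __init__(self, conf):
        self.conf = dict(conf)


_CACHE = {}


def get_filesystem(uri, conf, force_new_instance=False):
    """Return a filesystem for uri, from the cache unless told otherwise."""
    if force_new_instance:
        # A fresh object always sees the configuration passed in now.
        return ToyFileSystem(conf)
    if uri not in _CACHE:
        _CACHE[uri] = ToyFileSystem(conf)
    # A cached object keeps whatever configuration it was built with.
    return _CACHE[uri]
```

In this model, a second call with new settings returns the old object
unless force_new_instance is set, which is the situation described above.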
The test is also split so that the DDL statement runs in a separate
test: we expect the DDL statement to succeed because it goes through
the HMS, and the SELECT statement to fail.
Change-Id: I23b541eef747dd62e59390f8cc9ac6e5742ead40
Reviewed-on: http://gerrit.cloudera.org:8080/3392
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Sailesh Mukil <sailesh@cloudera.com>
All versions of pytest contain various bugs regarding test marking
(including skips) when tests are both:
1. class-level marked
2. inherited
More info is available in IMPALA-3614 and IMPALA-2943, but the gist is
that it's possible for some tests to be skipped when they shouldn't be.
This is happening pretty badly with the custom cluster tests, because
CustomClusterTestSuite has a class level skipif mark.
The easiest workaround for now is to remove the pytest skipif mark from
CustomClusterTestSuite and instead skip by calling pytest.skip()
explicitly in the setup_class() method. Some CustomClusterTestSuite
children implemented their own setup_* methods, and I adjusted them
both to clean them up and to call the parent method properly via
super().
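A minimal sketch of the pattern follows. The class names, the should_skip
helper and the HYPOTHETICAL_SKIP_CUSTOM_CLUSTER variable are placeholders;
the real skip condition in CustomClusterTestSuite is not shown here.

```python
import os

import pytest


class BaseSuite(object):
    """Plays the role of the parent suite; no class-level skipif mark."""

    @classmethod
    def should_skip(cls):
        # Placeholder condition; the real check is an assumption.
        return os.environ.get("HYPOTHETICAL_SKIP_CUSTOM_CLUSTER") == "1"

    @classmethod
    def setup_class(cls):
        if cls.should_skip():
            # Explicit skip, evaluated for each class that actually runs.
            pytest.skip("custom cluster tests disabled in this environment")
        cls.cluster_ready = True


class ChildSuite(BaseSuite):
    @classmethod
    def setup_class(cls):
        # Call the parent first so its skip check and setup still apply.
        super(ChildSuite, cls).setup_class()
        cls.extra_ready = True

    def test_something(self):
        assert self.cluster_ready and self.extra_ready
```

Because the skip decision is made inside setup_class() rather than through
an inherited mark, children that call super() get the same behavior
without depending on how pytest propagates class-level marks.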
Testing:
I ran the following combinations of all the custom cluster tests:
DEBUG / HDFS / core
RELEASE / HDFS / exhaustive
DEBUG / LOCAL / core
DEBUG / S3 / core
Before, we'd get situations in which most of the tests were skipped.
Consider the RELEASE/HDFS/exhaustive situation:
custom_cluster/test_admission_controller.py .....
custom_cluster/test_alloc_fail.py ss
custom_cluster/test_breakpad.py sssss
custom_cluster/test_delegation.py sss
custom_cluster/test_exchange_delays.py ss
custom_cluster/test_hdfs_fd_caching.py s
custom_cluster/test_hive_parquet_timestamp_conversion.py ss
custom_cluster/test_insert_behaviour.py ss
custom_cluster/test_legacy_joins_aggs.py s
custom_cluster/test_parquet_max_page_header.py s
custom_cluster/test_permanent_udfs.py sss
custom_cluster/test_query_expiration.py sss
custom_cluster/test_redaction.py ssss
custom_cluster/test_s3a_access.py s
custom_cluster/test_scratch_disk.py ssss
custom_cluster/test_session_expiration.py s
custom_cluster/test_spilling.py ssss
authorization/test_authorization.py ss
authorization/test_grant_revoke.py s
Now, more tests run appropriately:
custom_cluster/test_admission_controller.py .....
custom_cluster/test_alloc_fail.py ss
custom_cluster/test_breakpad.py sssss
custom_cluster/test_delegation.py ...
custom_cluster/test_exchange_delays.py ss
custom_cluster/test_hdfs_fd_caching.py .
custom_cluster/test_hive_parquet_timestamp_conversion.py ..
custom_cluster/test_insert_behaviour.py ..
custom_cluster/test_kudu_not_available.py .
custom_cluster/test_legacy_joins_aggs.py .
custom_cluster/test_parquet_max_page_header.py .
custom_cluster/test_permanent_udfs.py ...
custom_cluster/test_query_expiration.py ...
custom_cluster/test_redaction.py ....
custom_cluster/test_s3a_access.py s
custom_cluster/test_scratch_disk.py ....
custom_cluster/test_session_expiration.py .
custom_cluster/test_spilling.py ....
authorization/test_authorization.py ..
authorization/test_grant_revoke.py .
Change-Id: Ie301b69718f8690322cc3b4130fb1c715344779c
Reviewed-on: http://gerrit.cloudera.org:8080/3265
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Michael Brown <mikeb@cloudera.com>
This patch adds two flags, 's3a_access_key_cmd' and
's3a_secret_key_cmd', each of which should be a Unix command that
retrieves the S3 access key or the S3 secret key respectively.
These keys are used when creating a connection to S3 so that Impala
is given access.
This patch gives users the safer option of not needing to put the
access and secret keys in the core-site.xml file (which is potentially
insecure).
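The idea of a key-producing command can be sketched in Python as follows.
This is not how Impala runs the flags internally; read_key_from_command is
a hypothetical helper, and trimming the output is an assumption.

```python
import shlex
import subprocess


def read_key_from_command(cmd):
    """Run cmd without a shell and return its stdout, stripped, as the key."""
    args = shlex.split(cmd)
    # check_output raises CalledProcessError if the command exits non-zero.
    output = subprocess.check_output(args)
    return output.decode("utf-8").strip()
```

A command such as one that reads the key from a file with restricted
permissions keeps the secret out of core-site.xml; the command itself
is whatever the operator chooses.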
Change-Id: I2ba103bcb399861683066fd00219d30c180db043
Reviewed-on: http://gerrit.cloudera.org:8080/2850
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Sailesh Mukil <sailesh@cloudera.com>