impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Riza Suminto	daaf73a7c2	IMPALA-13682: Implement missing capabilities in ImpylaHS2Connection This patch implements 'wait_for_finished_timeout', 'wait_for_admission_control', and 'get_admission_result' for ImpylaHS2Client. This patch also changes the behavior of ImpylaHS2Connection to produce several extra cursors aside from self.__cursor for 'execute' call that supplies user argument and each 'execute_async' to make issuing multiple concurrent queries possible. Note that each HS2 cursor opens its own HS2 Session. Therefore, this patch breaks the assumption that an ImpylaHS2Connection is always under a single HS2 Session (see HIVE-11402 and HIVE-14247 on why concurrent query with shared HS2 Session is problematic). However, they do share the same query options stored at self.__query_options. In practice, most Impala tests do not care about running concurrent queries under a single HS2 session but only require them to use the same query options. The following additions are new for both BeeswaxConnection and ImpylaHS2Connection: - Add method 'log_client' for convenience. - Implement consistent query state mapping and checking through several accessor methods. - Add methods 'wait_for_impala_state' and 'wait_for_any_impala_state' that use 'get_impala_exec_state' method internally. - Add 'fetch_profile_after_close' parameter to 'close_query' method. If True, 'close_query' will return the query profile after closing the query. - Add 'discard_results' parameter for 'fetch' method. This can save time parsing results if the test does not care about the result. Reuse existing op_handle_to_query_id and add new session_handle_to_session_id to parse HS2 TOperationHandle.operationId.guid and TSessionHandle.sessionId.guid respectively. To show that ImpylaHS2Connection is on par with BeeswaxConnection, this patch refactors test_admission_controller.py to test using HS2 protocol by default. The test that does raw HS2 RPC (and requires capabilities from HS2TestSuite) is separated out into a new TestAdmissionControllerRawHS2 class and continues to use the beeswax protocol by default. All calls to copy.copy are replaced with copy.deepcopy for safety. Testing: - Pass exhaustive tests. Change-Id: I9ac07732424c16338e060c9392100b54337f11b8 Reviewed-on: http://gerrit.cloudera.org:8080/22362 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-03-04 06:58:23 +00:00
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
David Knupp	f1c8176e65	IMPALA-3343: Part 2 - Add thrift_sasl library to shell/ext_py/ We've relied on a copied version of thrift_sasl.py, which needs to be updated to be compatible with python 3, so taking this opportunity to add the thrift_sasl 0.4.1 package to ext-py like the other external python libs we use. Change-Id: I7e66c728883ceb5b3e96bc5fd120d44ab81bbb75 Reviewed-on: http://gerrit.cloudera.org:8080/15513 Reviewed-by: David Knupp <dknupp@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-03-21 12:25:55 +00:00
Henry Robinson	e4a0e2f391	IMPALA-5775: Allow shell to support TLSv1, v1.1 and v1.2 The shell uses Thrift's TSSLSocket to negotiate secure connections to Impala. This socket uses a variable SSL_VERSION to determine which SSL and TLS protocol versions it will connect to. SSL_VERSION was hardcoded to be PROTOCOL_TLSv1, which only supports TLSv1 servers and no other protocol version. Change the allowed version to be PROTOCOL_SSLv23, which supports any TLS or SSL protocol. We rely on the server not to allow SSLv2 or v3 connections. Testing: Added a new custom cluster test to confirm that the shell can connect to a TLSv1.2 cluster. Confirmed that the test is correctly skipped on machines with an old version of OpenSSL that does not support TLSv1.2. Change-Id: I5487f82d110676b9c3c7a5305931da00c7f68ca0 Reviewed-on: http://gerrit.cloudera.org:8080/7675 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2017-08-16 08:10:02 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;' cat modified_files.txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Casey Ching	f288867833	Stress test: Various changes The major changes are: 1) Collect backtrace and fatal log on crash. 2) Poll memory usage. The data is only displayed at this time. 3) Support kerberos. 4) Add random queries. 5) Generate random and TPC-H nested data on a remote cluster. The random data generator was converted to use MR for scaling. 6) Add a cluster abstraction to run data loading for #5 on a remote or local cluster. This also moves and consolidates some Cloudera Manager utilities that were in the stress test. 7) Cleanup the wrappers around impyla. That stuff was getting messy. Change-Id: I4e4b72dbee1c867626a0b22291dd6462819e35d7 Reviewed-on: http://gerrit.cloudera.org:8080/1298 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2016-01-20 23:00:25 +00:00
Casey Ching	074e5b4349	Remove hashbang from non-script python files Many python files had a hashbang and the executable bit set though they were not intended to be run as standalone scripts. That makes determining which python files are actually scripts very difficult. A future patch will update the hashbang in real python scripts so they use $IMPALA_HOME/bin/impala-python. Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba Reviewed-on: http://gerrit.cloudera.org:8080/599 Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: Internal Jenkins	2015-08-04 05:26:07 +00:00
Mike Yoder	75a97d3d7e	[CDH5] Kerberize mini-cluster and Impala daemons This is the first iteration of a kerberized development environment. All the daemons start and use kerberos, with the sole exception of the hive metastore. This is sufficient to test impala authentication. When buildall.sh is run using '-kerberize', it will stop before loading data or attempting to run tests. Loading data into the cluster is known to not work at this time, the root causes being that Beeline -> HiveServer2 -> MapReduce throws errors, and Beeline -> HiveServer2 -> HBase has problems. These are left for later work. However, the impala daemons will happily authenticate using kerberos both from clients (like the impala shell) and amongst each other. This means that if you can get data into the mini-cluster, you could query it. Usage: * Supply a '-kerberize' option to buildall.sh, or * Supply a '-kerberize' option to create-test-configuration.sh, then 'run-all.sh -format', re-source impala-config.sh, and then start impala daemons as usual. You must reformat the cluster because kerberizing it will change all the ownership of all files in HDFS. Notable changes: * Added clean start/stop script for the llama-minikdc * Creation of Kerberized HDFS - namenode and datanodes * Kerberized HBase (and Zookeeper) * Kerberized Hive (minus the MetaStore) * Kerberized Impala * Loading of data very nearly working Still to go: * Kerberize the MetaStore * Get data loading working * Run all tests * The unknown unknowns * Extensive testing Change-Id: Iee3f56f6cc28303821fc6a3bf3ca7f5933632160 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4019 Reviewed-by: Michael Yoder <myoder@cloudera.com> Tested-by: jenkins	2014-09-05 12:36:21 -07:00
Henry Robinson	d264ab90fe	Add support for client SSL to Python Beeswax client Change-Id: I0d9352471067bfe19e25221e0ecbbb08f945b962 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2810 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins (cherry picked from commit 545bd30d5cf3cae9a3581d7bc942a909a1a98806) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2850 Tested-by: Henry Robinson <henry@cloudera.com>	2014-06-05 10:48:23 -07:00
Henry Robinson	93a3d65492	Support for LDAP tests * Allow Beeswax connections to optionally use LDAP * Run custom cluster tests from the aux repo, if it exists Change-Id: I054af64e030ad0cd722ae8dd75afda9c58ea2913 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2547 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2640	2014-05-21 05:52:55 -07:00
Alex Behm	9cabee4a71	Wait for the Metastore to come up before starting HiveServer2. Change-Id: Ic8e29efe63f6745e1ff44248657cbd7882bb16d9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1626 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1670 Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-02-25 21:05:33 -08:00
Lenni Kuff	3ee82e7543	Add support for running Impala query tests against secure cluster Adds support for running all the Impala query tests against a secure cluster. This run mode can be selected by adding a --use_kerberos flag to run-tests.py and pointing to the correct (secure) Hive Metastore Service.	2014-01-08 10:48:21 -08:00
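
The IMPALA-11973 entry above describes using `__future__` imports so that Python 2 follows Python 3 rules for imports and division. A minimal sketch of the division difference follows; the helper names `midpoint_index` and `average_row_size` are hypothetical and are not taken from the Impala tree.

```python
# With these imports, Python 2 follows Python 3 semantics; on Python 3
# they are accepted and have no additional effect.
from __future__ import absolute_import, division, print_function


def midpoint_index(num_records):
    """Return an index into a list of records (hypothetical helper).

    Indices and counts must stay integers, so use floor division '//'.
    """
    return num_records // 2


def average_row_size(total_bytes, num_rows):
    """Return a fractional average (hypothetical helper).

    '/' is true division here, so the result is not rounded down.
    """
    return total_bytes / num_rows


print(midpoint_index(7))
print(average_row_size(7, 2))
```

Without `from __future__ import division`, Python 2 would evaluate `7 / 2` as integer division, which is the class of silent behavior change the commit message says it set out to flush out.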

12 Commits
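
The IMPALA-13682 entry replaces `copy.copy` with `copy.deepcopy` "for safety" without spelling out the failure mode. One plausible illustration, assuming a nested dictionary of query options (the names `base_vector` and `exec_option` are hypothetical stand-ins, not the actual test structures):

```python
import copy

# Hypothetical nested configuration, loosely resembling a test vector
# that carries a dictionary of query options.
base_vector = {"protocol": "hs2", "exec_option": {"mem_limit": "1g"}}

shallow = copy.copy(base_vector)    # copies only the top-level dict
deep = copy.deepcopy(base_vector)   # recursively copies nested objects

# The shallow copy shares the nested dict, so this write leaks into
# base_vector as well.
shallow["exec_option"]["mem_limit"] = "2g"

# The deep copy owns its nested dict, so base_vector is unaffected.
deep["exec_option"]["mem_limit"] = "4g"

print(base_vector["exec_option"]["mem_limit"])
print(deep["exec_option"]["mem_limit"])
```

With a shallow copy, a test that tweaks nested options could unintentionally alter the options seen by another test sharing the same base object; a deep copy avoids that at the cost of copying more data.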