impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Sahil Takiar	13f50eaec5	IMPALA-9229: impala-shell 'profile' to show original and retried queries Currently, the impala-shell 'profile' command only returns the profile for the most recent query attempt. There is no way to get the original query profile (the profile of the first query attempt that failed) from the impala-shell. This patch modifies TGetRuntimeProfileReq and TGetRuntimeProfileResp to add support for returning both the original and retried profiles for a retried query. When a query is retried, TGetRuntimeProfileResp currently contains the profile for the most recent query attempt. TGetRuntimeProfileReq has a new field called 'include_query_attempts' and when it is set to true, the TGetRuntimeProfileResp will include all failed profiles in a new field called failed_profiles / failed_thrift_profiles. impala-shell has been modified so the 'profile' command has a new set of options. The syntax is now: PROFILE [ALL | LATEST | ORIGINAL] If 'ALL' is specified, both the latest and original profiles are printed. If 'LATEST' is specified, only the latest profile is printed. If 'ORIGINAL' is specified, only the original profile is printed. The default behavior is equivalent to specifying 'LATEST' (which is the current behavior before this patch as well). Support for this has only been added to HS2 given that Beeswax is being deprecated soon. The new 'profile' options have no effect when the Beeswax protocol is used. Most of the code change is in impala-hs2-server and impala-server; a lot of the GetRuntimeProfile code has been re-factored. Testing: * Added new impala-shell tests * Ran core tests Change-Id: I89cee02947b311e7bf9c7274f47dfc7214c1bb65 Reviewed-on: http://gerrit.cloudera.org:8080/16406 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-09-17 20:55:45 +00:00
Sahil Takiar	ea95691b77	IMPALA-9953: Shell should continue fetching even when 0 rows are returned The Impala shell stops fetching rows if it receives a batch that contains 0 rows. This is incorrect because a batch with 0 rows can be returned if the fetch request hits a timeout. Instead, the shell should rely on the value of has_rows / hasMoreRows to determine when to stop issuing fetch requests. Tests: * Added a regression test to test_shell_commandline.py * Ran all shell tests Change-Id: I5f8527aea9e433f8cf426435c0ba41355bbf9d88 Reviewed-on: http://gerrit.cloudera.org:8080/16222 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-07-22 23:28:10 +00:00
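The fetch behaviour described in IMPALA-9953 can be sketched roughly as below. This is a minimal illustration and not the shell's actual code: `FakeClient`, `fetch_batch` and `fetch_all` are hypothetical names, and only the idea of stopping on a has-more-rows flag rather than on an empty batch comes from the commit message.

```python
class FakeClient(object):
    """Hypothetical stand-in for a client; replays canned fetch responses."""

    def __init__(self, responses):
        self._responses = list(responses)

    def fetch_batch(self):
        # Each response is a (rows, has_more_rows) tuple.
        return self._responses.pop(0)


def fetch_all(client):
    """Collect rows until the server reports that no more rows remain."""
    rows = []
    while True:
        batch, has_more_rows = client.fetch_batch()
        # An empty batch (for example after a fetch timeout) adds nothing,
        # but it must not end the loop on its own.
        rows.extend(batch)
        if not has_more_rows:
            break
    return rows


# The second response is empty but still signals that more rows follow.
client = FakeClient([([1, 2], True), ([], True), ([3], False)])
result = fetch_all(client)
```

A loop that stopped on the first empty batch would return only `[1, 2]` here and silently drop the final row.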
Tim Armstrong	6ec6aaae8e	IMPALA-3695: Remove KUDU_IS_SUPPORTED Testing: Ran exhaustive tests. Change-Id: I059d7a42798c38b570f25283663c284f2fcee517 Reviewed-on: http://gerrit.cloudera.org:8080/16085 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-06-18 01:11:18 +00:00
Sahil Takiar	3088ca8580	IMPALA-9818: Add fetch size as option to impala shell Adds the option --fetch_size to the Impala shell. This new option allows users to specify the fetch size used when issuing fetch RPCs to the Impala Coordinator (e.g. TFetchResultsReq and BeeswaxService.fetch). This parameter applies for all client protocols: beeswax, hs2, hs2-http. The default --fetch_size is set to 10240 (10x the default batch size). The new --fetch_size parameter is most effective when result spooling is enabled. When result spooling is disabled, Impala can only return a single row batch per fetch RPC (so 1024 rows by default). When result spooling is enabled, Impala can return up to 100 row batches per fetch request. Removes some logic in the impala_client.py file that attempts to simulate a fetch_size. The code would issue multiple fetch requests to fulfill the given fetch_size. This logic is no longer needed now that result spooling is available. Testing: * Ran core tests * Added new tests in test_shell_client.py and test_shell_commandline.py Change-Id: I8dc7962aada6b38795241d067a99bd94fabca57b Reviewed-on: http://gerrit.cloudera.org:8080/16041 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Sahil Takiar <stakiar@cloudera.com>	2020-06-10 17:46:21 +00:00
Tim Armstrong	c43c03c5ee	IMPALA-3926: part 2: avoid setting LD_LIBRARY_PATH This removes LD_LIBRARY_PATH and LD_PRELOAD from the developer's shell and cleans it up. With the preceding change, toolchain utilities like clang can be run without a special LD_LIBRARY_PATH. This fixes a bug where libjvm.so was registered as a static instead of a shared library, which adds it to the RUNPATH variable in the binary, which provides a default search location that can be overridden by LD_LIBRARY_PATH. Impala binaries don't have the rpath baked in for some libraries, including Impala-lzo, libgcc and libstdc++, so we still need to set LD_LIBRARY_PATH when running those. That is solved with wrapper scripts that set the environment variables only when invoking those binaries, e.g. starting a daemon or running a backend test. I added three scripts because there were 3 sets of environment variables. The scripts are: * run-binary.sh: just sets LD_LIBRARY_PATH * run-jvm-binary.sh: sets LD_LIBRARY_PATH and CLASSPATH * start-daemon.sh: sets LD_LIBRARY_PATH and CLASSPATH and kerberos-related environment variables. The binaries, in almost all cases, work fine without those tweaks, because libstdc++ and libgcc are picked up along with libkuduclient.so from the toolchain (they are in the same directory). I decided to leave good enough alone here. run-binary.sh and friends can be used in any remaining edge cases to run binaries. An alternative to the 3 scripts would be to have an uber-script that set all the variables, but I felt that it was better to be specific about what each binary needed. Cleaning the LD_LIBRARY_PATH mess up has given me a distaste for scattershot setting of environment variables. I am open to revisiting this. Testing: * Ran tests on centos 7 * Manually tested that my dev env with LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu continued to work (for now). All ubuntu 16.04 and 18.04 dev envs that were set up with bootstrap_development.sh will be in this state. 
Change-Id: I61c83e6cca6debb87a12135e58ee501244bc9603 Reviewed-on: http://gerrit.cloudera.org:8080/14494 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-05-07 08:50:44 +00:00
Tim Armstrong	748e41ab41	IMPALA-9380: async query unregistration This change improves query latency by doing much of the heavyweight work of unregistering a query asynchronously, instead of synchronously on the RPC thread. The biggest win is to move the profile serialization off the RPC thread. Unregistration processing is done by a thread pool with 4 threads by default. This is configurable by --unregistration_thread_pool_size and --unregistration_thread_pool_queue_depth. This fixes a pre-existing bug where a query was temporarily neither in the in-flight queries nor the completed queries. It would be much easier to hit this with async unregistration because there is less synchronisation on the client side. Now the query is briefly in both maps, but this is handled as follows: * All places that look up both the maps will check the in-flight map first, and return a reference to the ClientRequestState, i.e. ignoring the entry in the query log. * The /queries page does not return completed queries if they were found in the in-flight queries map, so avoids duplicate results. The thread safety story changes slightly. Before this change, only one thread could remove the query from the map and close it, with only one thread "winning" the race to remove the ClientRequestState from the map. Since we leave the query in the map while being finalized, we instead use an atomic in ClientRequestState to ensure that only one thread does the finalization. Some misc cleanup was done as a result of these changes: * Fix a pre-existing TSAN race in RuntimeProfile that was revealed by the new concurrent unregister test. * Consolidate the various unknown query handle errors into an error code so that we consistently return the same string. * "Unregister query" should include flushing audit events. Testing: * Add a test that unregisters a query concurrent with other operations. * Ran exhaustive tests Perf: Ran TPC-H 30 with mt_dop=4. 
No regressions and some improvements:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 5.38    | -2.67%     | 4.02       | -2.01%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(30) | TPCH-Q1  | parquet / none / none | 5.36   | 5.17        | +3.61%     | 1.82%     | 1.17%          | 5     | +3.73%         | 1.73    | 3.65   |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.77   | 1.74        | +1.48%     | 2.00%     | 2.50%          | 5     | +2.89%         | 0.87    | 1.03   |
| TPCH(30) | TPCH-Q12 | parquet / none / none | 3.02   | 3.00        | +0.79%     | 2.18%     | 2.21%          | 5     | +1.55%         | 0.00    | 0.57   |
| TPCH(30) | TPCH-Q16 | parquet / none / none | 1.65   | 1.64        | +0.81%     | 1.35%     | 0.03%          | 5     | +0.07%         | 1.15    | 1.34   |
| TPCH(30) | TPCH-Q2  | parquet / none / none | 1.21   | 1.21        | -0.07%     | 2.11%     | 2.14%          | 5     | -0.04%         | -0.29   | -0.05  |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.50   | 2.52        | -0.49%     | 2.43%     | 3.34%          | 5     | -0.09%         | -0.29   | -0.27  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 2.86   | 2.90        | -1.28%     | 2.30%     | 1.24%          | 5     | -0.02%         | -0.58   | -1.11  |
| TPCH(30) | TPCH-Q3  | parquet / none / none | 4.35   | 4.40        | -1.15%     | 1.76%     | 1.78%          | 5     | -1.12%         | -0.87   | -1.03  |
| TPCH(30) | TPCH-Q19 | parquet / none / none | 4.10   | 4.17        | -1.80%     | 1.05%     | 1.31%          | 5     | -1.25%         | -1.73   | -2.40  |
| TPCH(30) | TPCH-Q14 | parquet / none / none | 3.20   | 3.25        | -1.52%     | 0.79%     | 2.56%          | 5     | -1.56%         | -0.58   | -1.26  |
| TPCH(30) | TPCH-Q18 | parquet / none / none | 10.81  | 11.07       | -2.34%     | 5.00%     | 7.01%          | 5     | -1.40%         | -0.58   | -0.61  |
| TPCH(30) | TPCH-Q7  | parquet / none / none | 11.19  | 11.56       | -3.18%     | 3.47%     | 6.02%          | 5     | -0.90%         | -0.87   | -1.03  |
| TPCH(30) | TPCH-Q21 | parquet / none / none | 19.91  | 20.32       | -2.02%     | 0.66%     | 0.47%          | 5     | -2.18%         | -2.31   | -5.64  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 5.63   | 5.77        | -2.40%     | 1.71%     | 2.01%          | 5     | -1.84%         | -1.73   | -2.05  |
| TPCH(30) | TPCH-Q5  | parquet / none / none | 3.91   | 4.03        | -2.74%     | 1.08%     | 1.86%          | 5     | -2.45%         | -1.44   | -2.88  |
| TPCH(30) | TPCH-Q8  | parquet / none / none | 4.55   | 4.71        | -3.48%     | 1.90%     | 3.53%          | 5     | -2.35%         | -1.44   | -1.96  |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 1.93   | 2.01        | -3.96%     | 0.05%     | 4.05%          | 5     | -2.59%         | -2.31   | -2.19  |
| TPCH(30) | TPCH-Q10 | parquet / none / none | 4.52   | 4.73        | -4.26%     | 1.26%     | 2.43%          | 5     | -3.40%         | -2.02   | -3.51  |
| TPCH(30) | TPCH-Q11 | parquet / none / none | 1.02   | 1.05        | -3.58%     | 3.94%     | 2.36%          | 5     | -4.56%         | -1.44   | -1.79  |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 9.52   | 10.04       | I -5.24%   | 2.14%     | 0.56%          | 5     | I -4.67%       | -2.31   | -5.57  |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 3.49   | 3.68        | I -5.08%   | 0.07%     | 0.56%          | 5     | I -5.66%       | -2.31   | -20.08 |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 11.92  | 12.71       | I -6.19%   | 0.57%     | 3.15%          | 5     | I -4.99%       | -2.31   | -4.33  |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
Change-Id: I80027b1baeb4ab453938c0f6357b120f4035ba08 
Reviewed-on: http://gerrit.cloudera.org:8080/15821 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-05-05 10:12:42 +00:00
David Knupp	bc9d7e063d	IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. This is the main patch for making the impala-shell cross-compatible with python 2 and python 3. The goal is to wind up with a version of the shell that will pass python e2e tests irrespective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x for the time being. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=/<path to virtualenv>/bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python environment independently from bin/set-pythonpath.sh. The default version of thrift is thrift-0.11.0 (See IMPALA-9489). - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell with no authentication using Python 3.7.5 Opened TCP connection to localhost:21000 ... OR Starting Impala Shell with LDAP-based authentication using Python 2.7.12 Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. 
Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result, this inconsistency was reflected in the way we handled strings in the impala-shell code, but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in many, if not most shell tests failing under python 3. What ultimately seemed like a better approach was to try to weed out as many existing spurious str.encode() and str.decode() calls as I could, and try to implement what has colloquially been called a "unicode sandwich" -- namely, "bytes on the outside, unicode on the inside, encode/decode at the edges." The primary spot in the shell where we call decode() now is when sanitising input... args = self.sanitise_input(args.decode('utf-8')) ...and also whenever a library like re required it. 
Similarly, str.encode() is primarily used where a library like readline or csv requires it. - PYTHONIOENCODING needs to be set to utf-8 to override the default setting for python 2. Without this, piping or redirecting stdout results in unicode errors. - from __future__ import unicode_literals was added throughout Testing: To test the changes, I ran the e2e shell tests the way we always do (against the normal build tarball), and then I set up a python 3 virtual env with the shell installed as a package, and manually ran the tests against that. No effort has been made at this point to come up with a way to integrate testing of the shell in a python3 environment into our automated test processes. Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Reviewed-on: http://gerrit.cloudera.org:8080/15524 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-04-18 05:13:50 +00:00
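The "unicode sandwich" mentioned in the commit above can be illustrated with a small sketch. It assumes Python 3, and the helper names (`decode_input`, `encode_output`, `find_keywords`) are hypothetical and not taken from the shell's source; the point is only that bytes are decoded once on the way in, handled as text internally, and encoded once on the way out.

```python
import re


def decode_input(raw):
    """Decode bytes at the input edge; text passes through unchanged."""
    if isinstance(raw, bytes):
        return raw.decode('utf-8')
    return raw


def encode_output(text):
    """Encode text at the output edge."""
    return text.encode('utf-8')


def find_keywords(raw, pattern):
    """Search with a text pattern, decoding the input first."""
    # Inside the "sandwich" everything is text, so a str pattern is safe
    # and re.findall does not raise the TypeError quoted in the commit.
    text = decode_input(raw)
    return re.findall(pattern, text)


matches = find_keywords(b'foobar foo', 'foo')
round_trip = encode_output(decode_input(b'caf\xc3\xa9'))
```

Passing `b'foobar foo'` straight to `re.findall('foo', ...)` under Python 3 would fail with the error shown in the commit message; decoding at the edge avoids scattering `encode()` and `decode()` calls through the rest of the code.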
David Knupp	c26e3db4bd	IMPALA-9362: Upgrade sqlparse 0.1.19 -> 0.3.1 Upgrades the impala-shell's bundled version of sqlparse to 0.3.1. There were some API changes in 0.2.0+ that required a re-write of the StripLeadingCommentFilter in impala_shell.py. A slight perf optimization was also added to avoid using the filter altogether if no leading comment is readily discernible. As 0.1.19 was the last version of sqlparse to support python 2.6, this patch also breaks Impala's compatibility with python 2.6. No new tests were added, but all existing tests passed without modification. Change-Id: I77a1fd5ae311634a18ee04b8c389d8a3f3a6e001 Reviewed-on: http://gerrit.cloudera.org:8080/15642 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-04-17 05:04:23 +00:00
Tim Armstrong	35d2718d36	IMPALA-9547: retry accept in test_shell_commandline This is a point solution to this particular socket.accept() call failing. The more general problem is described in https://www.python.org/dev/peps/pep-0475/ and fixed in Python 3.5. Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Reviewed-on: http://gerrit.cloudera.org:8080/15541 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-03-24 20:31:19 +00:00
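A point fix of the kind described in IMPALA-9547 might look roughly like the sketch below. The helper `retry_on_eintr` and its `max_attempts` parameter are hypothetical and not taken from the test file; the sketch only shows the general pattern of retrying a call that was interrupted by a signal, which PEP 475 makes unnecessary from Python 3.5 onwards.

```python
import errno


def retry_on_eintr(func, max_attempts=5):
    """Call func(), retrying if it is interrupted by a signal (EINTR)."""
    for attempt in range(max_attempts):
        try:
            return func()
        except (OSError, IOError) as e:
            # Re-raise anything that is not EINTR, and give up after the
            # final attempt instead of looping forever.
            if e.errno != errno.EINTR or attempt == max_attempts - 1:
                raise


# Simulated accept() that is interrupted twice before succeeding.
calls = {'n': 0}


def flaky_accept():
    calls['n'] += 1
    if calls['n'] < 3:
        raise OSError(errno.EINTR, 'Interrupted system call')
    return 'connection'


result = retry_on_eintr(flaky_accept)
```

With a real listening socket the call would be `retry_on_eintr(server_socket.accept)`, where `server_socket` is whatever socket the test is listening on.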
Thomas Tauber-Marshall	3fd6f60b22	IMPALA-9414 (part 2): Support the 'Expect: 100-continue' http header The 'Expect: 100-continue' http header allows http clients to send only the headers for their request, get a confirmation back from the server that the headers are valid, and only then send the body of the request, avoiding the overhead of sending large requests that will ultimately fail. This patch adds support for this in the HS2 HTTP server by having THttpServer look for the header, and if it's present and the request is validated, returning a '100 Continue' response before reading the body of the request. It also adds support for using this header on large requests sent by impala-shell. Testing: - This case is covered by the existing test_large_sql, however that test was previously broken and passing spuriously. This patch fixes the test. - Passed all other shell tests. Change-Id: I4153968551acd58b25c7923c2ebf75ee29a7e76b Reviewed-on: http://gerrit.cloudera.org:8080/15284 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>	2020-03-13 17:00:42 +00:00
Alice Fan	e1d1428181	IMPALA-9384: Improve Impala shell usability by enabling live_progress in interactive mode In order to improve usability, this patch makes Impala shell show query processing status while the query is running. The patch enables shell option live_progress by default when a user launches impala shell in the interactive mode. The patch also adds a new command line flag "--disable_live_progress", which allows a user to disable live_progress at runtime. In the interactive mode, a user can disable live_progress by either using the command line flag or setting the option as False in the config file. In non-interactive mode (when the -q or -f options are used), live reporting is not supported, and impala-shell disables live_progress if it detects this mode. Testing: - Added and updated tests in test_shell_interactive.py and test_shell_commandline.py - Successfully ran all shell related tests Change-Id: I3765b775f663fa227e59728acffe4d5ea9a5e2d3 Reviewed-on: http://gerrit.cloudera.org:8080/15219 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>	2020-03-09 21:28:19 +00:00
wzhou-code	66e6879e8c	IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue on CentOS6/Python 2.6 TestImpalaShell.test_config_file failed in its negative test case, which ran impala-shell with a badly formatted config file containing both a wrong option name and a wrong option value. The test code expected impala-shell to return both warning and error messages, but on CentOS6/Python 2.6 impala-shell only returns the error message. To fix it, split the test into two test cases that run impala-shell with two different config files. Testing: - Passed all test cases in test_shell_commandline.py and test_shell_interactive.py. - Passed all core tests in pre-review-test. - Passed EE tests in impala-private-parameterized with CentOS6. Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd Reviewed-on: http://gerrit.cloudera.org:8080/15139 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-02-04 00:55:54 +00:00
wzhou-code	6a23ec6985	IMPALA-6393: Add support for live_summary and live_progress in impalarc This patch adds support for live_summary and live_progress in impalarc. Testing: 1) Added unit-test cases in test_shell_commandline.py and test_shell_interactive.py for live_summary and live_progress. 2) Successfully ran all other tests in test_shell_interactive.py and test_shell_commandline.py Change-Id: If4549b775a7966ad89d661d0349cc78754e13a86 Reviewed-on: http://gerrit.cloudera.org:8080/14927 Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-01-23 01:48:13 +00:00
Lars Volker	74c7b7e55f	IMPALA-8863: Add support to run tests over HTTP/HS2 This change adds support to run backend tests over HTTP using a new version of Impyla (0.16.1). It also adds a test that exercises authentication over HTTP. Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018 Reviewed-on: http://gerrit.cloudera.org:8080/14059 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-11-26 22:46:40 +00:00
norbert.luksa	2114fc6155	IMPALA-4618: Fixing #Hosts and adding #Instances in exec summary When mt_dop > 0, the summary is reporting the number of fragment instances, instead of the number of hosts as the header would imply. This commit fixes the issue so the number of hosts will be shown under the #Hosts column. The commit also adds an #Inst column where the number of instances is shown (current behaviour). Tests: * Changed profile tests with mt_dop > 0. * Updated benchmark tests and shell tests accordingly. Change-Id: I3bdf9a06d9bd842b2397cd16c28294b6bec7af69 Reviewed-on: http://gerrit.cloudera.org:8080/14715 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-11-26 07:28:23 +00:00
Tim Armstrong	0364e5f8d4	IMPALA-8859: fix test_global_config_file for remote clusters I think the bug is that necessary environment variables were not passed in - the environment was clobbered instead of just having the necessary variable added. Change-Id: I448e5a7dfc0ab6fd53182a593e2fff1a12a10fd7 Reviewed-on: http://gerrit.cloudera.org:8080/14053 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-08-15 21:21:56 +00:00
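The difference between clobbering the environment and adding a variable to it, as described in IMPALA-8859, can be sketched as follows. The function name and the variable names `EXISTING_VAR` and `NEW_VAR` are hypothetical; the sketch only assumes that the child process is launched with `subprocess`.

```python
import os
import subprocess
import sys


def run_with_extra_env(cmd, extra_env):
    """Run cmd with extra_env added to, not replacing, the current environment."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env.update(extra_env)
    return subprocess.check_output(cmd, env=env)


# The child needs both the inherited variable and the newly added one.
os.environ['EXISTING_VAR'] = 'kept'
script = "import os; print(os.environ['EXISTING_VAR'] + ',' + os.environ['NEW_VAR'])"
output = run_with_extra_env([sys.executable, '-c', script], {'NEW_VAR': 'added'})
```

Passing `env={'NEW_VAR': 'added'}` directly would instead hand the child an environment containing only that one variable, so anything inherited (paths, credentials, cluster settings) would be lost.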
Bharath Vissapragada	72c9370856	IMPALA-8717: impala-shell support for HS2 HTTP endpoint Adds impala-shell support to connect to HiveServer2 HTTP endpoint. Relies on toolchain change at https://gerrit.cloudera.org/#/c/13725/. Use --protocol='hs2-http' to enable this behavior. Example usages: --------------- impala-shell --protocol='hs2-http' (No auth) impala-shell --protocol='hs2-http' --ldap -u..... (PLAIN auth) impala-shell --protocol='hs2-http' --ssl --ca_cert... (TLS) impala-shell --protocol='hs2-http' --ldap --ssl --ca_cert... (LDAP + TLS) Limitations: ----------- - Does not support Kerberos (-k) due to lack of SPNEGO support. Testing: -------- - Parameterized existing shell tests to support this combination. - Added shell test coverage for LDAP auth. Change-Id: I8323950857dfe1c1dfd5377fde79f87bc2ce9534 Reviewed-on: http://gerrit.cloudera.org:8080/13746 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>	2019-07-29 05:43:48 +00:00
Tim Armstrong	9ecbe7d3dc	IMPALA-8553,IMPALA-8552: fix checks for remote cluster Apparently IMPALA_REMOTE_URL is not generally used for remote cluster tests: only --testing_remote_cluster is reliably set. Fix the is_remote_cluster() implementation to take into account REMOTE_DATA_LOAD and --testing_remote_cluster in addition to IMPALA_REMOTE_URL. Consistently use is_remote_cluster() in other tests instead of checking the pytest flag directly. There were a few lifecycle headaches with how ImpalaTestClusterProperties is used: * common.environ is imported from conftest, which means that the top-level code in the file runs before pytest command-line arguments have been registered and parsed. * ImpalaTestClusterProperties is used by various code, like build_flavor_timeout(), which runs before pytest command-line arguments have been parsed. * ImpalaTestClusterProperties is called from non-pytest scripts like start-impala-cluster.py, so the command-line arguments are not available. I dealt with the above challenges by making a few changes to do the detection later: * Lazily initializing a singleton ImpalaTestClusterProperties. This was not strictly necessary but makes the whole problem less sensitive to import order and module dependencies. * Adding cluster_properties fixture to make ImpalaTestClusterProperties available in tests without additional boilerplate. * Removing the caching of the local/remote build calculation. ImpalaTestClusterProperties is instantiated outside of python tests, but is_remote_cluster() is only called from python tests, so if we check flags in is_remote_cluster() we'll get the right results reliably. As a workaround to unblock remote tests, also assume catalog_v1 if accessing the web UI fails. Testing: Ran core tests against a regular minicluster. 
Ran tests against a remote cluster Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Reviewed-on: http://gerrit.cloudera.org:8080/13386 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-06-20 20:27:31 +00:00
Tim Armstrong	f1f3ae9ec2	IMPALA-7290: part 2: Add HS2 support to Impala shell HS2 is added as an option via --protocol=hs2. The user-visible differences in behaviour are minimal. Beeswax is still the default and can be explicitly enabled via --protocol=beeswax but will be deprecated. The default is unchanged because changing the default could break certain workflows, e.g. those that explicitly specify the port with -i or deployments that hit --fe_service_threads for HS2 and somehow rely on impala-shell not contributing to that limit. For most workflows the change is transparent and we should change the default in a major version change. This support requires Impala-specific extensions to the HS2 interface, similar to the existing extensions to Beeswax. Thus the HS2 shell is only forwards-compatible with newer Impala versions. I considered trying to gracefully degrade when the new extensions weren't present, but it didn't seem to be worth the ongoing testing effort. Differences between HS2 and Beeswax are abstracted into ImpalaClient subclasses. Here are the changes required to make it work: * Switch to TBinaryProtocolAccelerated to avoid perf regression. The HS2 protocol requires decoding more primitive values (because it's not a string-per-row), which was slow with the pure python implementation of TBinaryProtocol. * Added bitarray module to efficiently unpack null indicators * Minimise invasiveness of changes by transposing and stringifying the columnar results into rows in impala_client.py. The transposition needs to happen before display anyway. * Add PingImpalaHS2Service() to get back version string and webserver address. * Add CloseImpalaOperation() extension to return DML row counts. This possibly addresses IMPALA-1789, although we need to confirm that this is a sufficient solution. * Add is_closed member to query handles to avoid shell independently tracking whether the query handle was closed or not. * Include query status in HS2 log to match beeswax. 
* HS2 GetLog() command now includes query status error message for consistency with beeswax. * "set"/"set all" uses the client requests options, not the session default. This captures the effective value of TIMEZONE, which was previously missing. This also requires test changes where the tests set non-default values, e.g. for ABORT_ON_ERROR. * "set all" on the server side returns REMOVED query options - the shell needs to know these so it can correctly ignore them. * Clean up self.orig_cmd/self.last_leading comment argument passing to avoid implicit parameter passing through multiple function calls. * Clean up argument handling in shell tests to consistently pass around lists of arguments instead of strings that are subject to shell tokenisation rules. * Consistently close connections in the shell to avoid leaking HS2 sessions. This is enforced by making ImpalaShell a context manager and also eliminating all sys.exit() calls that would bypass the explicit connection closing. Testing: * Shell tests can run with both protocols * Add tests for formatting of all types and NULL values * Added testing for floating point output formatting, which does change as a result of switching to server-side vs client-side formatting. * Verified that newly-added tests were actually going through HS2 by disabling hs2 on the minicluster and running tests. * Add checks to test_verify_metrics.py to ensure that no sessions are left open at the end of tests. 
Performance: Baseline from beeswax shell for large extract is as follows: $ time impala-shell.sh -B -q 'select * from tpch_parquet.orders' > /dev/null real 0m6.708s user 0m5.132s sys 0m0.204s After this change it is somewhat slower, but we generally don't consider bulk extract performance through the shell to be perf-critical: real 0m7.625s user 0m6.436s sys 0m0.256s Change-Id: I6d5cc83d545aacc659523f29b1d6feed672e2a12 Reviewed-on: http://gerrit.cloudera.org:8080/12884 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-06-20 10:23:28 +00:00
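The transposition of columnar HS2 results into rows, mentioned in the commit above, can be sketched as below. This is an illustration only: the function names are hypothetical, plain bit arithmetic stands in for the bitarray module the commit says the shell uses, and the least-significant-bit-first layout of the null bitmask is an assumption made for the example.

```python
def is_null(nulls, i):
    """Return True if bit i is set in the null-indicator bitmask.

    Assumes an LSB-first layout: value i maps to bit (i % 8) of byte (i // 8).
    """
    byte_index = i // 8
    if byte_index >= len(nulls):
        return False
    return bool(bytearray(nulls)[byte_index] & (1 << (i % 8)))


def columns_to_rows(columns):
    """Transpose (values, nulls) column pairs into rows of strings."""
    stringified = []
    for values, nulls in columns:
        stringified.append(
            ['NULL' if is_null(nulls, i) else str(v)
             for i, v in enumerate(values)])
    # zip(*...) turns a list of columns into a sequence of rows.
    return [list(row) for row in zip(*stringified)]


columns = [
    ([1, 2, 3], b'\x00'),       # no NULLs
    (['a', '', 'c'], b'\x02'),  # bit 1 set: the second value is NULL
]
rows = columns_to_rows(columns)
```

Doing the stringification per column before transposing keeps the per-value work in one tight loop per column, which matters because, as the commit notes, HS2 requires decoding many more primitive values than a string-per-row protocol.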
Ethan Xue	487547ec44	IMPALA-6042: Allow Impala shell to use a global impalarc config Currently, impalarc files can be specified on a per-user basis (stored in ~/.impalarc), and they aren't created by default. The Impala shell should pick up /etc/impalarc as well, in addition to the user-specific configurations. The intent here is to allow a "global" configuration of the shell by a system administrator. The default path of the global config file can be changed by setting the $IMPALA_SHELL_GLOBAL_CONFIG_FILE environment variable. Note that the options set in the user config file take precedence over those in the global config file. Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 Reviewed-on: http://gerrit.cloudera.org:8080/13313 Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-05-30 03:59:54 +00:00
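The precedence rule described in IMPALA-6042 (user config overrides global config) can be sketched as below. This is a minimal illustration rather than the shell's actual code: `load_shell_options` is a hypothetical helper, and the `[impala]` section name is taken from the config-file format quoted in the IMPALA-8330 entry of this log.

```python
import configparser


def load_shell_options(global_path, user_path):
    """Merge the [impala] section of two config files.

    Hypothetical helper: files are read in order, so values from the
    user file overwrite values from the global file.
    """
    merged = {}
    for path in (global_path, user_path):
        parser = configparser.ConfigParser()
        parser.read(path)  # a missing file is silently skipped
        if parser.has_section("impala"):
            merged.update(parser.items("impala"))
    return merged
```

A caller would typically pass the value of `$IMPALA_SHELL_GLOBAL_CONFIG_FILE` (falling back to `/etc/impalarc`) as `global_path` and `~/.impalarc` as `user_path`.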
Tim Armstrong	b55d905322	IMPALA-8515: port shell tests to use shell build shell/make_shell_tarball.sh builds a tarball with all the shell dependencies bundled. We should test the contents of that tarball in the shell tests instead of using infra/python/env and the libraries bundled there. This tarball is one of the default targets (e.g. run by buildall.sh) so this should not affect any typical development workflows. Note that this means the shell tests now requires the shell tarball to be built locally, which doesn't necessarily happen for remote cluster tests, so we preserve the old behaviour in that case. Testing: Ran core tests on CentOS 6 and CentOS 7. Change-Id: I581363639b279a9c2ff1fd982bdb140260b24baa Reviewed-on: http://gerrit.cloudera.org:8080/13267 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-05-14 01:32:47 +00:00
Tim Armstrong	0a9ea803d2	IMPALA-7290: part 1: clean up shell tests This sets up the tests to be extensible to test shell in both beeswax and HS2 modes. Testing: * Add test dimension containing only beeswax in preparation for HS2 dimension. * Factor out hardcoded ports. * Add tests for formatting of all types and NULL values. * Merge date shell test into general type tests. * Added testing for floating point output formatting, which does change as a result of switching to server-side vs client-side formatting. * Use unique_database for tests that create tables. Change-Id: Ibe5ab7f4817e690b7d3be08d71f8f14364b84412 Reviewed-on: http://gerrit.cloudera.org:8080/13083 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-04-30 11:30:45 +00:00
Fredy Wijaya	7adb411bd9	IMPALA-8330: Impala shell config file should support flag names This patch updates the file format in Impala shell config file to accept both short and long flag names in addition to optparse's dest names (variable names to store flag values) for better user experience because dest names are internal to Impala shell. Format: [impala] flag_name=flag_value Example: [impala] ; This is long flag. query=select 1 ; This is short flag. Q=DEFAULT_FILE_FORMAT=parquet ; Flags can be repeated with , var=msg1=hello,var=msg2=world ; The old format using internal variable name is still supported for ; backward compatibility. keyval=msg3=foo,keyval=msg4=bar Testing: - Ran all E2E shell tests on Python 2.6 and 2.7. Change-Id: Ic43603c1b538af08fddcab1b2c1f6ad1af1a6cb9 Reviewed-on: http://gerrit.cloudera.org:8080/12823 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-03-29 09:38:24 +00:00
Fredy Wijaya	6853184234	IMPALA-8317: Add support for list type flags in Impala shell config file This patch adds support for list type flags in Impala shell config file, i.e. those that use action="append", such as --var and --query_option. To make it less error-prone, this patch also updates the logic for bool flags in the config file to also look at the correct type from the argument parser instead of relying on whether or not the default values are set in impala_shell_config_defaults.py. Testing: - Added a new test for list type flags - Ran all shell E2E tests Change-Id: I824ca15b4e1064a391b13deef9cecd34c928ef73 Reviewed-on: http://gerrit.cloudera.org:8080/12781 Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-03-21 10:29:43 +00:00
Fredy Wijaya	561158b306	IMPALA-3323: Fix unrecognizable shell option when --config_file is specified Impala shell defines a dictionary of default values for some shell options. Before this patch, the logic for --config_file checks if a shell option exists by using the default value dictionary, which does not contain the exhaustive list of shell options. This causes a valid option in the Impala shell config file to be treated as unrecognizable shell option due to the option not having a default value. The patch fixes the issue by changing the logic that checks for the existence of an option using the option list from optparse. The patch also fixes the missing dest parameter for ldap_password_cmd option. Testing: - Updated test_shell_commandline::test_config_file - Ran all shell tests Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Reviewed-on: http://gerrit.cloudera.org:8080/12245 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-01-23 00:15:28 +00:00
Philip Zeyliger	a8d3b765d8	IMPALA-7666: Adding an opaque client identifier to query options. We sometimes struggle to identify the client (e.g., a given version of a JDBC driver, Tableau, Hue, etc.) for a given query. This commit adds a User-Agent header style, called "Client Identifier", which clients can set as a Query Option. Nothing is done with this header, but it's written into logs and query profiles. This commit includes changes to impala-shell to include the version of impala shell with an associated test. A future commit will serialize the name of the py.test being run into this field, which is handy for figuring out where a query came from. Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4 Reviewed-on: http://gerrit.cloudera.org:8080/12130 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-01-10 03:11:35 +00:00
Yongjun Zhang	7cc9092212	IMPALA-5474: Adding a trivial subquery turns error into warning After adding a subquery to a query that fails with ERROR, it fails with WARNING. The fix here makes it return ERROR. Testing: Added unit tests; Done real cluster testing with reported cases. Change-Id: Ibedb11dd3d50bcdb21d508f7d21691925491946e Reviewed-on: http://gerrit.cloudera.org:8080/12022 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>	2019-01-04 21:51:48 +00:00
Fredy Wijaya	9c44853998	IMPALA-6591: Fix test_ssl flaky test test_ssl has a logic that waits for the number of in-flight queries to be 1. However, the logic for wait_for_num_in_flight_queries(1) only waits for the condition to be true for a period of time and does not throw an exception when the time has elapsed and the condition is not met. In other words, the logic in test_ssl that loops while the number of in-flight queries is 1 never gets executed. I was able to simulate this issue by making Impala shell start much longer. Prior to this patch, in the event that Impala shell took much longer to start, the test started sending the commands to Impala shell even when Impala shell was not ready to receive commands. The patch fixes the issue by waiting until Impala shell is connected. The patch also adds assert in other places that calls wait_for_num_in_flight_queries and updates the default behavior for Impala shell to wait until it is connected. Testing: - Ran core and exhaustive tests several times on CentOS 6 without any issue Change-Id: I9805269d8b806aecf5d744c219967649a041d49f Reviewed-on: http://gerrit.cloudera.org:8080/12047 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-12-12 22:44:34 +00:00
David Knupp	4c923b29d8	IMPALA-7783: Skip test_default_timezone when testing a real cluster. test_shell_commandline.py::test_default_timezone assumes that the cluster is running on the same platform as the test process, but that's only guaranteed when testing a local minicluster. When run against a real cluster, the test executor can be a completely different OS. Change-Id: Ia4d4c503d2c77136cedd8f3fd830b6ce70d4457f Reviewed-on: http://gerrit.cloudera.org:8080/11820 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-10-30 19:47:08 +00:00
aphadke	2fb8ebaef2	IMPALA-7555: Set socket timeout in impala-shell impala-shell does not set any socket timeout while connecting to the impala server. This change sets a timeout on the socket before connecting and unsets it back after successfully connecting. The default timeout on this socket is 5 sec. Usage: impala-shell --client_connect_timeout=<value in ms> Testing: 1. Added a test where I create a random listening socket. impala-shell (with ssl enabled) connects to this socket and times out after 2 sec. 2. Created a kerberized impala cluster with ssl enabled and connected to the impalad using an openssl client (block the beeswax server thread to accept new connection) - E.g. - openssl s_client -connect <IP Addr>:21000 Used impala-shell to connect to the same impalad later. impala-shell timed out after the default of 5 sec. I verified it manually. Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813 Reviewed-on: http://gerrit.cloudera.org:8080/11540 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-10-18 01:41:42 +00:00
Fredy Wijaya	31dfa3e28c	IMPALA-7673: Support values from other variables in Impala shell --var Prior to this patch, Impala shell --var could not accept values from other variables unlike the one in Impala interactive shell with the SET command. This patch refactors the logic of variable substitution to use the same logic in both interactive and command line shells. Example: $ impala-shell.sh \ --var="msg1=1" \ --var="msg2=\${var:msg1}2" \ --var="msg3=\${var:msg1}\${var:msg2}" [localhost:21000] default> select ${var:msg3}; Query: select 112 +-----+ \| 112 \| +-----+ \| 112 \| +-----+ Testing: - Added a new shell test - Ran all shell tests Change-Id: Ib5b9fda329c45f2e5682f3cbc76d29ceca2e226a Reviewed-on: http://gerrit.cloudera.org:8080/11623 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-10-16 00:50:26 +00:00
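The `--var` example in IMPALA-7673 can be reproduced with a small sketch of `${var:name}` substitution. The function names here are hypothetical and the logic is an assumption about how such substitution might work (each variable is expanded against the variables defined before it); it is not the shell's implementation.

```python
import re

# Matches references of the form ${var:name}.
VAR_REF = re.compile(r"\$\{var:([^}]+)\}")


def substitute(text, variables):
    """Replace every ${var:name} in text with its value from variables."""
    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("unknown variable: " + name)
        return variables[name]
    return VAR_REF.sub(lookup, text)


def define_vars(assignments):
    """Process name=value strings in order, expanding earlier variables."""
    variables = {}
    for assignment in assignments:
        name, _, value = assignment.partition("=")
        variables[name] = substitute(value, variables)
    return variables
```

With the three assignments from the commit message, `msg2` expands to `12` and `msg3` to `112`, so `select ${var:msg3}` becomes `select 112`.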
Thomas Tauber-Marshall	dccc2de86a	IMPALA-7407: Fix test_cancellation failure on KeyboardInterrupt test_cancellation runs a shell process, executes a query, sleeps, sends a sigint to the process, and then checks that the query is cancelled. If the sigint is sent prior to the shell installing its signal handler, the test can fail with a KeyboardInterrupt. This patch removes the reliance on the sleep being long enough by actually reading the output of the shell and only cancelling the query once the shell shows that it has started running. Testing: - Ran test_cancellation in a loop. Change-Id: I65302ffb838d5185f77853bc2e53296f3a701d93 Reviewed-on: http://gerrit.cloudera.org:8080/11255 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Thomas Marshall <thomasmarshall@cmu.edu>	2018-08-20 19:56:11 +00:00
Fredy Wijaya	c0ff4fe8f6	IMPALA-7428: Fix flaky test_shell_command_line::test_large_sql This patch fixes the flaky test by rewriting the test to perform a large query from a non-existent table instead since the test focuses on Impala shell performance and not Impala performance in general. This patch also reduces the time limit from 30 seconds to 10 seconds and increases the number of columns in the query to 10K. Testing: - Ran all shell tests Change-Id: Ic87891f34872da65aac5ce02caf01da1c050efa5 Reviewed-on: http://gerrit.cloudera.org:8080/11201 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-08-14 20:09:24 +00:00
Csaba Ringhofer	dc32bf7703	IMPALA-7362: Add query option to set timezone This change adds a new query option "timezone" which defines the timezone used for utc<->local conversions. The main goal is to simplify testing, but I think that some users may also find it useful so it is added as a "general" query option. Examples: set timezone=UTC; set timezone="Europe/Budapest" The timezones are validated, but as query options are not sent to the coordinator immediately, the error checking will only happen when running a query. Leading/trailing " and ' characters are stripped because the / character cannot be entered unquoted in some contexts. Currently the timezone has effect in the following cases: -function now() -conversions between unix time and timestamp if flag use_local_tz_for_unix_timestamp_conversions is true -reading parquet timestamps written by Hive if flag convert_legacy_hive_parquet_utc_timestamps is true In the near future Parquet timestamps' isAdjustedToUTC property will be supported, which will decide whether to do utc->local conversion on a per file+column basis. This conversion will be also affected. Testing: - Extended test_local_tz_conversion.py to actually test utc<->local conversion. Until now the effect of flag use_local_tz_for_unix_timestamp_conversions was practically untested. - Added a shell test to check that the default of the query option is the system's timezone. - Added a shell test to check timezone validation. Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272 Reviewed-on: http://gerrit.cloudera.org:8080/11064 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-08-03 17:45:25 +00:00
Fredy Wijaya	1d491d6480	IMPALA-7259: Improve Impala shell performance This patch fixes the slow performance in Impala shell, especially for large queries by replacing all calls to sqlparse.format(sql_string, strip_comments=True) with the custom implementation of strip comments that does not use grouping. The code to strip leading comments was also refactored to not use grouping. * Benchmark running a query with 12K columns * Before the patch: $ time impala-shell.sh -f large.sql --quiet real 2m4.154s user 2m0.536s sys 0m0.088s After the patch: $ time impala-shell.sh -f large.sql --quiet real 0m3.885s user 0m1.516s sys 0m0.048s Testing: - Added a new test to test the Impala shell performance - Ran all shell tests on Python 2.6 and Python 2.7 Change-Id: Idac9f3caed7c44846a8c922dbe5ca3bf3b095b81 Reviewed-on: http://gerrit.cloudera.org:8080/10939 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-07-19 15:32:46 +00:00
poojanilangekar	28162117ad	IMPALA-6223: Gracefully handle malformed 'with' queries in impala-shell The change handles the exception thrown by shlex while parsing a malformed query. This patch was tested by adding both commandline and interactive shell tests. Change-Id: Ibb1e9238ac67b8ad3b2caa1748a18b04f384802d Reviewed-on: http://gerrit.cloudera.org:8080/10876 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-07-07 02:58:40 +00:00
Vincent Tran	1953e6191f	IMPALA-7181: Fix flaky test shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening test_shell_commandline.py::TestImpalaShell::test_socket_opening uses netcat to listen to an ephemeral port to verify the expected socket opening behavior of impala-shell. This port number is fixed to 42000. When this port happens to be used by another outbound socket, this test will fail. This change refactors the test to use socket.bind(). The port used in this test is no longer fixed and will be picked automatically. This change also adds the proper cleanup logic to the various subprocess.Popen objects used in the test. Change-Id: Idd64632ded936d49fc404bcac75588dd7886be44 Reviewed-on: http://gerrit.cloudera.org:8080/10747 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-06-19 20:06:34 +00:00
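The `socket.bind()` technique mentioned in IMPALA-7181 can be illustrated with a short sketch: binding to port 0 asks the operating system for any free port, which avoids collisions with a hard-coded number. `listen_on_free_port` is a hypothetical helper, not the function used in the Impala test.

```python
import socket


def listen_on_free_port(host="127.0.0.1"):
    """Open a listening TCP socket on a port chosen by the OS.

    Returns the socket and the port number it was assigned.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 means "pick any free port"
    sock.listen(1)
    port = sock.getsockname()[1]
    return sock, port
```

The caller keeps the returned socket open for the duration of the test and closes it during cleanup, so the port stays reserved while a client connects to it.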
Vincent Tran	e7d5a25a45	IMPALA-7130: impala-shell -b / --kerberos_host_fqdn flag overrides value passed in via -i / --impalad After additional testing around IMPALA-2782, it was discovered that impala-shell starts the session displaying the expected hostname (as passed -i flag) on the prompt. This gives the impression that the load balancer was bypassed, however the actual TSSLSocket is still created with the hostname passed in via the -b or --kerberos_host_fqdn flag. This change ensures that the hostname used to create the TSSLSocket will always be the one passed in via the -i flag on impala-shell. This change is required by IMPALA-2782. Testing: Using netcat, we verified that the impala daemon host[:port] value passed into the -i/--impalad option is indeed the one impala-shell tries to connect to in both cases (with and without -b) Change-Id: Ibee05bd0dbe8c6ae108b890f0ae0f6900149773a Reviewed-on: http://gerrit.cloudera.org:8080/10580 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-06-17 06:45:38 +00:00
Thomas Tauber-Marshall	af4909fdfc	IMPALA-7089: reenable test_kudu_dml_reporting This test was xfailed because it was broken. The change that broke it (IMPALA-2751 - `bdad189469`) was reverted (`84b55c6148`) so we can reenable the test. Change-Id: Ib5716f30458eb6db08f735bcbd2f79d205334930 Reviewed-on: http://gerrit.cloudera.org:8080/10577 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-06-06 07:12:19 +00:00
Thomas Tauber-Marshall	e660149670	IMPALA-7089: xfail test_kudu_dml_reporting test_kudu_dml_reporting has been causing a large number of build failures. Temporarily disable it while we figure out what's going on. Also improve output of test_kudu_dml_reporting on failure. Change-Id: I222e4c86a50f2450201fbad8b937e8fcf4fac31d Reviewed-on: http://gerrit.cloudera.org:8080/10527 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-05-29 21:49:35 +00:00
Tim Armstrong	a7c6fa7101	IMPALA-7067: deflake test_cancellation Tweak the query so that it still runs for a long time but can cancel the fragment quicker instead of being stuck in a long sleep() call. Change-Id: I0c90d4f5c277f7b0d5561637944b454f7a44c76e Reviewed-on: http://gerrit.cloudera.org:8080/10499 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Tim Armstrong <tarmstrong@cloudera.com>	2018-05-24 19:27:51 +00:00
Thomas Tauber-Marshall	20c161d758	IMPALA-6740: Fix flaky test_cancellation test_shell_commandline:test_cancellation starts an Impala shell process, runs a query, sleeps briefly, and then cancels the query by sending a SIGINT to the process. This has been occasionally failing with either the error 'KeyboardInterrupt' or with the query succeeding instead of being cancelled. The problem occurs if the process hasn't fully started up before the SIGINT is sent - in particular, if ImpalaShell:__init__ hasn't installed the signal handler, which happens sometimes depending on concurrent load on the machine. Depending on the exact timing, this may cause a 'KeyboardInterrupt' that isn't handled, or the signal may be ignored and the query allowed to run to completion. The solution is to increase the time spent sleeping. Testing: - I can reliably repro the problem locally by reducing the sleep time. Change-Id: I5d13de6207807e4ba2e2e406a29d670f01d6c3a0 Reviewed-on: http://gerrit.cloudera.org:8080/10177 Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-04-25 01:25:33 +00:00
Tianyi Wang	8e86678d65	IMPALA-5690: Part 1: Rename ostream operators for thrift types Thrift 0.9.3 implements "ostream& operator<<(ostream&, T)" for thrift data types while impala did the same to enums and special types including TNetworkAddress and TUniqueId. To prepare for the upgrade of thrift 0.9.3, this patch renames these impala defined functions. In the absence of operator<<, assertion macros like DCHECK_EQ can no longer be used on non-enum thrift defined types. Change-Id: I9c303997411237e988ef960157f781776f6fcb60 Reviewed-on: http://gerrit.cloudera.org:8080/9168 Reviewed-by: Tianyi Wang <twang@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-04-20 10:28:12 +00:00
Tim Armstrong	318051cc21	IMPALA-2717: fix output of formatted unicode to non-TTY The bug is that PrettyOutputFormatter.format() returned a unicode object, and Python cannot automatically write unicode objects to output streams where there is no default encoding. The fix is to convert to UTF-8 encoded in a regular string, which can be output to any output device. This makes the output type consistent with DelimitedOutputFormatter.format(). Based on code by Marcell Szabo. Testing: Added a basic test. Played around in an interactive shell to make sure that unicode characters still work in interactive mode. Change-Id: I9de641ecf767a2feef3b9f48b344ef2d55e17a7f Reviewed-on: http://gerrit.cloudera.org:8080/9928 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-04-12 20:34:47 +00:00
Fredy wijaya	2186c692ee	IMPALA-6615: CTE query in Impala shell does not show query web links This patch is to show query submitted and query progress web links in Impala shell for CTE queries. Testing: - Ran end-to-end shell tests Change-Id: Ie3352406e3b048be395a20405c8e6b911e663164 Reviewed-on: http://gerrit.cloudera.org:8080/9537 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Impala Public Jenkins	2018-03-08 11:37:09 +00:00
Tim Armstrong	acfd169c8e	IMPALA-4319: remove some deprecated query options Adds a concept of a "removed" query option that has no effect but does not return an error when a user attempts to set it. These options are not returned by "set" or "set all" commands that are executed in impala-shell or server-side. These query options have been deprecated for several releases: DEFAULT_ORDER_BY_LIMIT, ABORT_ON_DEFAULT_LIMIT_EXCEEDED, V_CPU_CORES, RESERVATION_REQUEST_TIMEOUT, RM_INITIAL_MEM, SCAN_NODE_CODEGEN_THRESHOLD, MAX_IO_BUFFERS RM_INITIAL_MEM did still have an effect, but it was undocumented and MEM_LIMIT should be used in preference. DISABLE_CACHED_READS also had an effect but it was documented as deprecated. Otherwise the options had no effect at all. Testing: Ran exhaustive build. Updated query option tests to reflect the new behaviour. Cherry-picks: not for 2.x. Change-Id: I9e742e9b0eca0e5c81fd71db3122fef31522fcad Reviewed-on: http://gerrit.cloudera.org:8080/9118 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2018-02-01 08:26:26 +00:00
Tim Armstrong	56464e4616	IMPALA-3998: Remove refresh_after_connect option from shell This removes the deprecated option in time for 3.0. Testing: Ran core tests. Manually ran the shell with the argument to confirm that it reported "no such option". Cherry-picks: not for 2.x. Change-Id: I8f430bad0578e150d5e80066b9e7572041af4a15 Reviewed-on: http://gerrit.cloudera.org:8080/9072 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2018-01-23 01:48:50 +00:00
Gabor Kaszab	281f7ab010	IMPALA-6318: Revert "Adjustment for hanging query cancellation test" Jenkins jobs occasionally hang on test_query_cancellation_during_fetch. There was a workaround proposal submitted under this Jira ID, however, apparently jobs still hang on this test randomly. Reverting the workaround and skipping the test until further fix proposal provided. This reverts commit `7810d1f9a2`. Change-Id: I51acee49b5a17c4852410b7568fd1d092b114a6d Reviewed-on: http://gerrit.cloudera.org:8080/8972 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2018-01-23 01:24:29 +00:00
Gabor Kaszab	7810d1f9a2	IMPALA-6318: Adjustment for hanging query cancellation test Apparently test_query_cancellation_during_fetch hangs occasionally in Jenkins builds. The Impala debug page shows the query being cancelled, however, on the host the ImpalaShell process related to that query is still running. Since I had no luck in reproducing the issue locally I only have a theory what might be going on here: The query is cancelled successfully on Impala backend and when the test tries to get the stdout and stderr from the ImpalaShell it gets stuck. It might be the case that ImpalaShell process fetching the query results holds the stdout. According to the documentation of subprocess.communicate() it may cause issues to fetch data when the data size is large or unlimited, that we can consider to be the case here. As a workaround there is a new optional parameter to util.ImpalaShell to omit the stdout because this test wouldn't use it anyway and we get rid of fetching the large result from ImpalaShell. Change-Id: I082c83b91b6d0c527de92c7992f0dc9d1b290433 Reviewed-on: http://gerrit.cloudera.org:8080/8852 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2018-01-03 20:32:24 +00:00
Gabor Kaszab	75121819be	IMPALA-6265 Query cancellation test enhancements In the query cancellation tests it is essential to wait until the query gets to a desired state (waiting_to_finish, fetching) and then cancel it. Apparently, ASAN query execution happens slower than on a Release build. As a result a hard coded timeout threshold is not sufficient to cover all the builds, or should be set to a wastefully high value. As a solution the query state is checked on the Impala debug page in intervals until it reaches the desired state or the maximum retry attempt value is reached. Change-Id: Ie0bff485a872df7be8efd784314a6ca91aaadd11 Reviewed-on: http://gerrit.cloudera.org:8080/8713 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2017-12-05 21:40:11 +00:00
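The polling strategy described in IMPALA-6265 (check the query state at intervals until it reaches the desired state or a maximum number of attempts is exhausted) can be sketched as below. `wait_for_state` and its parameters are hypothetical; in the real tests the state comes from the Impala debug page, which is represented here by a caller-supplied `get_state` function.

```python
import time


def wait_for_state(get_state, desired, max_attempts=30, interval_s=0.5):
    """Poll get_state() until it returns desired or attempts run out.

    Returns True if the desired state was observed, False otherwise.
    """
    for _ in range(max_attempts):
        if get_state() == desired:
            return True
        time.sleep(interval_s)
    return False
```

Since the helper returns `False` on timeout instead of raising, a test should assert on the result; the IMPALA-6591 entry in this log describes a flaky test caused by a wait helper whose timeout went unchecked.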


160 Commits