This puts all of the thrift-generated python code into the
impala_thrift_gen package. This is similar to what Impyla
does for its thrift-generated python code, except that this change
uses the impala_thrift_gen package rather than Impyla's impala._thrift_gen.
This is a preparatory patch for fixing the absolute import
issues.
This patches all of the thrift files to add the python namespace,
and includes code to apply the same patching to the thirdparty
thrift files (hive_metastore.thrift, fb303.thrift).
Putting all the generated python into a package makes it easier
to understand where the imports are getting code. When the
subsequent change rearranges the shell code, the thrift-generated
code can stay in a separate directory.
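For illustration, the import shape changes roughly as follows (the
module name here is an example, not an exhaustive list):
  # Before: generated modules imported from the top level
  from TCLIService import TCLIService
  # After: everything lives under one package, so the origin is obvious
  from impala_thrift_gen.TCLIService import TCLIService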
This uses isort to sort the imports for the affected Python files
with the provided .isort.cfg file. This also adds an impala-isort
shell script to make it easy to run.
Testing:
- Ran a core job
Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0
Reviewed-on: http://gerrit.cloudera.org:8080/20169
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
IMPALA-9495 added support for struct types in SELECT lists but only with
codegen turned off. This commit implements codegen for struct types.
To facilitate this, code generation for reading and writing 'AnyVal's
has been refactored. A new class, 'CodegenAnyValReadWriteInfo' is
introduced. This class is an interface between sources and destinations,
one of which is an 'AnyVal' object: sources generate an instance of this
class and destinations take that instance and use it to write the value.
The other side can, for example, be tuples from which we read (in the
case of 'SlotRef') or tuples we write into (in the case of materialisation; see
Tuple::CodegenMaterializeExprs()). The main advantage is that sources do
not have to know how to write their destinations, only how to read the
values (and vice versa).
Before this change, many tests that involve structs ran only with
codegen turned off. Now that codegen is supported in these cases, these
tests are also run with codegen on.
Testing:
- enabled tests for structs in the select list with codegen on in
tests/query_test/test_nested_types.py
- enabled codegen in other tests where it used to be disabled because
it was not supported.
Change-Id: I5272c3f095fd9f07877104ee03c8e43d0c4ec0b6
Reviewed-on: http://gerrit.cloudera.org:8080/18526
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Python 3 changes list-returning builtins such as range, map, and
filter to be lazy. Code that expects a list to be produced
immediately will fail, e.g.:
Python 2:
  range(0,5) == [0,1,2,3,4]
  True
Python 3:
  range(0,5) == [0,1,2,3,4]
  False
The fix is to wrap such locations with list(), i.e.:
Python 3:
  list(range(0,5)) == [0,1,2,3,4]
  True
Since the base operators are now lazy, Python 3 also removes the
old lazy versions (e.g. xrange, ifilter, izip, etc). This uses
future's builtins package to convert the code to the Python 3
behavior (i.e. xrange -> future's builtins.range).
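A minimal sketch of the resulting pattern (assuming the future
package is installed; on Python 3 the same import resolves to the
stdlib builtin):
  from builtins import range  # on Python 2, future's lazy range backport
  # A lazy range/map/filter result never compares equal to a list;
  # materialize explicitly where a list is needed.
  assert range(0, 5) != [0, 1, 2, 3, 4]
  assert list(range(0, 5)) == [0, 1, 2, 3, 4]
  squares = list(map(lambda x: x * x, range(5)))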
Most of the changes were done via these futurize fixes:
- libfuturize.fixes.fix_xrange_with_import
- lib2to3.fixes.fix_map
- lib2to3.fixes.fix_filter
This eliminates the pylint warnings:
- xrange-builtin
- range-builtin-not-iterating
- map-builtin-not-iterating
- zip-builtin-not-iterating
- filter-builtin-not-iterating
- reduce-builtin
- deprecated-itertools-function
Testing:
- Ran core job
Change-Id: Ic7c082711f8eff451a1b5c085e97461c327edb5f
Reviewed-on: http://gerrit.cloudera.org:8080/19589
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
1. Python 3 requires absolute imports within packages. This
can be emulated via "from __future__ import absolute_import"
2. Python 3 changed division to "true" division that doesn't
round to an integer. This can be emulated via
"from __future__ import division"
This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.
I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
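As a minimal sketch of the pattern (variable names are illustrative):
  from __future__ import absolute_import, division, print_function
  # True division now matches Python 3 on both interpreters:
  assert 7 / 2 == 3.5
  # Use '//' where an integer is genuinely required, e.g. an index:
  rows = ['a', 'b', 'c', 'd']
  mid = len(rows) // 2  # 2, an int on both Python 2 and 3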
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
Testing:
- Ran core tests
Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
Some parts of HS2ColumnarResultSet were not prepared for returning
non-scalar types. This code only runs if impala.resultset.cache.size
is set, which is not the case in most tests. The issue was caught
with Hue, which uses result caching.
Testing:
- Added a regression test in test_fetch_first.py, which contained
other tests that used result caching.
- It turned out that some tests in the file did not run at all,
as @needs_session() needs the parentheses at the end (see the
sketch below). For this reason some test fixes were added to run
them correctly, though these changes are totally unrelated to the
current issue.
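The decorator pitfall, as a self-contained sketch (a stand-in for the
real needs_session helper, illustrative only):
  def needs_session(protocol=None):
      # Decorator *factory*: must be called to get the real decorator.
      def wrap(fn):
          def wrapped(*args, **kwargs):
              # the real helper opens an HS2 session here, then runs the test
              return fn(*args, **kwargs)
          return wrapped
      return wrap

  @needs_session()   # correct: the factory call returns the actual decorator
  def test_fetch_first():
      pass

  @needs_session     # wrong: binds the test name to the inner 'wrap'
  def test_never_runs():  # function, so the test silently never runs
      pass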
Change-Id: Ia4dd8f76187dc3555207e2d30d46d811e0a7a126
Reviewed-on: http://gerrit.cloudera.org:8080/18768
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The calls to get_num_in_flight_queries in TestFetchFirst are flaky
because they expect the number of in-flight queries to drop to 0
immediately. This might not always be true, especially in ASAN builds
where Impala is generally slower.
This patch wraps the call to get_num_in_flight_queries in
ImpalaTestSuite.assert_eventually, which adds retries to the calls to
get_num_in_flight_queries.
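A rough equivalent of the retry logic (the real assert_eventually
signature is assumed here, not quoted from the test framework):
  import time

  def assert_eventually_sketch(condition, timeout_s=60, period_s=0.1):
      deadline = time.time() + timeout_s
      while time.time() < deadline:
          if condition():
              return
          time.sleep(period_s)
      raise AssertionError("condition not met within %ss" % timeout_s)

  # Usage in the test (impalad.service is the usual test handle):
  # assert_eventually_sketch(
  #     lambda: impalad.service.get_num_in_flight_queries() == 0)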
Testing:
* Ran tests/hs2/test_fetch_first.py locally
Change-Id: I349f861e8219e62311e8d4e0bfbd8f3618f0fa46
Reviewed-on: http://gerrit.cloudera.org:8080/16218
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Adds the counters RowsSent and RowsSentRate to the PLAN_ROOT_SINK
section of the profile:
PLAN_ROOT_SINK:
  - PeakMemoryUsage: 4.01 MB (4202496)
  - RowBatchGetWaitTime: 0.000ns
  - RowBatchSendWaitTime: 0.000ns
  - RowsSent: 10 (10)
  - RowsSentRate: 416.00 /sec
RowsSent tracks the number of rows sent to the PlanRootSink via
PlanRootSink::Send. RowsSentRate tracks the rate that rows are sent to
the PlanRootSink.
Adds the counters NumRowsFetched, NumRowsFetchedFromCache, and
RowMaterializationRate to the ImpalaServer section of the profile.
ImpalaServer:
  - ClientFetchWaitTimer: 11.999ms
  - NumRowsFetched: 10 (10)
  - NumRowsFetchedFromCache: 10 (10)
  - RowMaterializationRate: 9.00 /sec
  - RowMaterializationTimer: 1s007ms
NumRowsFetched tracks the total number of rows fetched by the query,
but does not include rows fetched from the cache. NumRowsFetchedFromCache
tracks the total number of rows fetched from the query results cache.
RowMaterializationRate tracks the rate at which rows are materialized.
RowMaterializationTimer already existed and tracks how much time is
spent materializing rows.
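As a rough reading of the numbers above (how the backend samples and
rounds rate counters is not specified here, so treat this as an
approximation):
  rows = 10                  # NumRowsFetched
  materialization_s = 1.007  # RowMaterializationTimer
  approx_rate = rows / materialization_s  # ~9.9 rows/sec, on the order of
                                          # the reported 9.00 /sec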
Testing:
* Added tests to test_fetch_first.py and query_test/test_fetch.py
* Enabled some tests in test_fetch_first.py that were pending
the completion of IMPALA-8819
* Ran core tests
Change-Id: Id9e101e2f3e2bf8324e149c780d35825ceecc036
Reviewed-on: http://gerrit.cloudera.org:8080/14180
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Sahil Takiar <stakiar@cloudera.com>
Adds additional tests to test_result_spooling.py to cover various edge
cases when fetching query results (ensure all Impala types are returned
properly, UDFs are evaluated correctly, etc.). A new QueryTest file
result-spooling.test is added to encapsulate all these tests. Tests with
a decreased ROW_BATCH_SIZE are added as well to validate that
BufferedPlanRootSink buffers row batches correctly.
BufferedPlanRootSink requires careful synchronization of the producer
and consumer threads, especially when queries are cancelled. The
TestResultSpoolingCancellation class is dedicated to running
cancellation tests with SPOOL_QUERY_RESULTS = true. The implementation
is heavily borrowed from test_cancellation.py and some of the logic is
refactored into a new utility module called cancel_utils.py to avoid
code duplication between test_cancellation.py and
test_result_spooling.py.
Testing:
* Looped test_result_spooling.py overnight with no failures
* Core tests passed
Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Reviewed-on: http://gerrit.cloudera.org:8080/13907
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This fixes all core e2e tests running on my local dockerised
minicluster build. I do not yet have a CI job or script running
but I wanted to get feedback on these changes sooner. The second
part of the change will include the CI script and any follow-on
fixes required for the exhaustive tests.
The following fixes were required:
* Detect docker_network from TEST_START_CLUSTER_ARGS
* get_webserver_port() does not depend on the caller passing in
the default webserver port. It failed previously because it
relied on start-impala-cluster.py setting -webserver_port
for *all* processes.
* Add SkipIf markers for tests that don't make sense or are
non-trivial to fix for containerised Impala.
* Support loading Impala-lzo plugin from host for tests that depend on
it.
* Fix some tests that had 'localhost' hardcoded - instead it should
be $INTERNAL_LISTEN_HOST, which defaults to localhost.
* Fix bug with sorting impala daemons by backend port, which is
the same for all dockerised impalads.
Testing:
I ran tests locally as follows after having set up a docker network and
starting other services:
./buildall.sh -noclean -notests -ninja
ninja -j $IMPALA_BUILD_THREADS docker_images
export TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster"
export FE_TEST=false
export BE_TEST=false
export JDBC_TEST=false
export CLUSTER_TEST=false
./bin/run-all-tests.sh
Change-Id: Iee86cbd2c4631a014af1e8cef8e1cd523a812755
Reviewed-on: http://gerrit.cloudera.org:8080/12639
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This is the same patch except with fixes for the test failures
on EC and S3 noted in the JIRA.
This allows graceful shutdown of executors and partially graceful
shutdown of coordinators (new operations fail, old operations can
continue).
Details:
* In order to allow future admin commands, this is implemented with
function-like syntax and does not add any reserved words.
* ALL privilege is required on the server
* The coordinator impalad that the client is connected to can be shut
down directly with ":shutdown()".
* Remote shutdown of another impalad is supported, e.g. with
":shutdown('hostname')", so that non-coordinators can be shut down
and for the convenience of the client, which does not have to
connect to the specific impalad. There is no assumption that the
other impalad is registered in the statestore; just that the
coordinator can connect to the other daemon's thrift endpoint.
This simplifies things and allows shutdown in various important
cases, e.g. statestore down.
* The shutdown time limit can be overridden to force a quicker or
slower shutdown by specifying a deadline in seconds after the
statement is executed (see the client-side sketch after this list).
* If shutting down, a banner is shown on the root debug page.
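As referenced in the deadline bullet above, a client-side sketch
(an impyla-style connection and the two-argument deadline form are
assumptions for illustration):
  from impala.dbapi import connect

  conn = connect(host='coordinator-host', port=21050)
  cur = conn.cursor()
  # Remotely shut down another impalad with a one-hour deadline.
  cur.execute(":shutdown('executor-host:22000', 3600)")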
Workflow:
1. (if a coordinator) clients are prevented from submitting
queries to this coordinator via some out-of-band mechanism,
e.g. load balancer
2. the shutdown process is started via ":shutdown()"
3. a bit is set in the statestore and propagated to coordinators,
which stop scheduling fragment instances on this daemon
(if an executor).
4. the query startup grace period (which is ideally set to the AC
queueing delay plus some additional leeway) expires
5. once the daemon is quiesced (i.e. no fragments, no registered
queries), it shuts itself down.
6. If the daemon does not successfully quiesce (e.g. rogue clients,
long-running queries), after a longer timeout (counted from the start
of the shutdown process) it will shut down anyway.
What this does:
* Executors can be shut down without causing a service-wide outage
* Shutting down an executor will not disrupt any short-running queries
and will wait for long-running queries up to a threshold.
* Coordinators can be shut down without query failures only if
there is an out-of-band mechanism to prevent submission of more
queries to the shut down coordinator. If queries are submitted to
a coordinator after shutdown has started, they will fail.
* Long running queries or other issues (e.g. stuck fragments) will
slow down but not prevent eventual shutdown.
Limitations:
* The startup grace period needs to be configured to be greater than
the latency of statestore updates + scheduling + admission +
coordinator startup. Otherwise a coordinator may send a
fragment instance to the shutting down impalad. (We could
automate this configuration as a follow-on)
* The startup grace period means a minimum latency for shutdown,
even if the cluster is idle.
* We depend on the statestore detecting the process going down
if queries are still running on that backend when the timeout
expires. This may still be subject to existing problems,
e.g. IMPALA-2990.
Tests:
* Added parser, analysis and authorization tests.
* End-to-end test of shutting down impalads.
* End-to-end test of shutting down then restarting an executor while
queries are running.
* End-to-end test of shutting down a coordinator
- New queries cannot be started on coord, existing queries continue to run
- Exercises various Beeswax and HS2 operations.
Change-Id: I8f3679ef442745a60a0ab97c4e9eac437aef9463
Reviewed-on: http://gerrit.cloudera.org:8080/11484
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This allows graceful shutdown of executors and partially graceful
shutdown of coordinators (new operations fail, old operations can
continue).
Details:
* In order to allow future admin commands, this is implemented with
function-like syntax and does not add any reserved words.
* ALL privilege is required on the server
* The coordinator impalad that the client is connected to can be shut
down directly with ":shutdown()".
* Remote shutdown of another impalad is supported, e.g. with
":shutdown('hostname')", so that non-coordinators can be shut down
and for the convenience of the client, which does not have to
connect to the specific impalad. There is no assumption that the
other impalad is registered in the statestore; just that the
coordinator can connect to the other daemon's thrift endpoint.
This simplifies things and allows shutdown in various important
cases, e.g. statestore down.
* The shutdown time limit can be overridden to force a quicker or
slower shutdown by specifying a deadline in seconds after the
statement is executed.
* If shutting down, a banner is shown on the root debug page.
Workflow:
1. (if a coordinator) clients are prevented from submitting
queries to this coordinator via some out-of-band mechanism,
e.g. load balancer
2. the shutdown process is started via ":shutdown()"
3. a bit is set in the statestore and propagated to coordinators,
which stop scheduling fragment instances on this daemon
(if an executor).
4. the query startup grace period (which is ideally set to the AC
queueing delay plus some additional leeway) expires
5. once the daemon is quiesced (i.e. no fragments, no registered
queries), it shuts itself down.
6. If the daemon does not successfully quiesce (e.g. rogue clients,
long-running queries), after a longer timeout (counted from the start
of the shutdown process) it will shut down anyway.
What this does:
* Executors can be shut down without causing a service-wide outage
* Shutting down an executor will not disrupt any short-running queries
and will wait for long-running queries up to a threshold.
* Coordinators can be shut down without query failures only if
there is an out-of-band mechanism to prevent submission of more
queries to the shut down coordinator. If queries are submitted to
a coordinator after shutdown has started, they will fail.
* Long running queries or other issues (e.g. stuck fragments) will
slow down but not prevent eventual shutdown.
Limitations:
* The startup grace period needs to be configured to be greater than
the latency of statestore updates + scheduling + admission +
coordinator startup. Otherwise a coordinator may send a
fragment instance to the shutting down impalad. (We could
automate this configuration as a follow-on)
* The startup grace period means a minimum latency for shutdown,
even if the cluster is idle.
* We depend on the statestore detecting the process going down
if queries are still running on that backend when the timeout
expires. This may still be subject to existing problems,
e.g. IMPALA-2990.
Tests:
* Added parser, analysis and authorization tests.
* End-to-end test of shutting down impalads.
* End-to-end test of shutting down then restarting an executor while
queries are running.
* End-to-end test of shutting down a coordinator
- New queries cannot be started on coord, existing queries continue to run
- Exercises various Beeswax and HS2 operations.
Change-Id: I4d5606ccfec84db4482c1e7f0f198103aad141a0
Reviewed-on: http://gerrit.cloudera.org:8080/10744
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The following sequence can lead to a crash:
1. Client sets result cache size to N
2. Client issues query with #results < N
3. Client fetches all results, triggering eos and tearing down
Coordinator::root_sink_.
4. Client restarts query with FETCH_FIRST.
5. Client reads all results again. After cache is exhausted,
Coordinator::GetNext() is called to detect eos condition again.
6. GetNext() hits DCHECK(root_sink_ != nullptr).
This patch makes GetNext() a no-op if called after it sets *eos,
avoiding the crash.
Testing:
Regression test that triggered the bug before this fix.
Change-Id: I454cd8a6cf438bdd0c49fd27c2725d8f6c43bb1d
Reviewed-on: http://gerrit.cloudera.org:8080/5335
Reviewed-by: Henry Robinson <henry@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
on the website.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Many python files had a hashbang and the executable bit set though
they were not intended to be run as standalone scripts. That makes
determining which python files are actually scripts very difficult.
A future patch will update the hashbang in real python scripts so they
use $IMPALA_HOME/bin/impala-python.
Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba
Reviewed-on: http://gerrit.cloudera.org:8080/599
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
Since Impala may sometimes return fewer rows than asked for, it's not
safe to test for an exact size response from a single fetch() call
except in very particular cases (when either 0 or 1 rows are
expected). The HS2 fetch_first tests relied on the full request always
being honoured in a couple of places, and as a result are prone to
occasional failures due to 'underflow'. This patch changes the fetch()
call to use fetch_until() in all places where underflow could happen,
and removes the xfail restriction from those V1 and V6 tests.
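A sketch of the idea behind fetch_until (the real helper lives in the
test code; this version is illustrative):
  def fetch_until(client, op_handle, expected):
      # Accumulate rows until 'expected' arrive; a single fetch may underflow.
      rows = []
      while len(rows) < expected:
          batch = client.fetch(op_handle, max_rows=expected - len(rows))
          if not batch:  # eos before the expected count: let the caller fail
              break
          rows.extend(batch)
      return rows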
Change-Id: Ia62f3624947530d516a87f84e706e305048b916f
Reviewed-on: http://gerrit.cloudera.org:8080/192
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
This patch fixes a number of bugs with HS2 columnar result sets,
particularly around the computation of their size in memory. It also
fixes test_fetch_first.py to be robust to the possibility of a single
fetch not returning all expected rows (IMPALA-1264).
In order to properly account for the columnar result set's size in
memory, we need to cope with the fact that the size of two result sets
combined might be larger than a result set which is the result of appending
one to the other. However, we estimate the expected size of the result
cache as its current size plus the size of the result set to be
added (we do this so that we don't perform the copy, then find out it
used too much memory). This patch adds logic to
QueryExecState::FetchRowsInternal() to correct the true memory usage
after the copy, in the case where memory was saved.
(This happens for example when we stitch together two null columns and
wind up saving a byte).
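A worked instance of that parenthetical, assuming the columnar null
indicators are a bit-per-row bitmap rounded up to whole bytes per
column:
  rows_a, rows_b = 4, 4
  null_bytes_separate = (rows_a + 7) // 8 + (rows_b + 7) // 8  # 1 + 1 = 2
  null_bytes_merged = (rows_a + rows_b + 7) // 8               # just 1
  assert null_bytes_separate - null_bytes_merged == 1  # the byte saved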
Change-Id: Ie172840ea0ff40e43370825ca8f671a86dde4f22
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5199
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
This patch adds support for V6 of the HS2 protocol, which notably
includes columnar organisation of result sets. Clients that set their
protocol version to < V6 will receive result sets in the traditional row
orientation.
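Roughly why columnar results deserialise faster on the client (field
names follow the TCLIService thrift; the snippet is illustrative):
  def rows_from_columnar(row_set):
      # V6: one contiguous values vector per column (an all-INT result here)...
      cols = [c.i32Val.values for c in row_set.columns]
      # ...instead of V1's one nested struct per row per column.
      return list(zip(*cols))  # transpose only if row tuples are needed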
The performance of fetches over HS2 goes up significantly as a result,
since the V1 protocol had some pathologies in its deserialisation
performance.
Beeswax:
  Row materialisation: 455ms, client processing time: 523ms
HS2 V6:
  Row materialisation: 444ms, client processing time: 1.8s
HS2 V1:
  Row materialisation: 585ms, client processing time: 15.9s (!)
TODO: Add support for the CHAR datatype
The following patch is also included:
Fix wait-for-hiveserver2.py when Impala moves to HS2 V6
Due to HIVE-6050, older versions of Hive are not compatible with newer
clients (even those that try to use old protocol
versions). wait-for-hiveserver2.py uses HS2 to talk to the HiveServer2
service, but picks up the newer V6 protocol version, and fails.
This patch temporarily re-adds cli_service.thrift (renaming the Thrift
service as LegacyTCLIService) only for wait-for-hiveserver2.py to
use. As soon as Impala's thirdparty Hive moves to HS2 V6, we can get rid
of this change.
Change-Id: I2cbe884345ae7e772620b80a29b6574bd6532940
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4402
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
Otherwise, the query will hang around until the session or connection is
closed. Enhance a test so that this is regression tested.
While I'm here, some minor cleanup: hoist out some code that's sprinkled
between the return_val writes to make it more obvious what the code is doing.
Change-Id: I594e583c4e10f12c3f9b920ea4d44e5da7df3558
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4211
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 799d1fc6cd0e871cdf7eeb6bcde12822574d8ff2)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4244
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
Tested-by: Daniel Hecht <dhecht@cloudera.com>
The issue is that Impala crashes in ClearResultCache() with result caching on
for parallel inserts. The reason is that ClearResultCache() accesses the
coordinator RuntimeState to update the query mem tracker; however, there is
no coordinator fragment (or RuntimeState) for parallel inserts.
The fix is to initialize a query mem tracker to track memory usage in the
coordinator instance even if there is no coordinator fragment.
Change-Id: I3a2ef14860f683910c29ae19b931202ca6867b9f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2501
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
Adds a bounded query-result cache that clients can enable by setting
an 'impala.resultset.cache.size' option in the confOverlay map of the HS2
exec request.
Impala permits FETCH_FIRST for a particular stmt iff result caching is enabled.
FETCH_FIRST will succeed as long as all previously fetched rows fit into the
bounded result cache. Regardless of whether a FETCH_FIRST succeeds or not,
clients may always resume fetching with FETCH_NEXT.
The FETCH_FIRST feature is intended to allow Hue users to export an entire
result set (to Excel, CSV, etc.) after browsing through a few pages of results,
without having to re-run the query from scratch.
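A sketch of how a client opts in (request type per the TCLIService
thrift; the generated-module path and values are illustrative):
  from TCLIService.ttypes import TExecuteStatementReq

  req = TExecuteStatementReq(
      sessionHandle=session_handle,
      statement="SELECT * FROM big_table",
      confOverlay={"impala.resultset.cache.size": "100000"})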
Change-Id: I71ab4794ddef30842594c5e1f7bc94724d6ce89f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1356
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1406