impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Michael Brown	4028e9c5ec	IMPALA-6759: align stress test memory estimation parse pattern The stress test never expected to see memory estimates on the order of PB. Apparently it can happen with TPC DS 10000, so update the pattern. It's not clear how to quickly write a test to catch this, because it involves crossing language boundaries and possibly having a massively-scaled dataset. I think leaving a comment in both places is good enough for now. Change-Id: I317c271888584ed2a817ee52ad70267eae64d341 Reviewed-on: http://gerrit.cloudera.org:8080/9846 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Impala Public Jenkins	2018-03-29 03:27:25 +00:00
Michael Brown	2c0926e2de	Revert "IMPALA-6759: align stress test memory estimation parse pattern" This reverts commit `2521848753`.	2018-03-28 15:28:48 -07:00
Michael Brown	2521848753	IMPALA-6759: align stress test memory estimation parse pattern The stress test never expected to see memory estimates on the order of PB. Apparently it can happen with TPC DS 10000, so update the pattern. It's not clear how to quickly write a test to catch this, because it involves crossing language boundaries and possibly having a massively-scaled dataset. I think leaving a comment in both places is good enough for now. Change-Id: I08976f261582b379696fd0e81bc060577e552309	2018-03-28 15:27:10 -07:00
Taras Bobrovytsky	2159beee89	IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries only modify the non-original table; the original table is never modified. The original tables can be used to restore the non-original tables to their initial state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the fewer rows are affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries only for Kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database.
Here's an example of a full call with the new options that runs the stress test on the local mini cluster: ./concurrent_select.py \ --tpch-kudu-db=tpch_kudu \ --generate-dml-queries \ --dml-mod-values 11 13 17 \ --generate-compute-stats-queries \ --select-probability=0.5 \ --mem-limit-padding-pct=25 \ --mem-limit-padding-abs=50 \ --reset-databases-before-binary-search \ --reset-databases-after-binary-search Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Reviewed-on: http://gerrit.cloudera.org:8080/5093 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: Impala Public Jenkins	2016-12-20 01:33:01 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.*Cloudera' > modified_files.txt cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;' cat modified_files.txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Casey Ching	f288867833	Stress test: Various changes The major changes are: 1) Collect backtrace and fatal log on crash. 2) Poll memory usage. The data is only displayed at this time. 3) Support Kerberos. 4) Add random queries. 5) Generate random and TPC-H nested data on a remote cluster. The random data generator was converted to use MR for scaling. 6) Add a cluster abstraction to run data loading for #5 on a remote or local cluster. This also moves and consolidates some Cloudera Manager utilities that were in the stress test. 7) Clean up the wrappers around impyla. That stuff was getting messy. Change-Id: I4e4b72dbee1c867626a0b22291dd6462819e35d7 Reviewed-on: http://gerrit.cloudera.org:8080/1298 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2016-01-20 23:00:25 +00:00
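
The IMPALA-6759 commits in this history widen the stress test's memory-estimate parse pattern so that estimates on the order of PB are recognized. The sketch below illustrates that kind of change in Python. The regex, the unit table, and the `parse_mem_estimate_mb` helper are assumptions made for illustration; they are not taken from the stress test itself, and the real pattern may differ.

```python
import re

# Hypothetical pattern (an assumption, not the stress test's actual regex):
# a number, an optional unit prefix, then "B". Listing "P" among the
# prefixes is what lets petabyte-scale estimates parse.
MEM_ESTIMATE_PATTERN = re.compile(r"Memory=(\d+\.?\d*)(P|T|G|M|K)?B")

# Multipliers that convert each unit prefix to megabytes.
UNIT_TO_MB = {
    "P": 1024 ** 3,
    "T": 1024 ** 2,
    "G": 1024,
    "M": 1,
    "K": 1 / 1024,
    None: 1 / (1024 ** 2),  # no prefix means plain bytes
}


def parse_mem_estimate_mb(text):
    """Return the memory estimate found in text, in MB, or None if absent."""
    match = MEM_ESTIMATE_PATTERN.search(text)
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * UNIT_TO_MB[unit]
```

Without `P` in the alternation, a string such as `Memory=1.50PB` would not match at all, because the pattern requires `B` to follow the number or a known prefix directly. That is one way a parser could silently miss a very large estimate.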

6 Commits
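
The IMPALA-4467 commit message describes generating four DML queries per mod value, each touching rows where `primary_key % mod_value = 0` and taking data from the `_original` copy of the table. A minimal Python sketch of that scheme follows. The function names, the `value_column` parameter, and the SQL templates are hypothetical and have not been checked against the stress test or validated on Impala.

```python
def generate_dml_queries(table, primary_key, value_column, mod_values):
    """Map each mod value to four illustrative DML statements.

    The SQL text is an assumption for illustration only.
    """
    original = table + "_original"
    queries = {}
    for mod in mod_values:
        predicate = "{0} % {1} = 0".format(primary_key, mod)
        queries[mod] = [
            # Insert, upsert, and update read their data from the _original table.
            "INSERT INTO {0} SELECT * FROM {1} WHERE {2}".format(
                table, original, predicate),
            "UPSERT INTO {0} SELECT * FROM {1} WHERE {2}".format(
                table, original, predicate),
            "UPDATE {t} SET {c} = o.{c} FROM {t} JOIN {o} o "
            "ON {t}.{pk} = o.{pk} WHERE {t}.{pk} % {m} = 0".format(
                t=table, o=original, c=value_column, pk=primary_key, m=mod),
            "DELETE FROM {0} WHERE {1}".format(table, predicate),
        ]
    return queries


def count_matching_keys(keys, mod):
    """Count the keys that satisfy key % mod == 0."""
    return sum(1 for key in keys if key % mod == 0)
```

Calling `generate_dml_queries("lineitem", "l_orderkey", "l_comment", [11, 13, 17])` returns twelve statements in total, four per mod value. `count_matching_keys` gives a quick way to gauge how many rows a given mod value would touch for a known key range.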