Commit Graph

26 Commits

Author SHA1 Message Date
Laszlo Gaal
9e7fb830fd IMPALA-4088: Assign fixed values to the minicluster server ports
The minicluster setup logic assigned fixed port numbers to several,
but not all, of the data nodes' listening sockets. This change
assigns similar fixed port ranges to all the listening ports that were
previously allowed to pick their own port numbers and could therefore
interfere with other components, e.g. HBase.
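
As an illustration only (the variable names and base value below are hypothetical, not the ones used by this change), a fixed per-node port can be derived from a base plus the node index:

    # Hypothetical sketch: derive a fixed, per-node port from a base value.
    NODE_IDX=1                               # minicluster node number (1..N)
    DATANODE_HTTP_BASE=31000                 # assumed base of a fixed range
    DATANODE_HTTP_PORT=$((DATANODE_HTTP_BASE + NODE_IDX - 1))
    echo "node-${NODE_IDX} datanode HTTP port: ${DATANODE_HTTP_PORT}"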

Change-Id: Iecf312873b7026c52b0ac0e71adbecab181925a0
Reviewed-on: http://gerrit.cloudera.org:8080/6531
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-07 22:57:16 +00:00
Lars Volker
85d7f5eb2b IMPALA-4733: Change HBase ports to non-ephemeral
We've seen repeated test failures because HBase tries to bind to ports
in the ephemeral port range, which sometimes would already be occupied
by outgoing connections of other processes.

This change moves the ports to the new default HBase ports
(HBASE-10123):

HBase Master Port: 60000 -> 16000
HBase Master Web UI Port: 60010 -> 16010
HBase RegionServer Port: 60020 -> 16020
HBase RegionServer Web UI Port: 60030 -> 16030
HBase Status Multicast Port: 60100 -> 16100

This made it necessary to change the default KMS port, too
(HADOOP-12811):

KMS HTTP port: 16000 -> 9600
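
As a quick sanity check (not part of this change; assumes nc is installed), one can verify that HBase is listening on the new fixed TCP ports after a minicluster restart; the multicast status port (16100, UDP) is omitted here:

    # Check the new non-ephemeral HBase TCP ports.
    for port in 16000 16010 16020 16030; do
      nc -z localhost "$port" || echo "HBase is not listening on port $port" >&2
    done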

Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Reviewed-on: http://gerrit.cloudera.org:8080/6524
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-04 00:28:29 +00:00
Jim Apple
84ee40428d IMPALA-4588: Enable starting the minicluster when offline
IMPALA-4553 required ntp-wait to succeed before Kudu would start,
assuming ntp-wait was installed, in order to prevent a litany of errors
on EC2 about unsynchronized clocks. This patch disables that wait when
no internet connection is detected, making it possible to start the
minicluster while offline.
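
A minimal sketch of the idea (the actual connectivity check used by the patch may differ; the ping target here is just an example):

    # Only gate startup on ntp-wait when we appear to be online.
    if ping -c 1 -W 1 8.8.8.8 >/dev/null 2>&1; then
      ntp-wait || { echo "ntpd did not synchronize" >&2; exit 1; }
    else
      echo "No internet connection detected; skipping ntp-wait"
    fi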

Change-Id: Ifbb5babebb0ca6d2553be1b001e20e2270e052b6
Reviewed-on: http://gerrit.cloudera.org:8080/5412
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2016-12-09 03:04:07 +00:00
Jim Apple
90bf40d213 IMPALA-4553: ntpd must be synchronized for kudu to start.
When ntpd is not synchronized, kudu initialization fails on the master
node:

    F1129 16:37:28.969956 15230 master_main.cc:68] Check failed:
    _s.ok() Bad status: Service unavailable: Cannot initialize clock:
    Error reading clock. Clock considered unsynchronized
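
To check the precondition by hand before starting the minicluster, something like the following works on hosts where the ntp package provides ntpstat (a diagnostic sketch, not part of this change):

    # ntpstat exits 0 only when the local clock is synchronized.
    if ! ntpstat >/dev/null 2>&1; then
      echo "ntpd is not synchronized; Kudu's master will refuse to start" >&2
    fi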

Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Reviewed-on: http://gerrit.cloudera.org:8080/5258
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2016-11-30 00:28:02 +00:00
Lars Volker
ef4c9958d0 IMPALA-4047: Remove occurrences of 'CDH'/'cdh' from repo
This change removes some of the occurrences of the strings 'CDH'/'cdh'
from the Impala repository. References to Cloudera-internal Jiras have
been replaced with upstream Jira issues on issues.cloudera.org.

For several categories of occurrences (e.g. pom.xml files,
DOWNLOAD_CDH_COMPONENTS) I also created a list of follow-up Jiras to
remove the occurrences left after this change.
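
To see what is left after this change, a simple repository-wide search is enough (illustrative; the follow-up list itself was compiled separately):

    # List remaining case-insensitive occurrences of 'cdh', with file and line.
    git grep -n -i 'cdh'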

Change-Id: Icb37e2ef0cd9fa0e581d359c5dd3db7812b7b2c8
Reviewed-on: http://gerrit.cloudera.org:8080/4187
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2016-10-13 00:40:41 +00:00
Henry Robinson
1b9d9ea7c1 IMPALA-4160: Remove some leftover Llama references
Change-Id: I62e12363ab3ecca42bf7a82be3c2df01bc47cdca
Reviewed-on: http://gerrit.cloudera.org:8080/4493
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
2016-09-22 02:10:32 +00:00
Henry Robinson
19de09ab7d IMPALA-4160: Remove Llama support.
Alas, poor Llama! I knew him, Impala: a system
of infinite jest, of most excellent fancy: we hath
borne him on our back a thousand times; and now, how
abhorred in my imagination it is!

Done:

* Removed QueryResourceMgr, ResourceBroker, CGroupsMgr
* Removed untested 'offline' mode and NM failure detection from
  ImpalaServer
* Removed all Llama-related Thrift files
* Removed RM-related arguments to MemTracker constructors
* Deprecated all RM-related flags, printing a warning if enable_rm is
  set
* Removed expansion logic from MemTracker
* Removed VCore logic from QuerySchedule
* Removed all reservation-related logic from Scheduler
* Removed RM metric descriptions
* Various misc. small class changes

Not done:

* Remove RM flags (--enable_rm etc.)
* Remove RM query options
* Changes to RequestPoolService (see IMPALA-4159)
* Remove estimates of VCores / memory from plan

Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef
Reviewed-on: http://gerrit.cloudera.org:8080/4445
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2016-09-20 23:50:43 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing license text with, or add, the ASF license
   text given on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Michael Brown
08e8de73b2 IMPALA-3806: remove a few modern shell idioms to improve RHEL5 support
Both `find -executable` and the Bash "&>>" operator are too new to be
supported on RHEL5. Both have reasonable workarounds, so those are used
instead. Note that this may not be an exhaustive list of such "modern"
conventions; RHEL5 isn't working end-to-end yet, so we can't identify
all of them in a single commit.
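
For reference, the portable replacement for the append-redirect idiom is plain POSIX redirection (shown as a general illustration, not as the literal diff):

    some_command >> build.log 2>&1    # instead of: some_command &>> build.log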

Testing:

Before, the RHEL5 build would fail quite early here. Now, data load
succeeds and most of the backend tests successfully run.

Change-Id: I7438bed908d8026327923607238808122212d2d8
Reviewed-on: http://gerrit.cloudera.org:8080/3531
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Internal Jenkins
2016-07-05 13:37:26 -07:00
Matthew Jacobs
40b79aecbc IMPALA-3417: run-all.sh fails when no services should start
Change-Id: I268c7ad66c82f2b04b832d520e21662e631572cf
Reviewed-on: http://gerrit.cloudera.org:8080/3250
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-06-02 09:32:54 -07:00
Anuj Phadke
a915293109 IMPALA-1850: Allow fs.defaultFS to be set to a non-HDFS filesystem
This change whitelists the filesystems that are supported as the
default filesystem (fs.defaultFS) for Impala to run on.
This patch also configures Impala to use S3 as the default filesystem,
rather than as a secondary filesystem as before.
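
Conceptually the whitelist is a check on the fs.defaultFS scheme; a hypothetical sketch (the actual list of supported schemes lives in the changed scripts and may differ):

    # Reject default filesystems that the test scripts do not support.
    DEFAULT_FS="s3a://my-bucket"             # example value
    case "${DEFAULT_FS%%://*}" in
      hdfs|s3a) ;;                           # assumed-supported schemes
      *) echo "Unsupported fs.defaultFS: ${DEFAULT_FS}" >&2; exit 1 ;;
    esac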

Change-Id: I2f45bef6c94ece634045acb906d12591587ccfed
Reviewed-on: http://gerrit.cloudera.org:8080/1121
Reviewed-by: anujphadke <aphadke@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:40 -07:00
casey
a27946e696 Improve mini-cluster usability (testdata/cluster/admin)
Changes:
1) Previously when a service would fail, the user would have to find
   the log file and open it. Now the end of the log is dumped to stdout
   (see the sketch after this list).
2) Add start, stop, and restart commands to the "admin" script. For
   example now you can run
     testdata/cluster/admin restart kudu
3) Wait up to 120 seconds for services to shutdown. The timeout is the
   same as for the Impala processes. If the services fail to stop an
   error will be raised.
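
A sketch of how items 1 and 3 fit together (service_is_running, SERVICE, and SERVICE_LOG are placeholders for the script's own helpers and variables):

    SHUTDOWN_TIMEOUT_SECS=120
    for ((i = 0; i < SHUTDOWN_TIMEOUT_SECS; i++)); do
      service_is_running "$SERVICE" || break
      sleep 1
    done
    if service_is_running "$SERVICE"; then
      echo "ERROR: $SERVICE did not stop within ${SHUTDOWN_TIMEOUT_SECS}s" >&2
      tail -n 50 "$SERVICE_LOG"    # dump the end of the log to stdout
      exit 1
    fi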

Change-Id: I537ea5656df2081d4f1f27a9f3fcef4547fdc2fe
Reviewed-on: http://gerrit.cloudera.org:8080/2751
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:37 -07:00
casey
6f4a5e6bb0 Kudu: Use -block_manager=file if "hole punching" isn't supported
By default Kudu requires the underlying file system to support hole
punching; if it doesn't, Kudu will fail to start. People using such a
file system can instead start Kudu with -block_manager=file. Before
starting Kudu in the local mini-cluster, the "fallocate" command is
used to automatically determine whether the special flag is needed.
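
A minimal probe along these lines (using util-linux fallocate options; the actual detection code may differ):

    # Punch a hole in a scratch file; failure means hole punching is unsupported.
    # (A real check should create the file on the Kudu data directory's filesystem.)
    probe_file=$(mktemp)
    dd if=/dev/zero of="$probe_file" bs=4096 count=1 2>/dev/null
    if fallocate --punch-hole --offset 0 --length 4096 "$probe_file" 2>/dev/null; then
      KUDU_EXTRA_ARGS=""
    else
      KUDU_EXTRA_ARGS="-block_manager=file"
    fi
    rm -f "$probe_file"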

Note, users who need this must run bin/create-test-configuration.sh
after pulling in this commit.

This also fixes a bug in the delete_kudu_data() in the cluster admin
script. A directory name was incorrect.

Change-Id: I1ca7fedb367444c41e462b72b0b76091ee94e27c
Reviewed-on: http://gerrit.cloudera.org:8080/2750
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:36 -07:00
casey
cef87e39dc Updates for new Kudu toolchain layout and upgrade Kudu
The directory structure of the newer Kudu toolchain artifacts has
changed. Now the root directory is split into /release and /debug. A few
little updates are needed to the build and service scripts.

Since the toolchain no longer provides stubs for platforms that Kudu
doesn't support, the stubs need to be generated. This will be done as
part of the toolchain bootstrapping.

Also this upgrades Kudu to 0.8 RC1.

Developers will need to run bin/create-test-configuration.sh after
pulling in this change. Otherwise the Kudu service will fail to start.

Change-Id: I625903bd92afece0ad819a96fc275d5812b5eb2a
Reviewed-on: http://gerrit.cloudera.org:8080/2720
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:35 -07:00
Sailesh Mukil
86fd262dc9 IMPALA-3324: Hive server does not start for S3 builds.
The Hive server does not start for S3 builds because HDFS is marked
as an unsupported service in testdata/cluster/admin; HDFS is therefore
not started at all, which in turn prevents the Hive server from
starting. Due to this, all our S3 builds fail.
Currently our S3 builds need HDFS to run correctly.

(This has to be reverted once IMPALA-1850 goes in, because then S3 can
run as a default FS without HDFS)

Change-Id: Ibda9dc3ef895c2aa4d39eb5694ac5f2dbd83bee4
Reviewed-on: http://gerrit.cloudera.org:8080/2741
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-04-12 14:03:43 -07:00
Casey Ching
9d43aac6ce IMPALA-3274: Always start Kudu for testing
Previously Kudu would only be started when the test configuration was
the standard mini-cluster. That led to failures during data loading when
testing without the mini-cluster (ex: local file system). Kudu doesn't
require any other services so now it'll be started for all test
environments.

Change-Id: I92643ca6ef1acdbf4d4cd2fa5faf9ac97a3f0865
Reviewed-on: http://gerrit.cloudera.org:8080/2690
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-04-12 14:02:35 -07:00
Casey Ching
39a28185e8 Re-enable Kudu in build using client stubs when needed
The stubs in Impala broke during the merge commit. This commit removes
the stubs in hopes of improving robustness of the build. The original
problem (Kudu clients are only available for some OSs) is now addressed
by moving the stubbing into a dummy Kudu client. The dummy client only
allows linking to succeed; if any client method is called, Impala will
crash. Before calling any such method, Kudu availability must be
checked.

Change-Id: I4bf1c964faf21722137adc4f7ba7f78654f0f712
Reviewed-on: http://gerrit.cloudera.org:8080/2585
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-03-29 23:57:54 +00:00
Alex Behm
7e76e92bef Consolidate test and cluster logs under a single directory.
All logs, test results and SQL files generated during data
loading and testing are now consolidated under a single new
directory $IMPALA_HOME/logs. The goal is to simplify archiving
in Jenkins runs and debugging.

The new structure is as follows:

$IMPALA_HOME/logs/cluster
- logs of Hadoop components and Impala

$IMPALA_HOME/logs/data_loading
- logs and SQL files produced in data loading

$IMPALA_HOME/logs/fe_tests
- logs and test output of Frontend unit tests

$IMPALA_HOME/logs/be_tests
- logs and test output of Backend unit tests

$IMPALA_HOME/logs/ee_tests
- logs and test output of end-to-end tests

$IMPALA_HOME/logs/custom_cluster_tests
- logs and test output of custom cluster tests
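
The whole layout can be created up front with a single command (a convenience sketch, not part of the change itself):

    mkdir -p "$IMPALA_HOME"/logs/{cluster,data_loading,fe_tests,be_tests,ee_tests,custom_cluster_tests}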

I tested this change with a full data load which
was successful.

Change-Id: Ief1f58f3320ec39d31b3c6bc6ef87f58ff7dfdfa
Reviewed-on: http://gerrit.cloudera.org:8080/2456
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2016-03-28 19:23:22 +00:00
casey
804cfbdd64 Get and use Kudu from the toolchain by default
This is for review purposes only. This patch will be merged with David's
big merge patch.

Changes:
1) Make Kudu compilation dependent on the OS since not all OSs support
   Kudu.
2) Only run Kudu related tests when Kudu is supported (see #1).
3) Look for Kudu locally, but in a different location. To use a local
   build of Kudu, set KUDU_BUILD_DIR to the path Kudu was built in and
   set KUDU_CLIENT_DIR to the path Kudu was installed in.
   Example:
     git clone https://github.com/cloudera/kudu.git
     ...build 3rd party etc...
     mkdir -p $KUDU_BUILD_DIR
     cd $KUDU_BUILD_DIR
     cmake <path to Kudu source dir>
     make
     DESTDIR=$KUDU_CLIENT_DIR make install
4) Look for Kudu in the toolchain if not using a local Kudu build.
5) Add Kudu service startup scripts. The Kudu in the toolchain is
   actually a parcel that has been renamed (the contents were not
   modified in any way), which means the Kudu service binaries are there.
   Those binaries are now used to run the Kudu service.

Change-Id: I3db88cbd27f2ea2394f011bc8d1face37411ed58
2016-03-11 11:38:05 -08:00
casey
3a3497e819 Fix cluster start script for RHEL 7
The change to the start script for OSX used "find" with the "-perm
+0111" option as an "executables only" filter but that doesn't work
with newer versions of "find". "-perm +" has been deprecated or removed
(depending on the version) in Linux. I couldn't find an OSX+Linux
compatible filter.

The variable IS_OSX was added and used to choose the appropriate filter.
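
A sketch of that switch (BIN_DIR is a placeholder; the exact variable handling in the script may differ, and the Linux-side filter shown is just one common choice):

    if [[ "$(uname)" == "Darwin" ]]; then
      FIND_EXEC_FILTER=(-perm +0111)    # BSD find on OS X
    else
      FIND_EXEC_FILTER=(-executable)    # GNU find on newer Linux distros
    fi
    find "$BIN_DIR" -type f "${FIND_EXEC_FILTER[@]}"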

Change-Id: I0c49f78e816147c820ec539cfc398fb77b83307a
Reviewed-on: http://gerrit.cloudera.org:8080/1630
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-12-29 23:25:02 +00:00
Casey Ching
e2bfb6ae2f Misc improvements to shell scripts about error reporting
Changes:
  1) Consistently use "set -euo pipefail".
  2) When an error happens, print the file and line.
  3) Consolidated some of the kill scripts.
  4) Added better error messages to the load data script.
  5) Changed use of #!/bin/sh to bash.
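
Items 1 and 2 above boil down to a small preamble like this (the wording of the trap message is illustrative):

    set -euo pipefail
    # On any error, report where it happened before exiting.
    trap 'echo "Error in ${BASH_SOURCE[0]} at line ${LINENO}" >&2' ERR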

Change-Id: I14fef66c46c1b4461859382ba3fd0dee0fbcdce1
Reviewed-on: http://gerrit.cloudera.org:8080/1620
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-12-17 18:25:27 +00:00
casey
c56ba5149c Infra scripts: Only attempt to kill processes owned by the current user
This is for compatibility with Docker containers. Before this patch,
when the scripts were run on the Docker host, they would try to kill
the mini-cluster in the Docker containers and fail because they didn't
have permission (the user is different). Now the scripts will only try
to kill mini-cluster processes that were started by the current user.
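
A minimal sketch of the new behavior (the process-matching pattern is a placeholder; the real scripts identify mini-cluster processes more precisely):

    # Only signal mini-cluster processes owned by the invoking user.
    pkill -u "$(id -un)" -f 'testdata/cluster' || true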

Also some psutil availability checks were removed because psutil is now
provided by the python virtualenv.

Change-Id: Ida371797bbaffd0a3bd84ab353cb9f466ca510fd
Reviewed-on: http://gerrit.cloudera.org:8080/1541
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-12-17 12:08:33 +00:00
Martin Grund
577e335122 OS X: cluster admin script compatibility
The admin script was using find's -executable flag, which is not
available on Mac. This patch replaces it with "-perm +0111 -type f",
which has similar semantics. In addition, there seem to be differences
in which shell builtins are available, so some changes have been made
to fix that issue.

Change-Id: I9b2ecbd5bf6a9b1610e7ca9f15b1a4d1407b94c1
Reviewed-on: http://gerrit.cloudera.org:8080/1612
Reviewed-by: Casey Ching <casey@cloudera.com>
Readability: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-12-11 22:09:45 +00:00
Matthew Jacobs
835d6dbef4 IMPALA-1209: Add KMS service to testdata cluster (pt1)
First change for IMPALA-1209 to address Impala limitations when
using HDFS encryption. This adds a KMS process to the testdata
cluster. This was tested manually by creating a key and an
encryption zone.

Change-Id: I499154506386f04e71c5371b128c10868b1e1318
Reviewed-on: http://gerrit.cloudera.org:8080/41
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
2015-02-13 20:46:14 +00:00
Mike Yoder
75a97d3d7e [CDH5] Kerberize mini-cluster and Impala daemons
This is the first iteration of a kerberized development environment.
All the daemons start and use Kerberos, with the sole exception of the
Hive metastore. This is sufficient to test Impala authentication.

When buildall.sh is run using '-kerberize', it will stop before
loading data or attempting to run tests.

Loading data into the cluster is known to not work at this time, the
root causes being that Beeline -> HiveServer2 -> MapReduce throws
errors, and Beeline -> HiveServer2 -> HBase has problems.  These are
left for later work.

However, the Impala daemons will happily authenticate using Kerberos
both from clients (like the impala shell) and amongst each other.
This means that if you can get data into the mini-cluster, you can
query it.

Usage:
* Supply a '-kerberize' option to buildall.sh, or
* Supply a '-kerberize' option to create-test-configuration.sh, then
  'run-all.sh -format', re-source impala-config.sh, and then start
  Impala daemons as usual. You must reformat the cluster because
  kerberizing it will change the ownership of all files in HDFS.
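
The second path spelled out as commands (script locations are assumed from the repository layout at the time and may differ):

    ./bin/create-test-configuration.sh -kerberize
    ./testdata/bin/run-all.sh -format
    source ./bin/impala-config.sh
    # ...then start the Impala daemons as usual.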

Notable changes:
* Added clean start/stop script for the llama-minikdc
* Creation of Kerberized HDFS - namenode and datanodes
* Kerberized HBase (and Zookeeper)
* Kerberized Hive (minus the MetaStore)
* Kerberized Impala
* Loading of data very nearly working

Still to go:
* Kerberize the MetaStore
* Get data loading working
* Run all tests
* The unknown unknowns
* Extensive testing

Change-Id: Iee3f56f6cc28303821fc6a3bf3ca7f5933632160
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4019
Reviewed-by: Michael Yoder <myoder@cloudera.com>
Tested-by: jenkins
2014-09-05 12:36:21 -07:00
casey
2351266d0e Replace single process mini-dfs with multiple processes
This should allow individual service components, such as a single nodemanager,
to be shut down for failure testing. The mini-cluster bundled with Hadoop is a
single process that does not expose the ability to control individual roles.
Now each role can be controlled and configured independently of the others.

Change-Id: Ic1d42e024226c6867e79916464d184fce886d783
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1432
Tested-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2297
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-04-23 18:24:05 -07:00