impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Tim Armstrong	fc4ee65f9f	Add all build targets to CMake and speed up builds Use CMake's dependency resolution always instead of serial execution of targets via shell scripts. This improves parallelism by building fe, be, and other targets at the same time and avoids some overhead from invoking "make" multiple times. This reduces the time taken for an incremental compilation of fe and be from 56s to 24s with this command: ./buildall.sh -debug -noclean -notests -skiptests -ninja Also use Impala-lzo's build script. This depends on the IMPALA-4277 fixes to the Impala-lzo build script. Log directory creation is also moved from impala-config.sh to buildall.sh. This means that impala-config.sh has no side-effects and can be run concurrently with no issues. Also make sure that "make" builds all the same artifacts as buildall.sh when run with no args. Testing: Ran a jenkins core job, also experimented locally. Ran a jenkins core job with distcc disabled - this exposed some concurrency bugs where impala-config.sh fails if run concurrently. Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Reviewed-on: http://gerrit.cloudera.org:8080/4790 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2016-12-14 23:42:19 +00:00
Jim Apple	14891fe004	IMPALA-3676: Use clang as a static analysis tool This patch adds a script to run clang-tidy over the whole code base. It is a first step towards running clang-tidy over patches as a tool to help users spot bugs before code review. Because of the number of clang-tidy checks, this patch only addresses some of them. In particular, only checks starting with 'clang' are considered. Many of them which are flaky or not part of our style are excluded from the analysis. This patch also excludes some checks which are part of our current style but which would be too laborious to fix over the entire codebase, like using nullptr rather than NULL. This patch also fixes a number of small bugs found by clang-tidy. Finally, this patch adds the class AlignedNew, the purpose of which is to provide correct alignment on heap-allocated data. The global new operator only guarantees 16-byte alignment. A class that includes a member variable that must be aligned on a k-byte boundary for k>16 can inherit from AlignedNew<k> to ensure correct alignment on the heap, quieting clang's -Wover-aligned warning. (Static and stack allocation are required by the standard to respect the alignment of the type and its member variables, so no extra code is needed for allocation in those places.) Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06 Reviewed-on: http://gerrit.cloudera.org:8080/4758 Reviewed-by: Jim Apple <jbapple@cloudera.com> Tested-by: Internal Jenkins	2016-11-04 00:13:12 +00:00
Tim Armstrong	a6257013fa	IMPALA-4339: ensure coredumps end up in IMPALA_HOME Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e Reviewed-on: http://gerrit.cloudera.org:8080/4785 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-10-25 04:17:58 +00:00
Tim Armstrong	df680cfe3a	IMPALA-4277: allow overriding of Hive/Hadoop versions/locations This is to help with IMPALA-4277 to make it easier to build against Hadoop/Hive distributions where the directory layout doesn't exactly match our current CDH dependencies, or where we may want to temporarily override a version without making a source change. Change-Id: I7da10e38f9c4309f2d193dc25f14a6ea308c9639 Reviewed-on: http://gerrit.cloudera.org:8080/4720 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Internal Jenkins	2016-10-18 05:54:09 +00:00
Tim Armstrong	ef762b73a1	IMPALA-4299: add buildall.sh option to start test cluster A previous commit "IMPALA-4259: build Impala without any test cluster setup" altered some undocumented side-effects of buildall.sh. Previously the following commands reconfigured and restarted the test cluster. It worked because buildall.sh unconditionally regenerated the test cluster configs. ./buildall.sh -notests && ./testdata/bin/run-all.sh ./buildall.sh -noclean -notests && ./testdata/bin/run-all.sh Instead of restoring the old behaviour and continuing to encourage mixing use of low and high level scripts like testdata/bin/run-all.sh as part of the "standard" workflow, this commit adds another high-level option to buildall.sh, -start_minicluster, that accomplishes the high-level task of restarting a minicluster with fresh configs. The above commands can be replaced with: ./buildall.sh -notests -start_minicluster ./buildall.sh -notests -noclean -start_minicluster Change-Id: I0ab3461f8ff3de49b3f28a0dc22fa0a6d5569da5 Reviewed-on: http://gerrit.cloudera.org:8080/4734 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-10-17 22:19:06 +00:00
Tim Armstrong	75a857c0ce	IMPALA-4259: build Impala without any test cluster setup. The main outcome of this change is to avoid making unnecessary modification to the Impala or other source trees when we don't need the test cluster. To achieve that, this refactors the script to make the flow easier to understand and makes it more consistent which build steps are executed in which modes. Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137 Reviewed-on: http://gerrit.cloudera.org:8080/4685 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-10-13 05:45:47 +00:00
Tim Armstrong	78e129c923	Fix typo in buildall.sh introduced in IMPALA-4006 The typo resulted in a silent failure: an error message was printed in the middle of the buildall.sh output and the branch was never taken. Change-Id: I7a0f74b93bb31bd0c56fc4c20f42f8ab1fc6de78 Reviewed-on: http://gerrit.cloudera.org:8080/4382 Reviewed-by: Michael Brown <mikeb@cloudera.com> Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-09-12 21:15:35 +00:00
Zoltan Ivanfi	a60ba6d274	IMPALA-4006: dangerous rm -rf statements in scripts Quoted variable substitutions in rm -rf commands and in many other places. This prevents disasters if those variables contain whitespace. Redirected output of the cd commands to /dev/null. This prevents polluting the target variable with the directory name when the CDPATH environment variable is set. Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe Reviewed-on: http://gerrit.cloudera.org:8080/4078 Tested-by: Internal Jenkins Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>	2016-09-01 21:26:52 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;' cat modified_files.txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Tim Armstrong	a7963e6b03	IMPALA-3914: SKIP_TOOLCHAIN_BOOTSTRAP skips Python package downloads SKIP_TOOLCHAIN_BOOTSTRAP is meant to control download of third-party components to speed up builds and allow builds to be less tied to third-party infrastructure. It therefore makes sense that it should apply to downloading of third-party Python packages. Change-Id: Ibf68dbf5efb514511fc16e2956284ce508b997aa Reviewed-on: http://gerrit.cloudera.org:8080/3773 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-07-28 03:45:45 +00:00
Jim Apple	a5ae2bfd88	IMPALA-3762: Download Python requirements before they are needed. This is needed for ASF builds. It sounds expensive, but takes less than 10 seconds if the packages are already present. Change-Id: I84103c2fb8f9a93336bf28b644ca045f15651dd6 Reviewed-on: http://gerrit.cloudera.org:8080/3452 Reviewed-by: Jim Apple <jbapple@cloudera.com> Tested-by: Jim Apple <jbapple@cloudera.com>	2016-06-22 14:38:57 -07:00
Michael Ho	6e71e903ff	IMPALA-3223: Supports download of CDH components from S3. This change updates the toolchain bootstrapping script to download the CDH components (hadoop, hbase, hive, llama, llama-minikdc and sentry) from the toolchain S3 bucket to the toolchain directory if the environment variable $DOWNLOAD_CDH_COMPONENTS is true. By default, it is false which means the CDH components in the thirdparty directory will be used instead. To build the ASF tree (https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git), set $DOWNLOAD_CDH_COMPONENTS to true. Currently, the CDH components in S3 are snapshots from the thirdparty directory at 688d0efcd38731e8e27a8236dbdca21c8fd571a1. Once the integration jenkins job (impala-cdh5-trunk-core-integration) is modified to upload the latest stable builds to the S3 buckets, we can remove the thirdparty directory and always use the CDH components in the toolchain directory. Note that bootstrap_toolchain.py will not overwrite existing directories in the toolchain directory. To force a refresh of components in the toolchain directory, a user should delete the cached copy in the toolchain directory and execute bootstrap_toolchain.py again. This behavior allows users to develop locally without network connection once the toolchain has been bootstrapped. Change-Id: I16fa79db0005554cc0a116e74775647ba99f8dda Reviewed-on: http://gerrit.cloudera.org:8080/3333 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-06-21 00:37:53 -07:00
Michael Ho	86ff18eee9	IMPALA-3223: Removal of non-toolchain builds. This change removes the option to build without specifying the environment variable $IMPALA_TOOLCHAIN. By default, if it's not set, sourcing impala-config.sh will set it to $IMPALA_HOME/toolchain. A user can override it by setting $IMPALA_TOOLCHAIN to his/her own toolchain directory. The user can also set $SKIP_TOOLCHAIN_BOOTSTRAP to true to avoid running the toolchain bootstrapping script (e.g. a particular component in toolchain is at a version not checked into S3). $IMPALA_TOOLCHAIN holds some third party binaries which Impala relies on. They can be compiled from source in the native toolchain which is public. This commit also removes build_thirdparty.sh as it's no longer used. By default, Impala will be built with the compiler in $IMPALA_TOOLCHAIN but this option can be overridden by setting environment variable $USE_SYSTEM_GCC to 1. Change-Id: I42b60e99fb9caf1294be7ab242856ca3b9a5ab73 Reviewed-on: http://gerrit.cloudera.org:8080/3259 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com>	2016-06-07 17:29:59 -07:00
Lars Volker	3ee075f962	IMPALA-3594: Fix -build_shared_libs switch in buildall.sh Change-Id: I6ad4afc30ca3717fece65ff075981d01efd580fe Reviewed-on: http://gerrit.cloudera.org:8080/3170 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-24 20:41:09 -07:00
Alex Behm	9d23f4a65d	IMPALA-3572: FE unit test coverage report with Jacoco. This patch integrates Jacoco into Impala's FE test runs for getting a code coverage report. The instrumentation and reporting functionality is disabled by default, and must be enabled explicitly, e.g., like this: mvn test -DcodeCoverage The code coverage report is stored in this location: $IMPALA_HOME/logs/fe_tests/coverage With additional changes, Jacoco can also be used to get code coverage reports for our end-to-end tests, but that is left for future work. Change-Id: Id5e4f1b8afb91210d40622aadd3d21d7ed94c2a7 Reviewed-on: http://gerrit.cloudera.org:8080/3151 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:19 -07:00
Tim Armstrong	6e89f1a250	Add ninja support for faster incremental builds Ninja resolves dependencies much faster, so if only a couple of files are changed "ninja -j ${IMPALA_BUILD_THREADS} impalad" returns within a second or two, while make can take tens of seconds to resolve all the dependencies. This requires ninja to be installed. It is widely available, e.g. in the ninja-build package on Ubuntu. Ninja can be enabled by passing "-ninja" to buildall.sh or make_impala.sh. The same targets should work as with make. The default Ninja status output is fairly terse. It can be customised with an environment variable. E.g. I have export NINJA_STATUS="[%u to run/%r running/%f finished] " Also fixes a bug in make_impala.sh where invalid arguments were ignored. Change-Id: I2cea479615fe850c98d30110de043ecb6358dcda Reviewed-on: http://gerrit.cloudera.org:8080/2923 Tested-by: Internal Jenkins Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>	2016-05-12 14:17:53 -07:00
Misha Dmitriev	4f9e16055f	IMPALA-3384: Added support for building Impala Front End separately (and quickly) Change-Id: I486bb95757334f9df77c4a97150b2b34c5c0e2c4 Reviewed-on: http://gerrit.cloudera.org:8080/2875 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:17:45 -07:00
Lars Volker	a65ffda542	Add -release switch to buildall.sh help, change coverage options. This change documents the -release switch. It also removes the _debug and _release suffixes from -codecoverage_* and determines them from the presence of -release. On top of that it adds sanity checks to the specified options. Change-Id: Id69791264cb2d9e0ffe96a7ac5aabc34a553a7be Reviewed-on: http://gerrit.cloudera.org:8080/2043 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-04-12 14:03:44 -07:00
casey	52841302de	Remove make_test_tarball.sh As far as I know nothing actually uses the "test tarball". For some reason building it takes a minute or so on my computer. If it's not used, then it seems best to just get rid of it. Change-Id: I5c8b46f16a18eedfcc159b0e91b6a8b9357c51f2 Reviewed-on: http://gerrit.cloudera.org:8080/2685 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-04-01 01:26:44 +00:00
Alex Behm	7e76e92bef	Consolidate test and cluster logs under a single directory. All logs, test results and SQL files generated during data loading and testing are now consolidated under a single new directory $IMPALA_HOME/logs. The goal is to simplify archiving in Jenkins runs and debugging. The new structure is as follows: $IMPALA_HOME/logs/cluster - logs of Hadoop components and Impala $IMPALA_HOME/logs/data_loading - logs and SQL files produced in data loading $IMPALA_HOME/logs/fe_tests - logs and test output of Frontend unit tests $IMPALA_HOME/logs/be_tests - logs and test output of Backend unit tests $IMPALA_HOME/logs/ee_tests - logs and test output of end-to-end tests $IMPALA_HOME/logs/custom_cluster_tests - logs and test output of custom cluster tests I tested this change with a full data load which was successful. Change-Id: Ief1f58f3320ec39d31b3c6bc6ef87f58ff7dfdfa Reviewed-on: http://gerrit.cloudera.org:8080/2456 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-03-28 19:23:22 +00:00
Tim Armstrong	f13dfcbddc	Suppress maven info logging Maven's INFO log level is very verbose and includes a lot of progress information that is minimally useful. Maven doesn't have an option to output only ERROR and WARNING log messages. As a workaround, use grep to filter out the majority of the output (only warnings, errors, tests, and success/failure). Also add a header with relevant info about the maven command: targets and working directory. Change-Id: I828b870edc2fc80a6460e6ed594d507c46e69c82 Reviewed-on: http://gerrit.cloudera.org:8080/1752 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-01-15 19:38:46 +00:00
Tim Armstrong	c9cb00f4a1	IMPALA-2847: only recreate Sentry Policy DB when formatting cluster We should only need to recreate the Sentry Policy DB when formatting a cluster. Previously buildall.sh always tried to create the database regardless of whether it was needed. E.g. if a machine was just building Impala without running tests, there is no need to create any of the test databases. This fixes a regression when running buildall.sh on a machine without postgres set up. Change-Id: I35bb1cb275bb4da3f91f496010a7f6ee4daa2792 Reviewed-on: http://gerrit.cloudera.org:8080/1782 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2016-01-14 08:05:29 +00:00
Casey Ching	cfb1ab5c2c	IMPALA-2781: Fix shell error reporting after chdir The original error reporting relied on $0 being accessible from the current working dir, which failed if a script changed the working dir and $0 was relative. This updates the error reporting command to cd back to the original dir before accessing $0. Change-Id: I2185af66e35e29b41dbe1bb08de24200bacea8a1 Reviewed-on: http://gerrit.cloudera.org:8080/1666 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2016-01-14 07:10:54 +00:00
Casey Ching	e2bfb6ae2f	Misc improvements to shell scripts about error reporting Changes: 1) Consistently use "set -euo pipefail". 2) When an error happens, print the file and line. 3) Consolidated some of the kill scripts. 4) Added better error messages to the load data script. 5) Changed use of #!/bin/sh to bash. Change-Id: I14fef66c46c1b4461859382ba3fd0dee0fbcdce1 Reviewed-on: http://gerrit.cloudera.org:8080/1620 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2015-12-17 18:25:27 +00:00
Martin Grund	40d002a94f	Extracting CLEAN_ACTION from buildall into separate script. This script should be used before switching release branches or going from a toolchain branch to non-toolchain branch. Change-Id: I8fb958868286f9fe00f91b581f774d48fa75230e Reviewed-on: http://gerrit.cloudera.org:8080/1372 Tested-by: Internal Jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>	2015-11-12 23:28:16 +00:00
Martin Grund	a2b54f8334	Make CLEAN_ACTION clean up generated CMake files Before, even after running buildall.sh with clean, we would have left-over generated CMake files that interfere when switching compilers and libraries. This patch makes sure that these files are deleted when running buildall.sh. In addition, it calls `make clean` in the CLEAN_ACTION case to remove the compiled code before removing the CMakeFiles, since otherwise we might have left-overs that break subsequent compilations on different branches. Change-Id: If5ed04b3d3664f239dd76cd42ad66e4f0cd6dfe7 Reviewed-on: http://gerrit.cloudera.org:8080/1262 Tested-by: Internal Jenkins Reviewed-by: Dan Hecht <dhecht@cloudera.com>	2015-10-28 18:52:48 +00:00
Martin Grund	579be1c542	IMPALA-2284: Disallow long (1<<30) strings in group_concat() This is the first step to fix issues with large memory allocations. In this patch, the built-in `group_concat` is no longer allowed to allocate arbitrarily large strings and crash Impala, but is limited to the upper bound of possible allocations in Impala. This patch does not perform any functional change, but rather avoids unnecessary crashes. However, it changes the parameter type of FindChunk() in MemPool to be a signed 64bit integer. This change allows the mempool to allocate internally memory of more than 1GB, but the public interface of Allocate() is not changed, so the general limitation remains. The reason for this change is as follows: 1) In a UDF FunctionContext::Reallocate() would allocate slightly more than 512MB from the FreePool. 2) The free pool tries to double this size to allocate 1GB from the MemPool. 3) The MemPool doubles the size again and overflows the signed 32bit integer in the FindChunk() method. This will then only allocate 1GB instead of the expected 2GB. What happens is that one of the callers expected a larger allocation than actually happened, which will in turn lead to memory corruption as soon as the memory is accessed. Change-Id: I068835dfa0ac8f7538253d9fa5cfc3fb9d352f6a Reviewed-on: http://gerrit.cloudera.org:8080/858 Tested-by: Internal Jenkins Reviewed-by: Dan Hecht <dhecht@cloudera.com>	2015-09-23 15:15:55 -07:00
Martin Grund	5afd5bc8f6	Toolchain Cleanup and ASAN Improvements This patch provides the last fixes to finally enable the toolchain: - Remove static OpenSSL dependency - Fixing inline assembly problems in ASAN - Issues with non-relocatable LLVM 3.3 - adds manual system includes to fix issues with hardcoded header paths in clang. When the toolchain is enabled and we build for ASAN we use a specific toolchain file to build with LLVM-trunk as the main compiler. Even though this uses LLVM-trunk for compiling the Impala code, this will use LLVM 3.3 for codegen. In addition, this enables us to follow up with TSAN and LEAKSAN. Change-Id: I0abb914ca3f192cb7edd83ead134bc9e2d02071f Reviewed-on: http://gerrit.cloudera.org:8080/556 Tested-by: Internal Jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>	2015-08-21 20:14:31 +00:00
ishaan	b2d9d45977	Clean stale python object files and cached directories in buildall. We can potentially leave stale object files and directories in the build directory, causing the python imports to get confused; More importantly, this results in a stale build environment. This patch cleans cached object files and directories. Additionally, it makes buildall more robust by using pushd/popd instead of simply cd'ing into a directory. Change-Id: Ie8b20fc1844189d15d8c87ffcbd65e095cc4293e Reviewed-on: http://gerrit.cloudera.org:8080/482 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Internal Jenkins	2015-06-25 00:09:23 +00:00
Martin Grund	81f247b171	Optional Impala Toolchain This patch makes it possible to optionally enable the new Impala binary toolchain. For now there are no major version differences between the toolchain dependencies and what is currently kept in thirdparty. To enable the toolchain, export the variable IMPALA_TOOLCHAIN to the folder where the binaries are available. In addition this patch moves gutil from the thirdparty directory into the source tree of be/src to allow easy propagation of compiler and linker flags. Furthermore, the thrift-cpp target was added as a dependency to all targets that require the generated thrift sources to be available before the build is started. What is the new toolchain: The goal of the toolchain is to homogenize the build environment and to make sure that Impala is built nearly identically on every platform. To achieve this, we limit the flexibility of using the system's host libraries and rather rely on a set of custom produced binaries including the necessary compiler. Change-Id: If2dac920520e4a18be2a9a75b3184a5bd97a065b Reviewed-on: http://gerrit.cloudera.org:8080/427 Reviewed-by: Adar Dembo <adar@cloudera.com> Tested-by: Internal Jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>	2015-06-13 03:11:44 +00:00
Alex Behm	1bd3eca22f	Quietly resolve dependencies in Jenkins runs to avoid log spew. Change-Id: If38a683785f3c6c9d92f762a2dfd86f009ce9d84 Reviewed-on: http://gerrit.cloudera.org:8080/392 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2015-05-19 09:12:43 +00:00
ishaan	058978dccb	Enable using isilon as the underlying filesystem. This patch enables the Impala test suite to run the end to end tests against an isilon namenode. There are a few caveats: - The fe test will currently not work. - Only loading data from both the test-warehouse snapshot and the metadata snapshot is supported. - The test suite cannot be run by multiple people (unless we have access to multiple isilon namenodes) Change-Id: I786b4e4f51b99e79ad42abc676f537ebfc189237 Reviewed-on: http://gerrit.cloudera.org:8080/356 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Internal Jenkins	2015-05-12 01:28:19 +00:00
ishaan	b54d95bc1e	Fix the logic in buildall that deals with create-load-data command line parameters. The current logic only worked when: - Both the test-warehouse and metastore snapshot were present - Neither of them were present. All other conditions mapped to the second case. This patch fixes the problem by applying the correct bash test operators. Change-Id: Ie090aefe3be14a01b1aadc0a136e870582a6379c Reviewed-on: http://gerrit.cloudera.org:8080/235 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2015-03-16 18:33:54 -07:00
Dan Hecht	c8fb10f50a	S3: Some more work toward enabling additional S3 test coverage Add skip markers for S3 that can be used to categorize the tests that are skipped against S3 to help see what coverage is missing. Soon we'll be reworking some tests and/or adding new tests to get back the important gaps. Also, add a mechanism to parameterize paths in the .test files, and start using these new variables. This is a step toward enabling some more tests against S3. Finally, a fix for buildall.sh to stop the minicluster before applying the metastore snapshot. Otherwise, this fails since the ms db is in use. Change-Id: I142434ed67bed407e61d7b2c90f825734fc0dce0 Reviewed-on: http://gerrit.cloudera.org:8080/127 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2015-03-03 08:29:13 +00:00
Dan Hecht	60b3fd253d	Some S3 test fixes 1) Fix buildall.sh check for data to use the filesystem prefix. 2) Skip one of the cancellation test cases that tests INSERT. 3) Skip one of the explain test cases since it uses hdfs_client (hdfs web ui). Change-Id: Ice4e7517dec6e88b1561a0c2362653ab251f14ce Reviewed-on: http://gerrit.cloudera.org:8080/113 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2015-02-26 01:04:28 +00:00
Dan Hecht	bc6299730f	Fix buildall.sh S3 argument checking Don't give the error if $TESTDATA_ACTION is not 1, so that buildall.sh can still be used to build / run tests without specifying snapshots. The snapshots are needed only if loading data. Change-Id: Ica1ded42810d73160e0b30f9b2e5ee4ae308ec1d Reviewed-on: http://gerrit.cloudera.org:8080/79 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2015-02-23 21:39:51 +00:00
ishaan	2386fb84a8	Enable the data loading infrastructure to switch the underlying file system. This patch enables loading data to s3 instead of hdfs. It is preliminary in nature, as such, there are a few caveats: - The fe tests do not work. - Only loading from a test-warehouse snapshot and metastore snapshot is enabled. - Until hive works with s3, only a subset of all the tests will work. Change-Id: Ia66a5f836b4245e3b022a49de805eec337a51324 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5851 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2015-02-03 01:02:42 -08:00
Ippokratis Pandis	706d2a46cf	Adding an option to build release from buildall.sh Previously in order to build release from buildall.sh we had to declare an env variable (TARGET_BUILD_TYPE). This patch adds the option: ./buildall.sh -release Change-Id: Ib19702584fa291b161513bd37b1269e527176cfa Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5838 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins	2015-01-28 17:30:09 -08:00
ishaan	07efc0cb17	Add the ability to only reload the metastore snapshot in buildall and misc. changes. This commit adds the ability to only load the metastore snapshot, with the assumption that the hdfs data is already loaded. It also additionally adds the ability to specify some buildall parameters via the environment. Change-Id: I4a07d4cf3a63479c377d4be79c4a2140c2a52fb8 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5665 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2015-01-09 12:40:06 -08:00
ishaan	dee6911b20	Enable loading metadata from the hive metastore snapshot and cleanup build scripts. This patch contains the following changes: - Add a metastore_snapshot_file parameter to build.sh - Enable skipping loading the metadata. - create-load-data.sh is refactored into functions. - A lot of scripts source impala-config, which creates a lot of log spew. This has now been muted. - Unnecessary log spew from compute-table-stats has been muted. - build_thirdparty.sh determines its parallelism from the system, it was previously hard coded to 4 - Only force load data of the particular dataset if a schema change is detected. Change-Id: I909336451e5c1ca57d21f040eb94c0e831546837 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5540 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-12-19 13:41:00 -08:00
ishaan	09b97f3881	Add the ability to load a metastore snapshot file. This patch includes the following changes: - Modifies buildall to accept a hive metastore snapshot file as an argument. - Adds a script to load the hive metastore snapshot. Change-Id: I7b9fc5b0643afe62fd4739a81eaa3bf9af1630da Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5510 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-12-08 18:16:45 -08:00
Mike Yoder	3acee6b2b6	[CDH5] Removal of erroneous sanity checks The Kerberization work introduced sanity checks that ensured that HADOOP_LZO and IMPALA_LZO were set in the environment. However, the packaging builds don't have those set. This submittal removes those newly added checks. Change-Id: I08ae867e00e99e244221b32158724b06ee9fb901 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4194 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-09-05 16:32:18 -07:00
Mike Yoder	75a97d3d7e	[CDH5] Kerberize mini-cluster and Impala daemons This is the first iteration of a kerberized development environment. All the daemons start and use kerberos, with the sole exception of the hive metastore. This is sufficient to test impala authentication. When buildall.sh is run using '-kerberize', it will stop before loading data or attempting to run tests. Loading data into the cluster is known to not work at this time, the root causes being that Beeline -> HiveServer2 -> MapReduce throws errors, and Beeline -> HiveServer2 -> HBase has problems. These are left for later work. However, the impala daemons will happily authenticate using kerberos both from clients (like the impala shell) and amongst each other. This means that if you can get data into the mini-cluster, you could query it. Usage: * Supply a '-kerberize' option to buildall.sh, or * Supply a '-kerberize' option to create-test-configuration.sh, then 'run-all.sh -format', re-source impala-config.sh, and then start impala daemons as usual. You must reformat the cluster because kerberizing it will change all the ownership of all files in HDFS. Notable changes: * Added clean start/stop script for the llama-minikdc * Creation of Kerberized HDFS - namenode and datanodes * Kerberized HBase (and Zookeeper) * Kerberized Hive (minus the MetaStore) * Kerberized Impala * Loading of data very nearly working Still to go: * Kerberize the MetaStore * Get data loading working * Run all tests * The unknown unknowns * Extensive testing Change-Id: Iee3f56f6cc28303821fc6a3bf3ca7f5933632160 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4019 Reviewed-by: Michael Yoder <myoder@cloudera.com> Tested-by: jenkins	2014-09-05 12:36:21 -07:00
Skye Wanderman-Milne	bc8a6f7a30	buildall.sh -testdata shouldn't format cluster and metastore Otherwise it's impossible to use buildall to do an incremental data load Change-Id: I5f601e235a0bf0de4823266f6f4d54558d886d8a Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4123 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 42458a0cb78e9c0a06844ec5b5d236aebcc3b470) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4131	2014-09-04 11:36:25 -07:00
Lenni Kuff	a313a4b6b7	Update buildall to avoid killing Hadoop services when -noclean is specified Updates buildall to avoid killing Hadoop services when -noclean is specified. The exception is if someone specifies -noclean with -format*, in which case we need to kill everything to drop all connections to Postgres. Change-Id: I7e6ecb6c20165f0480456bc7fac1908780e76163 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4037 Reviewed-by: Michael Yoder <myoder@cloudera.com> Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit c6d4c2f1c406055499924841f958943dfe8e9dc7) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4107 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-08-29 00:45:36 -07:00
Lenni Kuff	a37003e64d	Allow passing -so or -build_shared_libs to buildall Default is to statically link, but specifying these flags will dynamically link the executables. Change-Id: Ic67a209b36285027e9b44e5fa491b197f443d84f Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3869 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins	2014-08-17 12:46:05 -07:00
Lenni Kuff	a20f03ff39	[CDH5] Re-enable JDBC tests by using Hive .12 JDBC driver Re-enables the FE JDBC tests by using the Hive .12 JDBC driver. We need to be careful that we don't mix Hive .12 and Hive .13 JARs outside of the test environment, so added the dependency at the "test" scope and updated the dependency plugin to include everything but "test" dependencies. We actually invoke the mvn dependency plugin as part of mvn package, so I also removed this call from buildall. Change-Id: I0e92aab2ddbbf067421efa844192cd42409155a0 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3845 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-08-13 22:29:59 -07:00
Lenni Kuff	f3ae861b0f	Add script to package tests + workload runner in a standalone tarball This adds a new make_test_tarball.sh script which will copy all the required dependencies to run the workload runner and impala Python tests outside of the Impala source tree. The goal is to make it very easy to run workloads on an arbitrary cluster without having to clone the impala source tree or build anything. As part of the make process, it will generate a simple set-env.sh script that configures the required environment variables, such as IMPALA_HOME and PYTHONPATH. It does not include any test data files, but contains the scripts required to recreate the table metadata. This tarball + a snapshot file should be sufficient to run most of the tests. Change-Id: If3bb12defa3c16a368a353075f8e784442464746 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3605 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit 4ff721236fb9e8bb6254d6ba46205a0ed147bf20) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3654	2014-08-01 00:01:04 -07:00
Henry Robinson	3e3c0991cc	Remove PGO and duplicated code from build scripts We had four scripts doing roughly the same thing - calling cmake and then calling make_impala.sh. This patch pushes the cmake call into make_impala.sh and then changes make_[asan|debug|release].sh to be trivial one-line calls into make_impala.sh. This patch also removes the PGO build, which is no longer used and no longer worked. Change-Id: Ib5c8ba910e52b030c172678f86db7d56e3f8c306 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3001 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3621	2014-07-27 02:14:42 -07:00
Matthew Jacobs	ebc6c5894e	External Data Source: Frontend and catalog changes Initial frontend and catalog changes for external data sources. Change-Id: Ia0e61ef97cfd7a4e138ef555c17f2e45bbf08c18 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2224 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit dfa14c828957f751db9c89bae0bdc040ce6f648c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2485	2014-05-08 14:56:19 -07:00

178 Commits