impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Tamas Mate	736b508e75	IMPALA-12263: Build with C++ Avro library when USE_AVRO_CPP is true This change updates the AVRO CMake module to use the C++ Avro library when USE_AVRO_CPP is set to true. This is the next step towards Avro backend update. Building with the C++ library fails at this point. Testing: - Manually tested configuring the project with USE_AVRO_CPP Change-Id: I0a81c3f7ab5a6651d507d8d9fac77ea17b8bb1a1 Reviewed-on: http://gerrit.cloudera.org:8080/20156 Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-07-05 12:21:39 +00:00
Michael Smith	f4c3a1e5a3	IMPALA-11459: Use new LLVM Pass Manager LLVM developed a new pass manager - https://llvm.org/docs/NewPassManager.html - to overcome some of the limitations of LegacyPassManager. It offers improved optimization performance by reusing analysis across all types and levels of optimization passes. It also appears to be better maintained in future releases of LLVM. Switches to using the new PassManager via PassBuilder and a ModulePassManager. Breaks out PruneModule into a separate FunctionPruneTime timer to more easily track any regressions there. Change-Id: I947a5b067da50c18f62c3f9af9876463e542f58a Reviewed-on: http://gerrit.cloudera.org:8080/20014 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com>	2023-06-23 14:43:50 +00:00
Michael Smith	683bef1ca4	IMPALA-11253: Support testing with Java 11 (take 2) Adds new environment variable IMPALA_JDK_VERSION which can be 'system', '8', or '11'. The default is 'system', which uses the same logic as before. If set to 8 or 11, it will ignore the system java and search for java of that specific version (based on specific directories for Ubuntu and Redhat). This is used by bin/bootstrap_system.sh to determine whether to install java 8 or java 11 (other versions can come later). If IMPALA_JDK_VERSION=11, then bin/start-impala-cluster.py adds the opens needed to deal with the ehcache issue. This no longer puts JAVA_HOME in bin/impala-config-local.sh as part of bootstrap_system.sh. Instead, it provides a new environment variable IMPALA_JAVA_HOME_OVERRIDE, which will be preferred over IMPALA_JDK_VERSION. This also updates the versions of Maven plugins related to the build. Source and target releases are still set to Java 8 compatibility. Adds a verifier to the end of run-all-tests that InaccessibleObjectException is not present in impalad logs. Tested with JDBC_TEST=false EE_TEST=false FE_TEST=false BE_TEST=false \ CLUSTER_TEST_FILES=custom_cluster/test_local_catalog.py \ run-all-tests.sh Testing: ran test suite with Java 11 This reverts the revert commit `1b6011c`, restoring these changes minus code to update IMPALA_JDK_VERSION based on $JAVA -version as that could break subsequent sourcing of impala-config.sh. Change-Id: Ie16504ad5738b1f228f97044afd3d9017ccc6c53 Reviewed-on: http://gerrit.cloudera.org:8080/19928 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-05-25 16:04:29 +00:00
Michael Smith	1b6011c6a0	Revert "IMPALA-11253: Support testing with Java 11" This reverts commit `ee6395db76` as it is not flexible enough at detecting Java automatically in likely build environments. Change-Id: I836c9f7fd10740b15f7e40b2e7f889ac7ee61fc3 Reviewed-on: http://gerrit.cloudera.org:8080/19908 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com>	2023-05-21 14:00:14 +00:00
Michael Smith	ee6395db76	IMPALA-11253: Support testing with Java 11 Adds new environment variable IMPALA_JDK_VERSION which can be 'system', '8', or '11'. The default is 'system', which uses the same logic as before. If set to 8 or 11, it will ignore the system java and search for java of that specific version (based on specific directories for Ubuntu and Redhat). This is used by bin/bootstrap_system.sh to determine whether to install java 8 or java 11 (other versions can come later). If IMPALA_JDK_VERSION=11, then bin/start-impala-cluster.py adds the opens needed to deal with the ehcache issue. This no longer puts JAVA_HOME in bin/impala-config-local.sh as part of bootstrap_system.sh. Instead, it provides a new environment variable IMPALA_JAVA_HOME_OVERRIDE, which will be preferred over IMPALA_JDK_VERSION. This also updates the versions of Maven plugins related to the build. Source and target releases are still set to Java 8 compatibility. Adds a verifier to the end of run-all-tests that InaccessibleObjectException is not present in impalad logs. Tested with JDBC_TEST=false EE_TEST=false FE_TEST=false BE_TEST=false \ CLUSTER_TEST_FILES=custom_cluster/test_local_catalog.py \ run-all-tests.sh Testing: ran test suite with Java 11 Change-Id: I15d309e2092c12d7fdd2c99b727f3a8eed8bc07a Reviewed-on: http://gerrit.cloudera.org:8080/19539 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com> Reviewed-by: Quanlong Huang <huangquanlong@gmail.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-05-19 22:32:00 +00:00
Joe McDonnell	ba4cb95b62	IMPALA-11257: Fix CMake warnings for module names and cmake_minimum_required This fixes a few different CMake warnings: 1. This removes cmake_minimum_required invocations except for the top-most CMakeLists.txt. This eliminates warnings like this: Compatibility with CMake < 2.8.12 will be removed from a future version of CMake. Update the VERSION argument <min> value or use a ...<max> suffix to tell CMake that the project does not need compatibility with older versions. Moving to a later version also required setting CMAKE_ENABLE_EXPORTS to continue exporting symbols. 2. This modifies the module names so that they match the corresponding module names from Find*.cmake. This is mostly dealing with case differences. This addresses warnings like: The package name passed to `find_package_handle_standard_args` (PROTOBUF) does not match the name of the calling package (Protobuf). This can lead to problems in calling code that expects `find_package` result variables (e.g., `_FOUND`) to follow a certain pattern. This fixed the detection logic for KerberosPrograms, and so it required adding more Kerberos packages to bin/bootstrap_build.sh. 3. This adds a missing .cc suffix. This addresses the following warning: CMake Warning (dev) at be/src/util/CMakeLists.txt:141 (add_library): Policy CMP0115 is not set: Source file extensions must be explicit. Run "cmake --help-policy CMP0115" for policy details. Use the cmake_policy command to set the policy and suppress this warning. These fixes mostly match how these warnings were handled in Apache Kudu. Testing: - Ran GVO Change-Id: I2a97dd07cdd0831e90882a2035415ac71d670147 Reviewed-on: http://gerrit.cloudera.org:8080/18444 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-08-11 05:48:36 +00:00
Riza Suminto	06b1db4675	IMPALA-11369: Separate thrift compiler for different component Impala used to have one thrift compiler version to compile C++, Java, and Python code. Most Thrift serialization/deserialization between minor versions is compatible. So it is possible to have different thrift compiler versions for different target codes. It is beneficial to do so because it will allow Impala to upgrade separate components independently. This patch implements the infrastructure change required to do so. It replaces most of the 'THRIFT_' environment variables and CMake variables with 'THRIFT_CPP_', 'THRIFT_JAVA_', and 'THRIFT_PY_' to compile C++, Java, and Python code respectively. All three still refer to the same thrift version (thrift-0.11.0-p5). Testing: - Build Impala and pass core tests. Change-Id: I56479dc69b79024d1a4d09211bbe88a61fa0c6a4 Reviewed-on: http://gerrit.cloudera.org:8080/18636 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-06-21 02:40:59 +00:00
Joe McDonnell	7b490eed5b	IMPALA-10951 (preparation): Update Kudu to a more recent version As part of moving to a newer protobuf, this updates the Kudu version to get the fix for KUDU-3334. With this newer Kudu version, Clang builds hit an error while linking: lib/libLLVMCodeGen.a(TargetPassConfig.cpp.o):TargetPassConfig.cpp: function llvm::TargetPassConfig::createRegAllocPass(bool): error: relocation refers to global symbol "std::call_once<void (&)()>(std::once_flag&, void (&)())::{lambda()#2}::_FUN()", which is defined in a discarded section section group signature: "_ZZSt9call_onceIRFvvEJEEvRSt9once_flagOT_DpOT0_ENKUlvE0_clEv" prevailing definition is from ../../build/debug/security/libsecurity.a(openssl_util.cc.o) (This is from a newer binutils that will be pursued separately.) As a hack to get around this error, this adds the calloncehack shared library. The shared library publicly defines the symbol that was coming from kudu_client. By linking it ahead of kudu_client, the linker uses that rather than the one from kudu_client. This fixes the Clang builds. The new Kudu also requires a minor change to the flags for tserver startup. Testing: - Ran debug tests and verified calloncehack is not used - Ran ASAN tests Change-Id: Ieccbe284f11445e1de792352ebc7c9e1fa2ca0c3 Reviewed-on: http://gerrit.cloudera.org:8080/18129 Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-01-07 01:44:58 +00:00
Zoltan Borok-Nagy	b45cd1bf02	IMPALA-10933: Impala build finds system libcurl instead of toolchain version This patch modifies FindCurl.cmake to ignore the system version of libcurl. Without this patch the build might find a wrong version of libcurl which causes errors during link time. Change-Id: I3c2d315e9bc06b9b926a492fa8d3729baddc2c82 Reviewed-on: http://gerrit.cloudera.org:8080/17876 Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-09-28 22:53:59 +00:00
wzhou-code	03a7a59f5d	IMPALA-10876: Support to download JWKS from given URL This patch added functionality to download JWKS from a given URL and support key rotation by periodically checking the JWKS URL for updates. We use Kudu's EasyCurl wrapper to download file from the given URL. curl was added to native-toolchain. This patch modified makefiles and bootstrap_toolchain.py to integrate libcurl and libkudu_curl_util. Added end-end JWT authentication test cases with JWKS specified as HTTP/HTTPS URL. Testing: - Passed core run, including new test cases. Change-Id: Ic6ac8cf0010c13db30219776d1d275709bf211df Reviewed-on: http://gerrit.cloudera.org:8080/17802 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-09-28 04:45:23 +00:00
wzhou-code	025500ccb5	IMPALA-10489: Implement JWT support This patch added JWT support with following functionality: * Load and parse JWKS from pre-installed JSON file. * Read the JWT token from the HTTP Header. * Verify the JWT's signature with public key in JWKS. * Get the username out of the payload of JWT token. * Support following JSON Web Algorithms (JWA): HS256, HS384, HS512, RS256, RS384, RS512. We use third party library jwt-cpp to verify JWT token. jwt-cpp is a headers only C++ library. It was added to native-toolchain. This patch modified bootstrap_toolchain.py to download jwt-cpp from toolchain s3 bucket, and modified makefiles to add jwt-cpp/include in the include path. Added BE unit-tests for loading JWKS file and verifying JWT token. Also added FE custom cluster test for JWT authentication. Testing: - Passed core run. Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Reviewed-on: http://gerrit.cloudera.org:8080/17435 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-07-08 23:10:32 +00:00
Joe McDonnell	56ee90c598	IMPALA-9760: Add IMPALA_TOOLCHAIN_PACKAGES_HOME to prepare for GCC7 The locations for native-toolchain packages in IMPALA_TOOLCHAIN currently do not include the compiler version. This means that the toolchain can't distinguish between native-toolchain packages built with gcc 4.9.2 versus gcc 7.5.0. The collisions can cause issues when switching back and forth between branches. This introduces the IMPALA_TOOLCHAIN_PACKAGES_HOME environment variable, which is a location inside IMPALA_TOOLCHAIN that would hold native-toolchain packages. Currently, it is set to the same as IMPALA_TOOLCHAIN, so there is no difference in behavior. This lays the groundwork to add the compiler version to this path when switching to GCC7. Testing: - The only impediment to building with IMPALA_TOOLCHAIN_PACKAGES_HOME=$IMPALA_TOOLCHAIN/test is Impala-lzo. With a custom Impala-lzo, compilation succeeds. Either Impala-lzo will be fixed or it will be removed. - Core tests Change-Id: I1ff641e503b2161baf415355452f86b6c8bfb15b Reviewed-on: http://gerrit.cloudera.org:8080/15991 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-05-30 16:25:37 +00:00
Thomas Tauber-Marshall	19a4d8fe79	IMPALA-9335 (part 2): Fix rebased KRPC to compile This patch applies various fixes to Impala and to the copied Kudu source code in be/src/kudu/* to allow everything to compile. Some highlights of the changes made: - Various Kudu files were removed from compilation due to issues like relying on libraries that Impala does not provide. The linking of some executable is also changed for similar reasons. - The Kudu Cache implementation changed to support unique_ptr, allowing us to remove various uses of MakeScopeExitTrigger. - Some flags that have a DEFINE in both Kudu and Impala are modified to change one of the DEFINEs to a DECLARE. This patch was in part based on the patches that were applied the last time we rebased the Kudu code in IMPALA-7006, and I ensured that all changes from those commits that are still relevant were included here. I also went through all commits that have been applied to the be/src/kudu directory since the last rebase and ensured that all relevant changes from those are included here. Testing: - Passed an exhaustive DEBUG build and a core ASAN build. Change-Id: I1eb4caf927c729109426fb50a28b5e15d6ac46cb Reviewed-on: http://gerrit.cloudera.org:8080/15144 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>	2020-02-04 23:03:58 +00:00
Abhishek	51e8175c62	IMPALA-8450: Add support for zstd in parquet Makefile was updated to include zstd in the ${IMPALA_HOME}/toolchain directory. Other changes were made to make zstd headers and libs accessible. Class ZstandardCompressor/ZstandardDecompressor was added to provide interfaces for calling ZSTD_compress/ZSTD_decompress functions. Zstd supports different compression levels (clevel) from 1 to ZSTD_maxCLevel(). Zstd also supports negative clevels, but since negative values represent uncompressed data they won't be supported. The default clevel is ZSTD_CLEVEL_DEFAULT. HdfsParquetTableWriter was updated to support ZSTD codec. The new codecs can be set using existing query option as follows: set COMPRESSION_CODEC=ZSTD:<clevel>; set COMPRESSION_CODEC=ZSTD; // uses ZSTD_CLEVEL_DEFAULT Testing: - Added unit test in DecompressorTest class with ZSTD_CLEVEL_DEFAULT clevel and a random clevel. The unit test decompresses compressed input data and validates the result. It also tests for expected behavior when passing an over/under sized buffer for decompressing. - Added unit tests for valid/invalid values for COMPRESSION_CODEC. - Added e2e test in test_insert_parquet.py which tests writing/reading (null/non-null) data into/from a table (with columns of different data types) using multiple codecs. Other existing e2e tests were updated to also use parquet/zstd table format. - Manual interoperability tests were run between Impala and Hive. Change-Id: Id2c0e26e6f7fb2dc4024309d733983ba5197beb7 Reviewed-on: http://gerrit.cloudera.org:8080/13507 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-06-05 11:15:04 +00:00
Joe McDonnell	2c45ab0933	Remove references to the $IMPALA_HOME/thirdparty directory The $IMPALA_HOME/thirdparty directory is a remnant from before Impala was an Apache project. It is obsolete and unused, so this removes code that references this directory. Testing: - Ran core tests Change-Id: I2edfd499febb5a25fdcf59b5183eccf192a08be0 Reviewed-on: http://gerrit.cloudera.org:8080/13092 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-04-25 04:24:23 +00:00
Philip Zeyliger	1772b3bbb5	Allow CMake to find JNI libraries from JDK JNI libraries can be in JAVA_HOME/jre/lib/amd64 or JAVA_HOME/lib/amd64. We were missing one entry in the list of places to look. This came up when I built a custom OpenJDK for myself and wanted to use it for building. Change-Id: I6e9f9e5b96e2a1c3c0b0ad6cae1a34ca22c1ec19 Reviewed-on: http://gerrit.cloudera.org:8080/12580 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-02-26 04:25:59 +00:00
Lars Volker	837d386886	Bump toolchain version, include libunwind Change-Id: I0b26f6a342dd7ba282c3f6c4de93745aff2dd095 Reviewed-on: http://gerrit.cloudera.org:8080/10755 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-07-06 22:06:03 +00:00
Attila Jeges	17749dbcfc	IMPALA-3307: Add support for IANA time-zone db Impala currently uses two different libraries for timestamp manipulations: boost and glibc. Issues with boost: - Time-zone database is currently hard coded in timezone_db.cc. Impala admins cannot update it without upgrading Impala. - Time-zone database is flat, therefore can’t track year-to-year changes. - Time-zone database is not updated on a regular basis. Issues with glibc: - Uses /usr/share/zoneinfo/ database which could be out of sync on some of the nodes in the Impala cluster. - Uses the host system’s local time-zone. Different nodes in the Impala cluster might use a different local time-zone. - Conversion functions take a global lock, which causes severe performance degradation. In addition to the issues above, the fact that /usr/share/zoneinfo/ and the hard-coded boost time-zone database are both in use is a source of inconsistency in itself. This patch makes the following changes: - Instead of boost and glibc, impalad uses Google's CCTZ to implement time-zone conversions. - Introduces a new startup flag (--hdfs_zone_info_zip) to impalad to specify an HDFS/S3/ADLS path to a zip archive that contains the shared compiled IANA time-zone database. If the startup flag is set, impalad will use the specified time-zone database. Otherwise, impalad will use the default /usr/share/zoneinfo time-zone database. - Introduces a new startup flag (--hdfs_zone_alias_conf) to impalad to specify an HDFS/S3/ADLS path to a shared config file that contains definitions for non-standard time-zone aliases. - impalad reads the entire time-zone database into an in-memory map on startup for fast lookups. - The name of the coordinator node’s local time-zone is saved to the query context when preparing query execution. This time-zone is used whenever the current time-zone is referred to afterwards in an execution node. - Adds a new ZipUtil class to extract files from a zip archive. The implementation is not vulnerable to Zip Slip. Cherry-picks: not for 2.x. Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Reviewed-on: http://gerrit.cloudera.org:8080/9986 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Attila Jeges <attilaj@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-06-22 13:18:58 +00:00
stiga-huang	818cd8fa27	IMPALA-5717: Support for reading ORC data files This patch integrates the orc library into Impala and implements HdfsOrcScanner as a middle layer between them. The HdfsOrcScanner supplies input needed from the orc-reader, tracks memory consumption of the reader and transfers the reader's output (orc::ColumnVectorBatch) into impala::RowBatch. The ORC version we used is release-1.4.3. A startup option --enable_orc_scanner is added for this feature. It's set to true by default. Setting it to false will fail queries on ORC tables. Currently, we only support reading primitive types. Writing to ORC tables is not supported yet either. Tests - Most of the end-to-end tests can run on ORC format. - Add tpcds, tpch tests for ORC. - Add some ORC specific tests. - Haven't enabled test_scanner_fuzz for ORC yet, since the ORC library is not robust for corrupt files (ORC-315). Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4 Reviewed-on: http://gerrit.cloudera.org:8080/9134 Reviewed-by: Quanlong Huang <huangquanlong@gmail.com> Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-04-11 05:13:02 +00:00
Michael Ho	b4ea57a7e3	IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multi-plex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also avoids the possibility that a thread is stuck in the RPC code for an extended amount of time without checking for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, the number of service threads equals the number of logical cores. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC to block for extended period of time when the receiver is not yet created when the call arrives. In KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to be exceeded, the request will be stashed in a queue for deferred processing. The stashed RPC requests will not be responded to until they are processed so as to exert back pressure to the senders. An alternative would be to reply with an error and the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING ------- * Builds {exhaustive/debug, core/release, asan} passed with FLAGS_use_krpc=true. TO DO ----- * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Reviewed-on: http://gerrit.cloudera.org:8080/8023 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins	2017-11-09 20:05:08 +00:00
Sailesh Mukil	4592ed445e	IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork Impala currently kinits by forking off a child process. This has proved to be expensive in many cases since the subprocess tries to reserve as much memory as Impala is currently using which can be quite a lot. This patch adds a flag called 'use_kudu_kinit' that defaults to true. When it's true, it uses the Kudu security library's kinit code that programmatically uses the krb5 library to kinit. When it's false, we run our current path which kicks off the kinit-thread and forks off a kinit process periodically to reacquire tickets based on FLAGS_kerberos_reinit_interval. Converted existing tests in thrift-server-test to run with and without kerberos. We now run this BE test with kerberos by using Kudu's MiniKdc utility. This introduces a new dependency on some kerberos binaries that are checked through FindKerberosPrograms.cmake. Note that this is only a test dependency and not a dependency for the impalad binaries and friends. Compilation will still succeed if the kerberos binaries for the MiniKdc are not found, however, the thrift-server-test will fail. We run with and without the 'use_kudu_kinit' flag. TODO: Since the setting up and tearing down of our security code isn't idempotent, we can run only one test in a process with Kerberos now (IMPALA-6085). Updated bin/bootstrap_system.sh to install new sasl-gssapi modules and the kerberos binaries required for the MiniKdc. Also fixed a bug that didn't transfer the environment into 'sudo' in bin/bootstrap_system.sh. Testing: Verified with thrift-server-test and also manually on a live kerberized cluster. Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Reviewed-on: http://gerrit.cloudera.org:8080/7938 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Impala Public Jenkins	2017-10-27 00:19:44 +00:00
Michael Ho	dd4c6be8e0	IMPALA-4670: Introduces RpcMgr class This patch introduces a new class, RpcMgr which is the abstraction layer around KRPC core mechanics. It provides an interface RegisterService() for various services to register themselves. Kudu RPC is invoked via an auto-generated interface called proxy. This change implements an inline wrapper for KRPC client to obtain a proxy for a particular service exported by remote server. Last but not least, the RpcMgr will start all registered services if FLAGS_use_krpc is true. This patch hasn't yet added any service except for some test services in rpc-mgr-test. This patch is based on an abandoned patch by Henry Robinson. Testing done: a new backend test is added to exercise the code and demonstrate the way to interact with KRPC framework. Change-Id: I8adb10ae375d7bf945394c38a520f12d29cf7b46 Reviewed-on: http://gerrit.cloudera.org:8080/7901 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins	2017-10-06 07:09:55 +00:00
Henry Robinson	f20b1626b8	IMPALA-5846: Fix output path for kudu libraries Prior to this patch, libraries and executables built using ADD_EXPORTABLE_LIBRARY (i.e. those built from be/src/kudu) were placed in their source directory - not in be/build/<etc>. The problem appears to be related to how LIBRARY_OUTPUT_PATH was set by ADD_EXPORTABLE_LIBRARY. I confess I don't completely understand the bug, but this more idiomatic (and clear, IMHO) way of setting the output dirs has the expected behaviour. Change-Id: I73f3dd5435bceb35bc929ff6d5f2c92300e2a1d2 Reviewed-on: http://gerrit.cloudera.org:8080/7818 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2017-08-26 01:44:26 +00:00
Henry Robinson	1135261980	IMPALA-4669: [KRPC] Add kudu_rpc library to build Import FindKRPC.cmake from Apache Kudu. Add some files to protoc-gen-krpc link to allow it to find symbols now defined within Impala (without linking all of Impala's libraries). Change-Id: I33203e95dff07c87a6ec5c7a31b7a583b91849bc Reviewed-on: http://gerrit.cloudera.org:8080/5719 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Impala Public Jenkins	2017-08-25 22:51:41 +00:00
Henry Robinson	f51c4435c9	IMPALA-4669: [SECURITY] Add security library to build * Minor compilation fix * Add krb5 as a non-toolchain dependency * Handle legacy versions of libkrb5.so by providing implementation of krb5_is_config_principal(). * Link against openssl from the toolchain if 1.0.0 or higher not found on build machine. * Update LICENSE.txt and NOTICE.txt re: OpenSSL code in x509_check_host.{h,c}. Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14 Reviewed-on: http://gerrit.cloudera.org:8080/5717 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2017-08-15 00:47:26 +00:00
Henry Robinson	84b8155cc3	IMPALA-4669: [SECURITY] Import Kudu security library from kudu@314c9d8 The security library provides Kerberos and TLS facilities to the rpc library. Change-Id: I76daeead00f672aa468f5ab6de4d70eac2078cb2 Reviewed-on: http://gerrit.cloudera.org:8080/5716 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2017-08-15 00:45:44 +00:00
Henry Robinson	d79e01ef9f	IMPALA-5659: Begin standardizing treatment of thirdparty libraries If Impala was built with --build_shared_libs, some thirdparty libraries were still statically linked; this could cause runtime errors if the libraries were also linked into a .so. This patch fixes that issue (for gflags, glog and protobuf at least) by ensuring that build_shared_libs is respected for those libraries. * Standardize thirdparty library handling w/CMake by adding IMPALA_ADD_THIRDPARTY_LIB. This creates a symbolic name for each library, allowing us to switch the underlying library files (e.g. change from static to dynamic linking) without having to individually change the link clauses for each target. * Remove most cases of add_library() from cmake_modules/* - that is all handled by IMPALA_ADD_THIRDPARTY_LIB. * Add shared library detection for a couple of thirdparty dependencies (many only detect static libraries), just to prove the concept. * All thirdparty libraries now print a standard set of messages. For example: -- ----------> Adding thirdparty library protoc. <---------- -- Header files: /data/henry/src/cloudera/impala-toolchain/protobuf-2.6.1/include -- Added shared library dependency protoc: /data/henry/src/cloudera/impala-toolchain/protobuf-2.6.1/lib/libprotoc.so -- ----------> Adding thirdparty library libev. <---------- -- Header files: /data/henry/src/cloudera/impala-toolchain/libev-4.20/include -- Added shared library dependency libev: /data/henry/src/cloudera/impala-toolchain/libev-4.20/lib/libev.so * Some libraries don't quite fit this pattern (LLVM and Boost) - leave them as is for now. * Remove FindOpenSSL.cmake - the toolchain one is more modern. Change-Id: Ib7a6bc5610aaf2450f91348d94cfb984c6a4b78d Reviewed-on: http://gerrit.cloudera.org:8080/7418 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>	2017-07-19 02:44:18 +00:00
Henry Robinson	23100102c0	IMPALA-4758: (2/2) Impala-side changes to build with latest gutil Meant to be taken as a whole with the previous commit. This patch makes the necessary code changes to Impala and the gutil/ library to fix all compilation errors. Future upgrades to gutil/ should redo the work in this commit. * Remove kudu/ include prefix with command: git grep -l "include \"kudu/" \| xargs sed -i 's/include \"kudu\//include \"/g' * Change KUDU_GUTIL_* guards to be GUTIL_* git grep -l KUDU_GUTIL \| xargs sed -i 's/KUDU_GUTIL/GUTIL/g' * Replace glog/logging.h with common/logging.h git grep -l "glog/logging" \| xargs sed -i 's/glog\/logging/common\/logging/g' * Provide our own implementation of since-removed MonotonicNanos() * Reinstate COMPILE_FLAGS argument to ADD_EXPORTABLE_LIBRARY, used by gutil. * Replay overwritten parts of following commits: `a7c3f30` - Remove AMD Opteron Rev E workaround from atomicops `54194af` - IMPALA-4631: don't use floating point operations for time unit conversions `152c586` - Improve AtomicInt abstraction and implementation * Comment out non-compiling deprecated function definitions in numbers.h * Overwrite changes from 92fafa "Use more efficient gutil implementation of Log2Ceiling" in favour of implementing them in Impala code only. * Couple of misc fixes. Change-Id: I4ac21d7d6401f21fcdfdd1132b8f322bfba4bb80 Reviewed-on: http://gerrit.cloudera.org:8080/5688 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Impala Public Jenkins	2017-03-29 02:52:34 +00:00
Henry Robinson	5a333c47c5	Fix typo in Flatbuffers cmake module Change-Id: I0786344b5485a92c02a246b543b6acda279e199c Reviewed-on: http://gerrit.cloudera.org:8080/6398 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: Impala Public Jenkins	2017-03-15 03:57:40 +00:00
Dimitris Tsirogiannis	60c1c6e81b	IMPALA-4966: Add flatbuffers to build FlatBuffers version 1.6.0 is already included in the toolchain. This commit adds it to the build system. Change-Id: I2ca255ddf08ac846b454bfa1470ed67b1338d2b0 Reviewed-on: http://gerrit.cloudera.org:8080/6180 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: Impala Public Jenkins	2017-03-02 09:43:03 +00:00
Henry Robinson	60c41c4f0f	IMPALA-4652: Add crcutil to build Add crcutil, built from a git hash since there are no released versions, to Impala's build. crcutil is available at https://github.com/rurban/crcutil FindCrcutil.cmake was taken from Apache Kudu. Change-Id: I095d1c6b8e9e8f40cf62c1ecfdc880e708a72c28 Reviewed-on: http://gerrit.cloudera.org:8080/5660 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2017-01-12 23:50:14 +00:00
Henry Robinson	a81ad5eaab	IMPALA-4651: Add LibEv to build Add libev 4.20 to the Impala build. This is a dependency of KRPC. FindLibEv.cmake was taken from Apache Kudu. Change-Id: Iaf0646533592e6a8cd929a8cb015b83a7ea3008f Reviewed-on: http://gerrit.cloudera.org:8080/5659 Tested-by: Impala Public Jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2017-01-12 23:44:26 +00:00
Henry Robinson	ed0aa66ee1	IMPALA-4650: Allow protobuf to find non-system libraries and binaries This change makes PROTOBUF_GENERATE_CPP able to pick up Protobuf libraries and binaries that are found by CMake but not installed on the system LD_LIBRARY_PATH. Change-Id: I942b3f18e25e2abc5aac167412b65abb680d3c5a Reviewed-on: http://gerrit.cloudera.org:8080/5658 Tested-by: Impala Public Jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2017-01-12 05:18:33 +00:00
Henry Robinson	4b3fdc3301	IMPALA-4650: Add Protobuf to build This patch adds Protobuf 2.6.1 to Impala's build, and bumps the toolchain version so that the dependency is available. Protobuf is unused in this commit, but is required for KRPC. FindProtobuf.cmake includes some utility CMake methods to generate source code from Protobuf definitions. It is taken from Kudu. Change-Id: Ic9357fe0f201cbf7df1ba19fe4773dfb6c10b4ef Reviewed-on: http://gerrit.cloudera.org:8080/5657 Tested-by: Impala Public Jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2017-01-12 05:18:17 +00:00
Henry Robinson	44bb99a61d	Add Kudu cmake utilities This commit imports some CMake utility methods from Kudu, in preparation for adding KRPC and its dependencies to Impala's build. The methods are unused in this patch, but will be used both by thirdparty dependencies (e.g. Protobuf) and by the Kudu libraries themselves. Some methods are stubbed out to make it easier to import Kudu's CMakeLists.txt files without adding extra test targets etc. to Impala's build. Change-Id: Ibaae645d650ab1555452e4cc2574d6c84a90d941 Reviewed-on: http://gerrit.cloudera.org:8080/5656 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Impala Public Jenkins	2017-01-12 02:53:45 +00:00
Jim Apple	14891fe004	IMPALA-3676: Use clang as a static analysis tool This patch adds a script to run clang-tidy over the whole code base. It is a first step towards running clang-tidy over patches as a tool to help users spot bugs before code review. Because of the number of clang-tidy checks, this patch only addresses some of them. In particular, only checks starting with 'clang' are considered. Many of them which are flaky or not part of our style are excluded from the analysis. This patch also excludes some checks which are part of our current style but which would be too laborious to fix over the entire codebase, like using nullptr rather than NULL. This patch also fixes a number of small bugs found by clang-tidy. Finally, this patch adds the class AlignedNew, the purpose of which is to provide correct alignment on heap-allocated data. The global new operator only guarantees 16-byte alignment. A class that includes a member variable that must be aligned on a k-byte boundary for k>16 can inherit from AlignedNew<k> to ensure correct alignment on the heap, quieting clang's -Wover-aligned warning. (Static and stack allocation are required by the standard to respect the alignment of the type and its member variables, so no extra code is needed for allocation in those places.) Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06 Reviewed-on: http://gerrit.cloudera.org:8080/4758 Reviewed-by: Jim Apple <jbapple@cloudera.com> Tested-by: Internal Jenkins	2016-11-04 00:13:12 +00:00
Tim Armstrong	df680cfe3a	IMPALA-4277: allow overriding of Hive/Hadoop versions/locations This is to help with IMPALA-4277 to make it easier to build against Hadoop/Hive distributions where the directory layout doesn't exactly match our current CDH dependencies, or where we may want to temporarily override a version without making a source change. Change-Id: I7da10e38f9c4309f2d193dc25f14a6ea308c9639 Reviewed-on: http://gerrit.cloudera.org:8080/4720 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Internal Jenkins	2016-10-18 05:54:09 +00:00
Jim Apple	bd2947329e	IMPALA-4110: Clean up issues found by Apache RAT. Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 Reviewed-on: http://gerrit.cloudera.org:8080/4361 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-09-14 22:09:24 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt cat modified_files.txt \| xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;' cat modified_files_txt \| xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Michael Ho	86ff18eee9	IMPALA-3223: Removal of non-toolchain builds. This change removes the option to build without specifying the environment variable $IMPALA_TOOLCHAIN. By default, if it's not set, sourcing impala-config.sh will set it to $IMPALA_HOME/toolchain. A user can override it by setting $IMPALA_TOOLCHAIN to his/her own toolchain directory. The user can also set $SKIP_TOOLCHAIN_BOOTSTRAP to true to avoid running the toolchain bootstrapping script (e.g. a particular component in toolchain is at a version not checked into S3). $IMPALA_TOOLCHAIN holds some third party binaries which Impala relies on. They can be compiled from source in the native toolchain which is public. This commit also removes build_thirdparty.sh as it's no longer used. By default, Impala will be built with the compiler in $IMPALA_TOOLCHAIN but this option can be overridden by setting environment variable $USE_SYSTEM_GCC to 1. Change-Id: I42b60e99fb9caf1294be7ab242856ca3b9a5ab73 Reviewed-on: http://gerrit.cloudera.org:8080/3259 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com>	2016-06-07 17:29:59 -07:00
Michael Ho	0b7ae6e4eb	IMPALA-3223: Relocate squeasel and mustache directories This change moves the source and header files of squeasel and mustache to be/src/thirdparty. This is a step towards removing thirdparty as a preparation to move to ASF. There is also corresponding change to Impala-lzo to update its include path. Change-Id: I782e493bc28086a1587274b3c474ea6b6f201855 Reviewed-on: http://gerrit.cloudera.org:8080/3206 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com>	2016-05-31 23:31:41 -07:00
Michael Ho	9a5e701209	IMPALA-3223: Remove boost multiprecision in thirdparty. Boost library header is already included in the toolchain. Also removes the environment variable IMPALA_MIN_BOOST_VERSION and standardizes on the boost library version in toolchain. Change-Id: I297edac7053964bfa113e0d5bf411fa3934b3796 Reviewed-on: http://gerrit.cloudera.org:8080/3159 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:19 -07:00
Tim Armstrong	2b61ae7f2a	IMPALA-3534: allow overriding of CMAKE_CXX_COMPILER for ASAN This makes it consistent with the regular toolchain and makes it easier to use wrapper scripts like distcc. Change-Id: I3ab488182c46f9ccb1850a0a2b064653e7e3da26 Reviewed-on: http://gerrit.cloudera.org:8080/3050 Reviewed-by: Jim Apple <jbapple@cloudera.com> Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 23:06:36 -07:00
Tim Armstrong	1c704f3cfd	IMPALA-3166: basic perf support and asm dumps for codegened code Adds support for communicating function-level symbols to perf by writing /tmp/perf-<pid>.data if the --perf_map=true argument is set. Perf must be run under the same user as Impala. I.e. 'sudo perf top' does not work. To get perf to work under a non-root user you will probably need to disable some kernel security features that perf complains about: sudo bash -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid' sudo bash -c 'echo 0 > /proc/sys/kernel/kptr_restrict' Once you get it working you should see IR function names concatenated with the fragment instance id in 'perf top'. 'perf annotate' does not work. Implements --asm_module_dir, analogous to --opt_module_dir. We dump disassembly to files there. Debug symbols are interleaved with the assembly if they are available. I enabled them for the debug build, now that we have some purpose for them. In some cases it would be useful to have them for the release build, but they make the llvm module much larger so I haven't enabled them there. The asm dump for a random exception constructor looks like this: Disassembly for __cxx_global_var_init.165:324bc8754182e7c6:22735c36d7a2bc0 (0x7f50f2140300): date_facet.hpp:date_facet.hpp:<invalid>:363:0 date_facet.hpp:date_facet.hpp:<invalid>:363:58 0: movabsq $0, %rax 10: movb (%rax), %cl 12: cmpb $0, %cl 15: jne 17 date_facet.hpp:date_facet.hpp:<invalid>:363:58 17: movabsq $0, %rax 27: movq $1, (%rax) date_facet.hpp:date_facet.hpp:<invalid>:363:58 34: retq Change-Id: If25de61e46f4db005956686cddbd4d71a1424528 Reviewed-on: http://gerrit.cloudera.org:8080/2793 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:18:03 -07:00
Tim Armstrong	5c56ec0997	Fix some ASAN compile warnings and remove redundant flags Change-Id: I7b2772d917449ca747820641c56e65545f610b23 Reviewed-on: http://gerrit.cloudera.org:8080/3025 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:18:02 -07:00
Lars Volker	c9df348c38	IMPALA-2686: Add breakpad crash handler to all daemons This change adds breakpad crash handling support to catalogd, impalad, and statestored. The destination folder for minidump files can be configured via the 'minidump_path' command line flag. Leaving it empty will disable minidump generation. The daemons will rotate minidump files. The number of files to keep can be configured with the 'max_minidumps' command line flag. Change-Id: I7a37a38488716ffe34296f3490ae291bbb7228d6 Reviewed-on: http://gerrit.cloudera.org:8080/2028 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:17:52 -07:00
Tim Armstrong	d6613e9531	IMPALA-775,IMPALA-3374: Upgrade LLVM to 3.8.0 This is the same as the previous LLVM upgrade patch, except we've removed the libtinfo dependency, so we assume we're building against an LLVM that doesn't require that. This requires various changes for Impala to be fully functional with the new version of LLVM. The original JIT was removed from LLVM, we need to switch to the new MCJIT API and implementation. MCJIT only supports module-at-a-time compilation, so the module must be finalised before any compilation happens. We didn't depend on the old behaviour deeply, but various small fixes were required. MCJIT requires that every IR module has a name. We relied on the old JIT's workaround for the __dso_handle symbol, which we have to emulate for MCJIT with a customer memory manager until we can get rid of global initialisers in cross-compiled code. LLVM made a number of incompatible API changes and reorganised headers. Clang took over responsibility for padding structs by marking structs as packed and inserting bytes so that members are aligned correctly (previously it relies LLVM aligning struct members based on the target's alignment rules). This means Impala also needs to manually pad its structs since clang-emitted structs look to LLVM like they have do not need to be inlined. Our inlining pass would require some modification to work and is redundant with LLVM's inlining pass, so was removed along with the unused subexpr elimination pass. There were various issues with __builtin_add_overflow and __builtin_mul_overflow that are newly available in LLVM 3.8. First, LLVM emitted a call to a function in libclang_rt, which we don't link in and has symbols that conflict with the gcc runtime library. Second, the performance actually regressed by using the builtins (I tested this manually by copying across the definition of the required function). 
Change-Id: I60b18a40a2df3f1adf326721f0df2a639d53a7c2 Reviewed-on: http://gerrit.cloudera.org:8080/2866 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:17:42 -07:00
Tim Armstrong	b4a9dfcc92	Revert "IMPALA-775,IMPALA-3374: Upgrade LLVM to 3.8.0" Reverting until we can sort out libtinfo build dependencies on various OSes. This reverts commit 1e77048be06aeb511e3483193db4257c8dbc7cf3. Change-Id: I281b0b040941d9e4e6a5199c5d228471ad8c031c Reviewed-on: http://gerrit.cloudera.org:8080/2857 Tested-by: Internal Jenkins Reviewed-by: Dan Hecht <dhecht@cloudera.com>	2016-05-12 14:17:40 -07:00
Tim Armstrong	be415f380f	IMPALA-775,IMPALA-3374: Upgrade LLVM to 3.8.0 This requires various changes for Impala to be fully functional with the new version of LLVM. The original JIT was removed from LLVM, we need to switch to the new MCJIT API and implementation. MCJIT only supports module-at-a-time compilation, so the module must be finalised before any compilation happens. We did't depend on the old behaviour deeply, but various small fixes were required. MCJIT requires that every IR module has a name. We relied on the old JIT's workaround for the __dso_handle symbol, which we have to emulate for MCJIT with a customer memory manager until we can get rid of global initialisers in cross-compiled code. LLVM made a number of incompatible API changes and reorganised headers. Clang took over responsibility for padding structs by marking structs as packed and inserting bytes so that members are aligned correctly (previously it relies LLVM aligning struct members based on the target's alignment rules). This means Impala also needs to manually pad its structs since clang-emitted structs look to LLVM like they have do not need to be inlined. Our inlining pass would require some modification to work and is redundant with LLVM's inlining pass, so was removed along with the unused subexpr elimination pass. LLVM now depends on another system library libtinfo, so we use llvm-config to get the required system libs directly. There were various issues with __builtin_add_overflow and __builtin_mul_overflow that are newly available in LLVM 3.8. First, LLVM emitted a call to a function in libclang_rt, which we don't link in and has symbols that conflict with the gcc runtime library. Second, the performance actually regressed by using the builtins (I tested this manually by copying across the definition of the required function). 
Change-Id: I17d7afd05ad3b472a0bfe035bfc3daada5597b2d Reviewed-on: http://gerrit.cloudera.org:8080/2486 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:17:40 -07:00
Matthew Jacobs	62dbdb06d0	IMPALA-3162: Upgrade to gperftools 2.5 (take 2) Switches the gperftools version from 2.0 to 2.5 which is also updated in the native-toolchain. The unmodified source is also checked into thirdparty for those not using the toolchain. This commit reverts "CDH-38434: Fix Impala packaging build" (commit 5666ef84977c4b92dec5b10ed71bbe36740a50c7) now that the toolchain dependencies have been built for sles12. Change-Id: I3fdc5091dfa4557968bf1a40f7e6d3eab91e7c15 Reviewed-on: http://gerrit.cloudera.org:8080/2581 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-03-18 23:08:09 +00:00

111 Commits