impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Philip Zeyliger	7263c33ea7	Use "mvn -B" in builds to avoid downloading progress bars in logs. Maven's batch (or non-interactive) mode prevents progress bar output when Maven is downloading artifacts, which isn't generally useful. Now that we keep Maven logs in logs/mvn/mvn.log, this makes them slightly more tidy. Change-Id: I5aa117272c2a86b63b0f9062099a4145324eb6fc Reviewed-on: http://gerrit.cloudera.org:8080/9792 Reviewed-by: Michael Brown <mikeb@cloudera.com> Tested-by: Impala Public Jenkins	2018-03-27 04:04:28 +00:00
Philip Zeyliger	5c8da5d13a	Consistently use Java 1.7 compiler. We use Java 1.7 in fe/pom.xml, where most of our Java code is. For consistency, this updates the rest of our Maven configurations to use the same version of Java. A change I'm working with uses try-with-resources in HBase splitting, which is how I ran into this. Testing: ran core tests Change-Id: I6cecddf367f00185a14a8b08c03456e3b756bd70 Reviewed-on: http://gerrit.cloudera.org:8080/9600 Reviewed-by: Philip Zeyliger <philip@cloudera.com> Tested-by: Impala Public Jenkins	2018-03-17 04:08:53 +00:00
Philip Zeyliger	d2fe9f437e	IMPALA-6270: create Impala parent pom This commit links together all the individual pom.xml files to have a new "impala-parent" pom as the parent. This enables de-duplicating all the repository configuration. I ran the build to test this. Change-Id: Id744e4357ee4d8e4be4e5490b2159bb76a2192f0 Reviewed-on: http://gerrit.cloudera.org:8080/8753 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Impala Public Jenkins	2017-12-12 04:30:15 +00:00
Jim Apple	edc70c1661	Impala is graduating; remove outdated references to incubation Change-Id: I4e6080a2b196926e46b1e641f6530ba1fa9bd444 Reviewed-on: http://gerrit.cloudera.org:8080/8577 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Impala Public Jenkins	2017-11-16 22:31:38 +00:00
Michael Ho	35f5c7bd37	IMPALA-4856: Rename thrift-deps to gen-deps As a preparation to start generating Protobuf files for IMPALA-4856, this change introduces a new build target "gen-deps" which serves as an umbrella for all build targets of generated code. For now, it only includes thrift-deps and protobuf targets will be added in the future. Change-Id: I360c63773efdeab4c26ca96b915e0c8d0ce2b9c9 Reviewed-on: http://gerrit.cloudera.org:8080/7851 Reviewed-by: Lars Volker <lv@cloudera.com> Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2017-08-30 00:26:52 +00:00
Tim Armstrong	392c4badff	IMPALA-5224: remove defunct codehaus repository The server is no longer even up, so there is no point in having it in our poms. Change-Id: I2310000e51c5e6d85a0fa30874629f4f19427c6c Reviewed-on: http://gerrit.cloudera.org:8080/6678 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2017-04-20 03:24:33 +00:00
Tim Armstrong	fc4ee65f9f	Add all build targets to CMake and speed up builds Use CMake's dependency resolution always instead of serial execution of targets via shell scripts. This improves parallelism by building fe, be, and other targets at the same time and avoids some overhead from invoking "make" multiple times. This reduces the time taken for an incremental compilation of fe and be from 56s to 24s with this command: ./buildall.sh -debug -noclean -notests -skiptests -ninja Also use Impala-lzo's build script. This depends on the IMPALA-4277 fixes to the Impala-lzo build script. Log directory creation is also moved from impala-config.sh to buildall.sh. This means that impala-config.sh has no side-effects and can be run concurrently with no issues. Also make sure that "make" builds all the same artifacts as buildall.sh when run with no args. Testing: Ran a jenkins core job, also experimented locally. Ran a jenkins core job with distcc disabled - this exposed some concurrency bugs where impala-config.sh fails if run concurrently. Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Reviewed-on: http://gerrit.cloudera.org:8080/4790 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2016-12-14 23:42:19 +00:00
Tim Armstrong	75a857c0ce	IMPALA-4259: build Impala without any test cluster setup. The main outcome of this change is to avoid making unnecessary modification to the Impala or other source trees when we don't need the test cluster. To achieve that, this refactors the script to make the flow easier to understand and makes it more consistent which build steps are executed in which modes. Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137 Reviewed-on: http://gerrit.cloudera.org:8080/4685 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-10-13 05:45:47 +00:00
Thomas Tauber-Marshall	b2c2fe7813	IMPALA-3786: Replace "cloudera" with "apache" (part 2) As part of the ASF transition, we need to replace references to Cloudera in Impala with references to Apache. This primarily means changing Java package names from com.cloudera.impala.* to org.apache.impala.* A prior patch renamed all the files as necessary, and this patch performs the actual code changes. Most of the changes in this patch were generated with some commands of the form: find . | grep "\.java\|\.py\|\.h\|\.cc" | xargs sed -i s/'com\(.\)cloudera\(\.\)impala/org\1apache\2impala/g along with some manual fixes. After this patch, the remaining references to Cloudera in the repo mostly fall into the categories: - External components that have cloudera in their own package names, eg. com.cloudera.kudu/llama - URLs, eg. https://repository.cloudera.com/ Change-Id: I0d35fa6602a7fc0c212b2ef5e2b3322b77dde7e2 Reviewed-on: http://gerrit.cloudera.org:8080/3937 Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com> Reviewed-by: Jim Apple <jbapple@cloudera.com> Tested-by: Internal Jenkins	2016-09-29 21:14:13 +00:00
Thomas Tauber-Marshall	b544f019aa	IMPALA-3786: Replace "cloudera" with "apache" (part 1) As part of the ASF transition, we need to replace references to Cloudera in Impala with references to Apache. This primarily means changing Java package names from com.cloudera.impala.* to org.apache.impala.* To make this easier to review, this patch only renames files, eg. fe/src/main/java/com/cloudera -> fe/src/main/java/org/apache A follow up patch performs the actual code updates. Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db Reviewed-on: http://gerrit.cloudera.org:8080/3936 Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com> Reviewed-by: Jim Apple <jbapple@cloudera.com> Tested-by: Internal Jenkins	2016-09-29 21:13:52 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt; cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;'; cat modified_files.txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Misha Dmitriev	33e4ba9454	IMPALA-3384: add missing frontend -> ext-data-source dependency. Change-Id: If15f6d737d6a9c301df682f73b56e5ebeabfcb96 Reviewed-on: http://gerrit.cloudera.org:8080/2912 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:17:47 -07:00
Tim Armstrong	93c703b602	Fix misc mvn warnings Maven was complaining that the source encoding was not set, and that the version of a plugin was not specified. Change-Id: I2bc6bbe95fc71575aeec5b6969cc869794309a49 Reviewed-on: http://gerrit.cloudera.org:8080/1741 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-01-11 21:11:15 +00:00
Alex Behm	26467d1f98	Upgrade a few important mvn plugins. Change-Id: I84cb4834744e3a8a3dfde82d20c9205a155b7a31 Reviewed-on: http://gerrit.cloudera.org:8080/399 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2015-05-20 03:12:57 +00:00
Martin Grund	b582cdc22b	IMPALA-1598: Adding Error Codes to Log Messages This patch introduces the concept of error codes for errors that are recorded in Impala and are going to be presented to the client. These error codes are used to aggregate and group incoming error / warning messages to reduce the spill on the shell and increase the usefulness of the messages. By splitting the message string from the implementation, it becomes possible to edit the string independently of the code and pave the way for internationalization. Error messages are defined as a combination of an enum value and a string. Both are defined in the Error.thrift file that is automatically generated using the script in common/thrift/generate_error_codes.py. The goal of the script is to have a central understandable repository of error messages. Adding new messages to this file will require rebuilding the thrift part. The proxy class ErrorMessage is responsible for representing an error and capturing the parameters that are used to format the error message string. When error messages are recorded they are recorded based on the following algorithm: - If an error message is of type GENERAL, do not aggregate this message and simply add it to the total number of messages - If an error message is of a specific type, record the first error message as a sample and increment the count for all other occurrences. - The coordinator will merge all error messages except the ones of type GENERAL and display a count. For example, in the case of the parquet file spanning multiple blocks the output will look like: Parquet files should not be split into multiple hdfs-blocks. file=hdfs://localhost:20500/fid.parq (1 of 321 similar) All messages are always logged to VLOG. In the coordinator error messages are merged across all backends to retain readability in the case of large clusters. The current version of this patch adds these new error codes to some of the most important error messages as a reference implementation. Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8 Reviewed-on: http://gerrit.cloudera.org:8080/39 Reviewed-by: Martin Grund <mgrund@cloudera.com> Tested-by: Internal Jenkins	2015-03-01 03:37:32 +00:00
Matthew Jacobs	64f55f32fe	Refactor thrift for ext-data-source to generate only necessary structs ext-data-source only needs a small subset of the thrift structures, so this separates the dependencies between files so that just the necessary structs are generated for ext-data-source. Afterwards, we can remove extra maven dependencies which were using environment variables to get versions. While the environment variables work when building the pom, they are not propagated to dependencies so building fe/pom.xml ended up producing lots of warnings which are now gone. Change-Id: I267fe7bc7a54c3c21aad8c1ffce07cf1a1e07c5e Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3748 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 1f738962ccb7a34834decfe6cb27307ed4548870) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3767	2014-08-05 11:33:46 -07:00
Alex Behm	e9864d5f78	Introduce type hierarchy and add complex types. This patch replaces ColumnType with a hierarchy of types that models the existing scalar types as well as the new complex types ARRAY, MAP, and STRUCT. Change-Id: Ia895f41153e99febb0c35412acac12689c3c2064 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3491 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3538	2014-07-21 20:00:46 -07:00
Victor Bittorf	2d7f2e19b2	IMPALA-938: Infer schema from Parquet file Syntax is "CREATE TABLE name LIKE fileformat '/path/to/file'". Supports all options that CREATE TABLE does. Currently only PARQUET is supported. Run testdata/bin/create-load-data.sh after pulling this patch. Change-Id: Ibb9fbb89dbde6acceb850b914c48d12f22b33f55 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2720 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3158	2014-06-20 17:38:01 -07:00
Matthew Jacobs	f5da019555	IMPALA-1025: Use converse of data source predicate operators if expr has val before slot Change-Id: I31790c037e2fa9af7b80c01014f7507ba5053e63 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2925 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins	2014-06-09 23:54:09 -07:00
Matthew Jacobs	0fa2d6db9b	External Data Source: Set scan handle parameter in close() Change-Id: Ibd2d61ba52a4532b0f7b79224f70abbff1b363e4 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2519 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit 3d4f9f44d512bb5c16c89716dec29dbf1463dfa1) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2535 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-05-12 22:51:08 -07:00
Matthew Jacobs	0c533bb152	External Data Source: Backend changes Change-Id: Ifa62b4ea231da47facb31c3f8d43e5e3ac73591f Reviewed-on: http://gerrit.ent.cloudera.com:8080/2284 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit f1e5db2853135c4346788192e2dbc632d4fe1dfb) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2497 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-05-09 02:24:41 -07:00
Matthew Jacobs	ebc6c5894e	External Data Source: Frontend and catalog changes Initial frontend and catalog changes for external data sources. Change-Id: Ia0e61ef97cfd7a4e138ef555c17f2e45bbf08c18 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2224 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit dfa14c828957f751db9c89bae0bdc040ce6f648c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2485	2014-05-08 14:56:19 -07:00
Matthew Jacobs	61b36a42bd	External Data Source: Few small API changes * Rename getStats() to prepare() * Adds TRowBatch.num_rows to indicate number of rows when no cols are materialized * Changes api and sample poms to produce source jars Change-Id: I02dcc89e27716978708386cfc3f7940ee5dbc023 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2406 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 2d7fcba8b7442b54a388f8b994d0cfa08940bbd7) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2434	2014-05-02 17:10:25 -07:00
Matthew Jacobs	1f07f2d7ee	External Data Source: Thrift structure changes A few changes to the external data source thrift types: * Change RowBatch to return entire columns. Adds Data.TColumnData to represent an entire column. * Makes all fields in ExternalDataSource (except for status fields on the result structures) optional in case fields become deprecated in the future. * Adds a limit parameter to the TOpenParams structure in case the data source needs to apply the limit itself. Change-Id: I62db68bfb64d2190dfdd0c84be5925ad5db031ef Reviewed-on: http://gerrit.ent.cloudera.com:8080/2345 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit faf220d628359be1368f898493900fc2e2913c53) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2385 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-04-27 12:57:13 -07:00
Matthew Jacobs	25c0ebf58c	External Data Source: Public API Adds the thrift structures for the public external data source API and a new maven project containing the Java ExternalDataSource interface and the generated Java thrift classes. The ExternalDataSource.thrift structures can evolve in a backward compatible way. The ExternalDataSource Java interface will always contain a version number in the namespace (e.g. com.cloudera.impala.extdatasource.v1 for V1) so we can potentially make breaking changes to the interface in the future but still support older versions. A trivial implementation of the ExternalDataSource API is also added for testing purposes. TODO: Make the sample data source implementation realistic. Change-Id: I827d6420a87ed7a2bce34e050362ca98ddc5dbcc Reviewed-on: http://gerrit.ent.cloudera.com:8080/2241 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit f29814e9ede9d4c889f2648606fcf511feeb47ae) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2313	2014-04-22 18:34:48 -07:00

25 Commits
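
The error-aggregation rule described in the IMPALA-1598 commit message (b582cdc22b) can be illustrated with a short Python sketch. This is not Impala's implementation (which lives in the C++ backend and the generated Error.thrift); the class `ErrorLog`, its methods, and the error code string used in the usage note are hypothetical names chosen for illustration. The sketch assumes only what the commit message states: GENERAL messages are kept individually, other codes keep the first message as a sample plus a count, and the coordinator merges the per-backend logs.

```python
from dataclasses import dataclass, field

GENERAL = "GENERAL"  # the one error type that is never aggregated


@dataclass
class ErrorLog:
    """Hypothetical per-backend error log; not Impala's actual class."""
    general: list = field(default_factory=list)   # every GENERAL message, unaggregated
    specific: dict = field(default_factory=dict)  # code -> [sample message, count]

    def record(self, code, message):
        if code == GENERAL:
            # GENERAL messages are simply appended.
            self.general.append(message)
            return
        entry = self.specific.get(code)
        if entry is None:
            # First occurrence of this code: keep the message as the sample.
            self.specific[code] = [message, 1]
        else:
            # Later occurrences only increment the count.
            entry[1] += 1

    def merge(self, other):
        """Coordinator-side merge of another backend's log into this one."""
        self.general.extend(other.general)
        for code, (sample, count) in other.specific.items():
            entry = self.specific.get(code)
            if entry is None:
                self.specific[code] = [sample, count]
            else:
                entry[1] += count

    def render(self):
        """Return display lines, using the '(1 of N similar)' suffix from the commit message."""
        lines = list(self.general)
        for sample, count in self.specific.values():
            if count > 1:
                lines.append(f"{sample} (1 of {count} similar)")
            else:
                lines.append(sample)
        return lines
```

As a usage example, two backends that report the same (illustrative) code `"PARQUET_MULTIPLE_BLOCKS"` three and two times would, after `coordinator.merge(a)` and `coordinator.merge(b)`, render a single line ending in `(1 of 5 similar)`, with the sample taken from the first log merged.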