25 Commits

Author SHA1 Message Date
Philip Zeyliger
7263c33ea7 Use "mvn -B" in builds to avoid downloading progress bars in logs.
Maven's batch (or non-interactive) mode prevents progress bar output
when Maven is downloading artifacts, which isn't generally useful.
Now that we keep Maven logs in logs/mvn/mvn.log, this makes
them slightly more tidy.

Change-Id: I5aa117272c2a86b63b0f9062099a4145324eb6fc
Reviewed-on: http://gerrit.cloudera.org:8080/9792
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
2018-03-27 04:04:28 +00:00
Philip Zeyliger
5c8da5d13a Consistently use Java 1.7 compiler.
We use Java 1.7 in fe/pom.xml, where most of our Java code is. For
consistency, this updates the rest of our Maven configurations to use
the same version of Java. A change I'm working with uses
try-with-resources in HBase splitting, which is how I ran into
this.

Testing: ran core tests

Change-Id: I6cecddf367f00185a14a8b08c03456e3b756bd70
Reviewed-on: http://gerrit.cloudera.org:8080/9600
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins
2018-03-17 04:08:53 +00:00
Philip Zeyliger
d2fe9f437e IMPALA-6270: create Impala parent pom
This commit links together all the individual pom.xml files to have a
new "impala-parent" pom as the parent. This enables de-duplicating all
the repository configuration.

I ran the build to test this.

Change-Id: Id744e4357ee4d8e4be4e5490b2159bb76a2192f0
Reviewed-on: http://gerrit.cloudera.org:8080/8753
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-12-12 04:30:15 +00:00
Jim Apple
edc70c1661 Impala is graduating; remove outdated references to incubation
Change-Id: I4e6080a2b196926e46b1e641f6530ba1fa9bd444
Reviewed-on: http://gerrit.cloudera.org:8080/8577
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-16 22:31:38 +00:00
Michael Ho
35f5c7bd37 IMPALA-4856: Rename thrift-deps to gen-deps
In preparation for generating Protobuf files
for IMPALA-4856, this change introduces a new build
target "gen-deps" which serves as an umbrella for all
build targets of generated code. For now, it only
includes thrift-deps; protobuf targets will be added
in the future.

Change-Id: I360c63773efdeab4c26ca96b915e0c8d0ce2b9c9
Reviewed-on: http://gerrit.cloudera.org:8080/7851
Reviewed-by: Lars Volker <lv@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-30 00:26:52 +00:00
Tim Armstrong
392c4badff IMPALA-5224: remove defunct codehaus repository
The server is no longer even up, so there is no point in having it in
our poms.

Change-Id: I2310000e51c5e6d85a0fa30874629f4f19427c6c
Reviewed-on: http://gerrit.cloudera.org:8080/6678
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-20 03:24:33 +00:00
Tim Armstrong
fc4ee65f9f Add all build targets to CMake and speed up builds
Use CMake's dependency resolution always instead of serial execution of
targets via shell scripts.  This improves parallelism by building fe,
be, and other targets at the same time and avoids some overhead from
invoking "make" multiple times. This reduces the time taken for
an incremental compilation of fe and be from 56s to 24s with this
command:

  ./buildall.sh -debug -noclean -notests -skiptests -ninja

Also use Impala-lzo's build script. This depends on the IMPALA-4277
fixes to the Impala-lzo build script.

Log directory creation is also moved from impala-config.sh to
buildall.sh. This means that impala-config.sh has no side-effects and
can be run concurrently with no issues.

Also make sure that "make" builds all the same artifacts as buildall.sh
when run with no args.

Testing:
Ran a jenkins core job, also experimented locally. Ran a jenkins core
job with distcc disabled - this exposed some concurrency bugs where
impala-config.sh fails if run concurrently.

Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Reviewed-on: http://gerrit.cloudera.org:8080/4790
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2016-12-14 23:42:19 +00:00
Tim Armstrong
75a857c0ce IMPALA-4259: build Impala without any test cluster setup.
The main outcome of this change is to avoid making unnecessary
modifications to the Impala or other source trees when we don't need the
test cluster.

To achieve that, this refactors the script to make the flow easier
to understand and makes it more consistent which build steps are
executed in which modes.

Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Reviewed-on: http://gerrit.cloudera.org:8080/4685
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-10-13 05:45:47 +00:00
Thomas Tauber-Marshall
b2c2fe7813 IMPALA-3786: Replace "cloudera" with "apache" (part 2)
As part of the ASF transition, we need to replace references to
Cloudera in Impala with references to Apache. This primarily means
changing Java package names from com.cloudera.impala.* to
org.apache.impala.*

A prior patch renamed all the files as necessary, and this patch
performs the actual code changes. Most of the changes in this patch
were generated with some commands of the form:

find . | grep "\.java\|\.py\|\.h\|\.cc" | \
  xargs sed -i s/'com\(.\)cloudera\(\.\)impala/org\1apache\2impala/g

along with some manual fixes.
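
For illustration only, the substitution above can be sketched in Python. This is not the command that was run for the patch; the function name rename_package is hypothetical. It mirrors the sed expression: the first separator may be any single character, the second must be a literal dot.

```python
import re

# Same shape as the sed expression in the commit message:
# group 1 is any single separator character, group 2 is a literal dot.
PATTERN = re.compile(r"com(.)cloudera(\.)impala")


def rename_package(text: str) -> str:
    """Rewrite com<sep>cloudera.impala references to org<sep>apache.impala.

    Hypothetical helper for illustration; both separators are preserved
    through the back-references, as in the sed replacement.
    """
    return PATTERN.sub(r"org\1apache\2impala", text)


if __name__ == "__main__":
    print(rename_package("import com.cloudera.impala.catalog.Table;"))
```

References to external components such as com.cloudera.kudu do not match the pattern and are left alone, which is consistent with the remaining categories listed below.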

After this patch, the remaining references to Cloudera in the repo
mostly fall into the categories:
- External components that have cloudera in their own package names,
  eg. com.cloudera.kudu/llama
- URLs, eg. https://repository.cloudera.com/

Change-Id: I0d35fa6602a7fc0c212b2ef5e2b3322b77dde7e2
Reviewed-on: http://gerrit.cloudera.org:8080/3937
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-09-29 21:14:13 +00:00
Thomas Tauber-Marshall
b544f019aa IMPALA-3786: Replace "cloudera" with "apache" (part 1)
As part of the ASF transition, we need to replace references to
Cloudera in Impala with references to Apache. This primarily means
changing Java package names from com.cloudera.impala.* to
org.apache.impala.*

To make this easier to review, this patch only renames files,
eg. fe/src/main/java/com/cloudera -> fe/src/main/java/org/apache

A follow up patch performs the actual code updates.

Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db
Reviewed-on: http://gerrit.cloudera.org:8080/3936
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-09-29 21:13:52 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
   website, or add it where the text is missing.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.
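
As a rough sketch of what the perl one-liner above does, the same line filter can be written in Python. The function name strip_cloudera_copyright is hypothetical, and this does not reproduce fix_apache_license.py; it only drops lines matching the case-insensitive pattern Copyright.*Cloudera.

```python
import re

# Case-insensitive, like the trailing "i" flag on the perl match.
COPYRIGHT = re.compile(r"Copyright.*Cloudera", re.IGNORECASE)


def strip_cloudera_copyright(text: str) -> str:
    """Return text with every line that matches the copyright pattern removed.

    Hypothetical helper for illustration; all other lines, including their
    line endings, are kept unchanged.
    """
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not COPYRIGHT.search(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    sample = "// Copyright 2012 Cloudera Inc.\n//\n// Licensed under the Apache License\n"
    print(strip_cloudera_copyright(sample), end="")
```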

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Misha Dmitriev
33e4ba9454 IMPALA-3384: add missing frontend -> ext-data-source dependency.
Change-Id: If15f6d737d6a9c301df682f73b56e5ebeabfcb96
Reviewed-on: http://gerrit.cloudera.org:8080/2912
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:47 -07:00
Tim Armstrong
93c703b602 Fix misc mvn warnings
Maven was complaining that the source encoding was not set, and that the
version of a plugin was not specified.

Change-Id: I2bc6bbe95fc71575aeec5b6969cc869794309a49
Reviewed-on: http://gerrit.cloudera.org:8080/1741
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-01-11 21:11:15 +00:00
Alex Behm
26467d1f98 Upgrade a few important mvn plugins.
Change-Id: I84cb4834744e3a8a3dfde82d20c9205a155b7a31
Reviewed-on: http://gerrit.cloudera.org:8080/399
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-05-20 03:12:57 +00:00
Martin Grund
b582cdc22b IMPALA-1598: Adding Error Codes to Log Messages
This patch introduces the concept of error codes for errors that are
recorded in Impala and are going to be presented to the client. These
error codes are used to aggregate and group incoming error / warning
messages to reduce the spill on the shell and increase the usefulness of
the messages. By splitting the message string from the implementation,
it becomes possible to edit the string independently of the code, which
paves the way for internationalization.

Error messages are defined as a combination of an enum value and a
string. Both are defined in the Error.thrift file that is automatically
generated using the script in common/thrift/generate_error_codes.py. The
goal of the script is to have a central understandable repository of
error messages. Adding new messages to this file will require rebuilding
the thrift part. The proxy class ErrorMessage is responsible for
representing an error and capturing the parameters that are used to
format the error message string.

When error messages are recorded they are recorded based on the
following algorithm:

- If an error message is of type GENERAL, do not aggregate this message
  and simply add it to the total number of messages
- If an error message is of a specific type, record the first error
  message as a sample and for all other occurrences increment the count.
- The coordinator will merge all error messages except the ones of type
  GENERAL and display a count.

For example, in the case of the parquet file spanning multiple blocks
the output will look like:

    Parquet files should not be split into multiple hdfs-blocks.
    file=hdfs://localhost:20500/fid.parq (1 of 321 similar)
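
The recording rules above can be sketched roughly as follows. This is a minimal Python illustration, not the backend implementation: the ErrorLog class, its method names, and the choice to omit the "similar" suffix when a code occurs only once are all assumptions made for the example.

```python
from collections import OrderedDict

GENERAL = "GENERAL"  # the error type that is never aggregated


class ErrorLog:
    """Hypothetical per-backend error log following the rules above."""

    def __init__(self):
        self.general = []             # GENERAL messages, kept one by one
        self.by_code = OrderedDict()  # code -> [first message seen, count]

    def record(self, code, message):
        if code == GENERAL:
            self.general.append(message)
        elif code in self.by_code:
            self.by_code[code][1] += 1  # keep the first sample, bump the count
        else:
            self.by_code[code] = [message, 1]

    def merge(self, other):
        """Coordinator-side merge: counts are summed per code."""
        self.general.extend(other.general)
        for code, (sample, count) in other.by_code.items():
            if code in self.by_code:
                self.by_code[code][1] += count
            else:
                self.by_code[code] = [sample, count]

    def render(self):
        lines = list(self.general)
        for sample, count in self.by_code.values():
            # Assumption: the suffix is only shown when there are duplicates.
            lines.append(f"{sample} (1 of {count} similar)" if count > 1 else sample)
        return lines
```

Merging one log per backend and rendering the result gives a single sample line with a count for each specific error code, in the spirit of the output shown above.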

All messages are always logged to VLOG. In the coordinator error
messages are merged across all backends to retain readability in the
case of large clusters.

The current version of this patch adds these new error codes to some of
the most important error messages as a reference implementation.

Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8
Reviewed-on: http://gerrit.cloudera.org:8080/39
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-03-01 03:37:32 +00:00
Matthew Jacobs
64f55f32fe Refactor thrift for ext-data-source to generate only necessary structs
ext-data-source only needs a small subset of the thrift structures, so this
separates the dependencies between files so that just the necessary structs
are generated for ext-data-source. Afterwards, we can remove extra maven
dependencies which were using environment variables to get versions. While the
environment variables work when building the pom, they are not propagated to
dependencies so building fe/pom.xml ended up producing lots of warnings which
are now gone.

Change-Id: I267fe7bc7a54c3c21aad8c1ffce07cf1a1e07c5e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3748
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 1f738962ccb7a34834decfe6cb27307ed4548870)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3767
2014-08-05 11:33:46 -07:00
Alex Behm
e9864d5f78 Introduce type hierarchy and add complex types.
This patch replaces ColumnType with a hierarchy of types that models
the existing scalar types as well as the new complex types ARRAY, MAP,
and STRUCT.

Change-Id: Ia895f41153e99febb0c35412acac12689c3c2064
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3491
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3538
2014-07-21 20:00:46 -07:00
Victor Bittorf
2d7f2e19b2 IMPALA-938: Infer schema from Parquet file
Syntax is "CREATE TABLE name LIKE fileformat '/path/to/file'".
Supports all options that CREATE TABLE does. Currently only PARQUET is supported.
Run testdata/bin/create-load-data.sh after pulling this patch.

Change-Id: Ibb9fbb89dbde6acceb850b914c48d12f22b33f55
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2720
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3158
2014-06-20 17:38:01 -07:00
Matthew Jacobs
f5da019555 IMPALA-1025: Use converse of data source predicate operators if expr has val before slot
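As a sketch of the idea in the title, and not the actual frontend code: when a predicate has the value before the slot (for example 5 < x), it can be rewritten as slot-operator-value by swapping the operands and taking the converse operator (x > 5). The CONVERSE table, the normalize function, and the is_slot callback below are hypothetical names for illustration.

```python
# Converse of each comparison operator: "a op b" is equivalent to "b CONVERSE[op] a".
CONVERSE = {
    "<": ">",
    "<=": ">=",
    ">": "<",
    ">=": "<=",
    "=": "=",
    "!=": "!=",
}


def normalize(left, op, right, is_slot):
    """Return the predicate as (slot, op, value).

    If the slot is already on the left, the predicate is returned unchanged;
    otherwise the operands are swapped and the converse operator is used.
    """
    if is_slot(left):
        return (left, op, right)
    return (right, CONVERSE[op], left)


if __name__ == "__main__":
    # Assumption for the demo: slots are strings, values are numbers.
    print(normalize(5, "<", "x", lambda v: isinstance(v, str)))
```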
Change-Id: I31790c037e2fa9af7b80c01014f7507ba5053e63
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2925
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-06-09 23:54:09 -07:00
Matthew Jacobs
0fa2d6db9b External Data Source: Set scan handle parameter in close()
Change-Id: Ibd2d61ba52a4532b0f7b79224f70abbff1b363e4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2519
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 3d4f9f44d512bb5c16c89716dec29dbf1463dfa1)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2535
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-05-12 22:51:08 -07:00
Matthew Jacobs
0c533bb152 External Data Source: Backend changes
Change-Id: Ifa62b4ea231da47facb31c3f8d43e5e3ac73591f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2284
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f1e5db2853135c4346788192e2dbc632d4fe1dfb)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2497
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-05-09 02:24:41 -07:00
Matthew Jacobs
ebc6c5894e External Data Source: Frontend and catalog changes
Initial frontend and catalog changes for external data sources.

Change-Id: Ia0e61ef97cfd7a4e138ef555c17f2e45bbf08c18
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2224
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit dfa14c828957f751db9c89bae0bdc040ce6f648c)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2485
2014-05-08 14:56:19 -07:00
Matthew Jacobs
61b36a42bd External Data Source: Few small API changes
* Rename getStats() to prepare()
* Adds TRowBatch.num_rows to indicate number of rows when no cols are
  materialized
* Changes api and sample poms to produce source jars

Change-Id: I02dcc89e27716978708386cfc3f7940ee5dbc023
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2406
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 2d7fcba8b7442b54a388f8b994d0cfa08940bbd7)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2434
2014-05-02 17:10:25 -07:00
Matthew Jacobs
1f07f2d7ee External Data Source: Thrift structure changes
A few changes to the external data source thrift types:
* Change RowBatch to return entire columns. Adds Data.TColumnData to
  represent an entire column.
* Makes all fields in ExternalDataSource (except for status fields on
  the result structures) optional in case fields become deprecated in
  the future.
* Adds a limit parameter to the TOpenParams structure in case the
  data source needs to apply the limit itself.

Change-Id: I62db68bfb64d2190dfdd0c84be5925ad5db031ef
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2345
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit faf220d628359be1368f898493900fc2e2913c53)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2385
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-04-27 12:57:13 -07:00
Matthew Jacobs
25c0ebf58c External Data Source: Public API
Adds the thrift structures for the public external data source API
and a new maven project containing the Java ExternalDataSource
interface and the generated Java thrift classes.

The ExternalDataSource.thrift structures can evolve in a backward
compatible way. The ExternalDataSource Java interface will always
contain a version number in the namespace (e.g.
com.cloudera.impala.extdatasource.v1 for V1) so we can potentially
make breaking changes to the interface in the future but still
support older versions.

A trivial implementation of the ExternalDataSource API is also
added for testing purposes.
TODO: Make the sample data source implementation realistic.

Change-Id: I827d6420a87ed7a2bce34e050362ca98ddc5dbcc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2241
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f29814e9ede9d4c889f2648606fcf511feeb47ae)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2313
2014-04-22 18:34:48 -07:00