Commit Graph

14 Commits

Author SHA1 Message Date
Philip Zeyliger
d2fe9f437e IMPALA-6270: create Impala parent pom
This commit links together all the individual pom.xml files to have a
new "impala-parent" pom as the parent. This enables de-duplicating all
the repository configuration.

I ran the build to test this.

Change-Id: Id744e4357ee4d8e4be4e5490b2159bb76a2192f0
Reviewed-on: http://gerrit.cloudera.org:8080/8753
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-12-12 04:30:15 +00:00
Jim Apple
edc70c1661 Impala is graduating; remove outdated references to incubation
Change-Id: I4e6080a2b196926e46b1e641f6530ba1fa9bd444
Reviewed-on: http://gerrit.cloudera.org:8080/8577
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-16 22:31:38 +00:00
Tim Armstrong
392c4badff IMPALA-5224: remove defunct codehaus repository
The server is no longer even up, so there is no point in having it in
our poms.

Change-Id: I2310000e51c5e6d85a0fa30874629f4f19427c6c
Reviewed-on: http://gerrit.cloudera.org:8080/6678
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-20 03:24:33 +00:00
Thomas Tauber-Marshall
b2c2fe7813 IMPALA-3786: Replace "cloudera" with "apache" (part 2)
As part of the ASF transition, we need to replace references to
Cloudera in Impala with references to Apache. This primarily means
changing Java package names from com.cloudera.impala.* to
org.apache.impala.*

A prior patch renamed all the files as necessary, and this patch
performs the actual code changes. Most of the changes in this patch
were generated with some commands of the form:

find . | grep "\.java\|\.py\|\.h\|\.cc" | \
  xargs sed -i s/'com\(.\)cloudera\(\.\)impala/org\1apache\2impala/g

along with some manual fixes.
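
As a rough illustration of what the sed expression above does, here is a minimal Python sketch of the same substitution. The function name rename_package is hypothetical and not part of the patch; only the regular expression is taken from the command. Note that the first separator can be any character, while the second must be a literal dot.

```python
import re

# Python equivalent of the sed expression quoted in the commit message.
# Group 1 captures whatever single character follows "com";
# group 2 captures the literal dot that precedes "impala".
PATTERN = re.compile(r"com(.)cloudera(\.)impala")


def rename_package(text: str) -> str:
    """Rewrite com<sep>cloudera.impala to org<sep>apache.impala.

    Hypothetical helper for illustration only.
    """
    return PATTERN.sub(r"org\1apache\2impala", text)


if __name__ == "__main__":
    print(rename_package("import com.cloudera.impala.catalog.Catalog;"))
    # A path-style reference is left alone because the second
    # separator is "/" rather than ".".
    print(rename_package("fe/src/main/java/com/cloudera/impala"))
```

Strings in which the second separator is not a dot are not touched by this expression, which may be one reason some manual fixes were still needed.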

After this patch, the remaining references to Cloudera in the repo
mostly fall into the categories:
- External components that have cloudera in their own package names,
  eg. com.cloudera.kudu/llama
- URLs, eg. https://repository.cloudera.com/

Change-Id: I0d35fa6602a7fc0c212b2ef5e2b3322b77dde7e2
Reviewed-on: http://gerrit.cloudera.org:8080/3937
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-09-29 21:14:13 +00:00
Thomas Tauber-Marshall
b544f019aa IMPALA-3786: Replace "cloudera" with "apache" (part 1)
As part of the ASF transition, we need to replace references to
Cloudera in Impala with references to Apache. This primarily means
changing Java package names from com.cloudera.impala.* to
org.apache.impala.*

To make this easier to review, this patch only renames files,
eg. fe/src/main/java/com/cloudera -> fe/src/main/java/org/apache

A follow up patch performs the actual code updates.

Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db
Reviewed-on: http://gerrit.cloudera.org:8080/3936
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-09-29 21:13:52 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
   website, or add that text if the file has none.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
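
The line-filtering step performed by the perl one-liner above can be sketched in Python as follows. This is an illustrative approximation, not the tooling used for the change; the function name strip_cloudera_copyright is an assumption, and the sketch covers only the copyright removal, not the license rewrite done by fix_apache_license.py.

```python
import re

# Same pattern as the perl one-liner: any line containing
# "Copyright" followed later by "Cloudera", case-insensitively.
COPYRIGHT = re.compile(r"Copyright.*Cloudera", re.IGNORECASE)


def strip_cloudera_copyright(text: str) -> str:
    """Return text with every matching copyright line removed.

    Hypothetical helper for illustration only.
    """
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not COPYRIGHT.search(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    sample = (
        "// Copyright 2012 Cloudera Inc.\n"
        "//\n"
        "// Licensed under the Apache License, Version 2.0\n"
    )
    print(strip_cloudera_copyright(sample), end="")
```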

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Tim Armstrong
93c703b602 Fix misc mvn warnings
Maven was complaining that the source encoding was not set, and that the
version of a plugin was not specified.

Change-Id: I2bc6bbe95fc71575aeec5b6969cc869794309a49
Reviewed-on: http://gerrit.cloudera.org:8080/1741
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-01-11 21:11:15 +00:00
Alex Behm
26467d1f98 Upgrade a few important mvn plugins.
Change-Id: I84cb4834744e3a8a3dfde82d20c9205a155b7a31
Reviewed-on: http://gerrit.cloudera.org:8080/399
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-05-20 03:12:57 +00:00
Martin Grund
b582cdc22b IMPALA-1598: Adding Error Codes to Log Messages
This patch introduces error codes for errors that are recorded in
Impala and presented to the client. The error codes are used to
aggregate and group incoming error and warning messages, which reduces
the flood of messages in the shell and makes the messages more useful.
Splitting the message string from the implementation makes it possible
to edit the string independently of the code and paves the way for
internationalization.

Error messages are defined as a combination of an enum value and a
string. Both are defined in the Error.thrift file, which is automatically
generated by the script common/thrift/generate_error_codes.py. The
goal of the script is to provide a central, understandable repository of
error messages. Adding new messages to this file will require rebuilding
the thrift part. The proxy class ErrorMessage is responsible for
representing an error and capturing the parameters that are used to
format the error message string.

Error messages are recorded according to the following algorithm:

- If an error message is of type GENERAL, do not aggregate this message
  and simply add it to the total number of messages
- If an error message is of a specific type, record the first error
  message as a sample and for all other occurrences increment the count.
- The coordinator will merge all error messages except the ones of type
  GENERAL and display a count.

For example, in the case of a Parquet file spanning multiple blocks,
the output will look like:

    Parquet files should not be split into multiple hdfs-blocks.
    file=hdfs://localhost:20500/fid.parq (1 of 321 similar)

All messages are always logged to VLOG. In the coordinator error
messages are merged across all backends to retain readability in the
case of large clusters.
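
The recording and merging rules described above can be sketched roughly as follows. This is a simplified Python model rather than the actual Impala implementation: the class ErrorLog, its methods, and the error code string used in the usage example are all hypothetical, and printing a lone message without a count is an assumption.

```python
from dataclasses import dataclass, field

GENERAL = "GENERAL"


@dataclass
class ErrorLog:
    """Hypothetical per-backend error log, for illustration only."""

    # GENERAL messages are never aggregated; each one is kept as-is.
    general: list = field(default_factory=list)
    # For every other code: code -> [first message seen, occurrence count].
    specific: dict = field(default_factory=dict)

    def record(self, code: str, message: str) -> None:
        if code == GENERAL:
            self.general.append(message)
            return
        entry = self.specific.get(code)
        if entry is None:
            # First occurrence becomes the sample.
            self.specific[code] = [message, 1]
        else:
            # Later occurrences only bump the count.
            entry[1] += 1

    def merge(self, other: "ErrorLog") -> None:
        """Coordinator-side merge of another backend's log into this one."""
        self.general.extend(other.general)
        for code, (sample, count) in other.specific.items():
            if code in self.specific:
                self.specific[code][1] += count
            else:
                self.specific[code] = [sample, count]

    def render(self) -> list:
        lines = list(self.general)
        for sample, count in self.specific.values():
            if count > 1:
                lines.append(f"{sample} (1 of {count} similar)")
            else:
                lines.append(sample)
        return lines


if __name__ == "__main__":
    backend_a, backend_b = ErrorLog(), ErrorLog()
    backend_a.record("PARQUET_MULTIPLE_BLOCKS", "split file=a.parq")
    backend_a.record("PARQUET_MULTIPLE_BLOCKS", "split file=b.parq")
    backend_b.record("PARQUET_MULTIPLE_BLOCKS", "split file=c.parq")
    backend_b.record(GENERAL, "something else went wrong")
    backend_a.merge(backend_b)
    for line in backend_a.render():
        print(line)
```

Keeping only the first message as a sample bounds the size of the log by the number of distinct error codes plus the GENERAL messages, which is what keeps the output readable on large clusters.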

The current version of this patch adds these new error codes to some of
the most important error messages as a reference implementation.

Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8
Reviewed-on: http://gerrit.cloudera.org:8080/39
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-03-01 03:37:32 +00:00
Matthew Jacobs
64f55f32fe Refactor thrift for ext-data-source to generate only necessary structs
ext-data-source only needs a small subset of the thrift structures, so this
patch separates the dependencies between files so that only the necessary
structs are generated for ext-data-source. Afterwards, we can remove extra maven
dependencies that used environment variables to get versions. While the
environment variables work when building the pom, they are not propagated to
dependencies, so building fe/pom.xml produced lots of warnings, which
are now gone.

Change-Id: I267fe7bc7a54c3c21aad8c1ffce07cf1a1e07c5e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3748
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 1f738962ccb7a34834decfe6cb27307ed4548870)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3767
2014-08-05 11:33:46 -07:00
Matthew Jacobs
ebc6c5894e External Data Source: Frontend and catalog changes
Initial frontend and catalog changes for external data sources.

Change-Id: Ia0e61ef97cfd7a4e138ef555c17f2e45bbf08c18
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2224
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit dfa14c828957f751db9c89bae0bdc040ce6f648c)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2485
2014-05-08 14:56:19 -07:00
Matthew Jacobs
61b36a42bd External Data Source: Few small API changes
* Rename getStats() to prepare()
* Adds TRowBatch.num_rows to indicate number of rows when no cols are
  materialized
* Changes api and sample poms to produce source jars

Change-Id: I02dcc89e27716978708386cfc3f7940ee5dbc023
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2406
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 2d7fcba8b7442b54a388f8b994d0cfa08940bbd7)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2434
2014-05-02 17:10:25 -07:00
Matthew Jacobs
1f07f2d7ee External Data Source: Thrift structure changes
A few changes to the external data source thrift types:
* Change RowBatch to return entire columns. Adds Data.TColumnData to
  represent an entire column.
* Makes all fields in ExternalDataSource (except for status fields on
  the result structures) optional in case fields become deprecated in
  the future.
* Adds a limit parameter to the TOpenParams structure in case the
  data source needs to apply the limit itself.

Change-Id: I62db68bfb64d2190dfdd0c84be5925ad5db031ef
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2345
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit faf220d628359be1368f898493900fc2e2913c53)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2385
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-04-27 12:57:13 -07:00
Matthew Jacobs
25c0ebf58c External Data Source: Public API
Adds the thrift structures for the public external data source API
and a new maven project containing the Java ExternalDataSource
interface and the generated Java thrift classes.

The ExternalDataSource.thrift structures can evolve in a backward
compatible way. The ExternalDataSource Java interface will always
contain a version number in the namespace (e.g.
com.cloudera.impala.extdatasource.v1 for V1) so we can potentially
make breaking changes to the interface in the future but still
support older versions.

A trivial implementation of the ExternalDataSource API is also
added for testing purposes.
TODO: Make the sample data source implementation realistic.

Change-Id: I827d6420a87ed7a2bce34e050362ca98ddc5dbcc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2241
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f29814e9ede9d4c889f2648606fcf511feeb47ae)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2313
2014-04-22 18:34:48 -07:00