For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
website, or add it where it is missing.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
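
For readers who prefer Python to the perl one-liner, the copyright-stripping
step can be sketched roughly as below. This is only an illustration of the
same line filter, not the script that was actually run; the function name
strip_cloudera_copyright is hypothetical.

```python
import re

# Same pattern as the perl one-liner: case-insensitive "Copyright ... Cloudera".
CLOUDERA_COPYRIGHT = re.compile(r"Copyright.*Cloudera", re.IGNORECASE)


def strip_cloudera_copyright(text):
    """Return text with every line matching the pattern removed.

    Hypothetical helper; it mirrors `print unless m#Copyright.*Cloudera#i`.
    """
    kept = [line for line in text.splitlines(keepends=True)
            if not CLOUDERA_COPYRIGHT.search(line)]
    return "".join(kept)
```

Like the perl command, this drops whole lines only, so files whose license
text is missing entirely still need a manual fixup.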
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Parts of the virtualenv were added to the PYTHONPATH, presumably for the
shell, but the shell should get its Thrift modules from shell/gen-py.
Removing the virtualenv from the PYTHONPATH fixes a build problem on
CentOS 5 (packaging build).
Change-Id: I54345d4d772588f8dc42341f5cc51492df6a90ed
This change implements support for PARTITIONED BY clauses in CTAS
statements. The syntax and semantics follow the PARTITION feature of
insert from select statements: inside the PARTITIONED BY (...) column
list the user must specify names of the columns to partition by. These
column names must appear in that particular order at the end of the
select statement. A remapping between columns of the source and
destination tables is not possible, because the destination table does
not yet exist. Specifying static values for the partition columns is
also not possible, as their type needs to be deduced from columns in the
select statement. Example:
CREATE TABLE t (a DOUBLE, b INT);
INSERT INTO t VALUES (1.5, 3);
CREATE TABLE p PARTITIONED BY (b) AS SELECT a, b FROM t;
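
The ordering rule above (partition columns must be the last columns of the
select list, in the same order) can be sketched as a small check. This is
an illustration only, not the analyzer code in this change; the function
name is hypothetical and column names are compared as exact strings, which
is an assumption.

```python
def partition_columns_are_trailing(select_columns, partition_columns):
    """Return True if partition_columns equal the tail of select_columns.

    Hypothetical helper: both arguments are lists of column names, and
    the comparison is an exact, order-sensitive string match.
    """
    n = len(partition_columns)
    if n == 0 or n > len(select_columns):
        return False
    return list(select_columns[-n:]) == list(partition_columns)
```

For the example above, the select list is [a, b] and the partition list is
[b], so the check passes; SELECT b, a with PARTITIONED BY (b) would not.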
This change also contains a fix for setting the PYTHONPATH environment
variable correctly, so you can run single python tests from the command
line.
Change-Id: I5f61854d36d1ee30cfcd1c6b2b3eb971f6cf4b2f
Reviewed-on: http://gerrit.cloudera.org:8080/1740
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Internal Jenkins
For wide Avro tables, ReadZLong() would get inlined many times into a
single function body, causing LLVM to crash. Not inlining doesn't seem
to have a performance impact on narrow tables, and helps with wide
tables.
This change also adds tests over wide (i.e. many-column) tables. The
test tables are produced by specifying shell commands to generate test
tables in functional_schema_template.sql, which are executed in
generate-schema-statements.py. In the SQL templates, sections starting
with a ` are treated as shell commands. The output of the shell
command is then used as the section text. This is only a starting
point; it isn't currently implemented for all sections, and may have
to be tweaked if we use this mechanism for all tables.
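
A minimal sketch of the backtick convention described above, assuming a
section is handed over as a single string. The function name
expand_section is hypothetical and this is not the code in
generate-schema-statements.py; in particular, how that script treats
leading whitespace or a closing backtick is not covered here.

```python
import subprocess


def expand_section(section_text):
    """Return the section text, running it as a shell command if needed.

    Hypothetical helper: a section that starts with a backtick is run
    through the shell and replaced by the command's standard output;
    any other section is returned unchanged.
    """
    if not section_text.startswith("`"):
        return section_text
    command = section_text[1:]
    result = subprocess.run(command, shell=True, check=True,
                            capture_output=True, text=True)
    return result.stdout
```

With check=True a failing command raises an exception instead of silently
producing an empty section.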
Change-Id: Ife0d857d19b21534167a34c8bc06bc70bef34910
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2206
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
(cherry picked from commit 1c5951e3cce25a048208ab9bb3a3aed95e41cf67)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2353
Tested-by: jenkins
Python modules on Red Hat systems might be in lib or in lib64, unlike Debian systems, which
symlink one to the other.
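
One way to cope with both layouts is to look under lib and lib64 and
de-duplicate the results, roughly as sketched below. This is an
illustration, not the fix made in this change; the function name and the
python*/site-packages layout are assumptions.

```python
import glob
import os


def find_site_packages(prefix):
    """Return site-packages directories under prefix/lib and prefix/lib64.

    Hypothetical helper. Paths that resolve to the same real directory
    (for example through a symlink) are reported only once.
    """
    found = []
    for libdir in ("lib", "lib64"):
        pattern = os.path.join(prefix, libdir, "python*", "site-packages")
        found.extend(sorted(glob.glob(pattern)))
    unique, seen = [], set()
    for path in found:
        real = os.path.realpath(path)
        if real not in seen:
            seen.add(real)
            unique.append(path)
    return unique
```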
Change-Id: Ia1e2d362e3d7e13b87c70e7578644827a5234a91
Reviewed-on: http://gerrit.ent.cloudera.com:8080/544
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
This patch allows Impala to start either Beeswax or HS2 on an
SSL-secured port. SSL is a certificate-based authentication scheme,
where the server provides a certificate to the client as part of the
handshake process. The client verifies that certificate, either by
contacting a trusted third-party certificate authority (CA), or by
accepting a 'self-signed' certificate from the server that is also
provided to the client out-of-band; the client simply compares the two
certificate copies.
Once the certificate is verified, the client and server negotiate an
encryption key for the session, using a public key provided by the
server to encrypt that negotiation. Therefore the server has to have
access to a private key in order to decrypt the encryption key.
Both certificate and key are stored in industry standard .PEM
format. Impala uses the same certificate and key for both Beeswax and
HS2, and the files containing the certificate and key are provided via
--ssl_server_certificate and --ssl_private_key. If either is non-blank,
SSL is enabled for Beeswax and HS2.
The Python shell supports SSL as of this patch via new --ssl and
--ca_cert flags.
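
To illustrate the client-side verification described above, here is a
sketch using Python's standard ssl module. It is not the shell's
implementation; the function name make_client_context is hypothetical, and
ca_cert stands for a PEM file holding either a CA certificate or the
server's self-signed certificate.

```python
import ssl


def make_client_context(ca_cert=None):
    """Build a client-side context that verifies the server certificate.

    Hypothetical helper. If ca_cert is None, the system's default trust
    store is used; otherwise only the certificates in that PEM file are
    trusted.
    """
    context = ssl.create_default_context(cafile=ca_cert)
    return context
```

A caller would then wrap a connected socket with
context.wrap_socket(sock, server_hostname=host); the handshake fails if
the server certificate cannot be verified.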
Finally, this patch also adds support for Impala's ThriftClients to use
SSL, paving the way for having the backend service use encryption on the
wire as well (although such a configuration is not used by this
patch). The client SSL support is currently used only for the new test
case.
This patch does not enable 'mutual' authentication, where clients
provide certificates to the server in order to authenticate
themselves. Impala has other authentication mechanisms for that purpose.
Change-Id: I3942aa0d21b34b7cda748292f04a9523f35ee6d4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/514
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>