17 Commits

Author SHA1 Message Date
Anurag Mantripragada
567b3cd04c IMPALA-9311: Store SQLPrimaryKeys in canonical order.
HMS seems to be returning SQLPrimaryKeys in inconsistent orders.
This makes some of the primary key tests flaky. This change sorts
the list of primary keys and stores them in canonical order within
Impala.
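
A minimal sketch of the idea, not Impala's actual code: sorting with a
fixed key makes the stored order independent of the order HMS returns.
The PrimaryKey tuple and the chosen sort key are assumptions made for
illustration only.

```python
from typing import NamedTuple


class PrimaryKey(NamedTuple):
    # Hypothetical stand-in for one primary key entry returned by HMS.
    table_name: str
    column_name: str
    key_seq: int


def canonical_order(keys):
    """Return the keys in a deterministic order, whatever the input order."""
    # Assumed sort key: table name, then column name, then key sequence.
    return sorted(keys, key=lambda k: (k.table_name, k.column_name, k.key_seq))


first_reply = [PrimaryKey("t", "year", 2), PrimaryKey("t", "id", 1)]
second_reply = list(reversed(first_reply))
# Both replies end up stored in the same order.
assert canonical_order(first_reply) == canonical_order(second_reply)
```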

Testing:
- Modified the tests that were relying on HMS to return the same order
  every time.
- Ran parametrized job.

Change-Id: I0f798d7a2659c6cd061002db151f3fa787eb6370
Reviewed-on: http://gerrit.cloudera.org:8080/15106
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2020-01-27 21:48:23 +00:00
Anurag Mantripragada
cfe60858da IMPALA-9158: Support loading primary key/foreign key constraints
in LocalCatalog Mode.

This change adds a new method 'loadConstraints()' to the MetaProvider
interface.

1. In the CatalogdMetaProvider implementation, we fetch the primary key
  (PK) and foreign key (FK) information via the GetPartialCatalogObject()
  RPC to the catalogd, which is modified to include PK/FK information.
  This works because the catalog side eagerly loads PK/FK information,
  so it can be sent to the local catalog in a single RPC.
  This information is then stored in the TableMetaRef object for future
  consumers.
2. In the DirectMetaProvider implementation, we make two RPCs to HMS
  to directly get PK/FK information.
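
The two implementations above can be sketched as follows. This is an
illustrative Python mirror of the described design, not the Java code in
the patch; the method names on the catalogd and HMS clients are
hypothetical.

```python
from abc import ABC, abstractmethod


class MetaProvider(ABC):
    """Hypothetical Python mirror of the interface described above."""

    @abstractmethod
    def load_constraints(self, table_name):
        """Return (primary_keys, foreign_keys) for the table."""


class CatalogdMetaProvider(MetaProvider):
    def __init__(self, catalogd):
        self.catalogd = catalogd  # assumed client exposing one combined RPC

    def load_constraints(self, table_name):
        # Single RPC: catalogd already holds eagerly loaded PK/FK information.
        info = self.catalogd.get_partial_catalog_object(table_name)
        return info["primary_keys"], info["foreign_keys"]


class DirectMetaProvider(MetaProvider):
    def __init__(self, hms):
        self.hms = hms  # assumed HMS client with one call per constraint type

    def load_constraints(self, table_name):
        # Two RPCs: one for primary keys, one for foreign keys.
        return (self.hms.get_primary_keys(table_name),
                self.hms.get_foreign_keys(table_name))
```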

loadConstraints() can be extended to include other constraints later
(for example, unique constraints).

Testing:
- Added tests in LocalCatalogTest, CatalogTest and PartialCatalogInfoTest
- This change also modifies the toSqlUtil for show create table
  statements. Added a test for the same.

Change-Id: I7ea7e1bacf6eb502c67caf310a847b32687e0d58
Reviewed-on: http://gerrit.cloudera.org:8080/14731
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-01-18 03:36:37 +00:00
Vihang Karajgaonkar
6ebea33a9d IMPALA-9092: Add support for creating external Kudu table
In HMS-3 the translation layer converts a managed Kudu table into an
external Kudu table and sets the additional table property
'external.table.purge' to 'true'. This means any installation which
is using HMS-3 (or a Hive version which has HIVE-22158) will always
create Kudu tables as external tables. This is problematic since the
output of show create table will now be different and may confuse
the users.

In order to improve the user experience of such synchronized tables
(external tables with external.table.purge property set to true),
this patch adds support in Impala to create
external Kudu tables. Previous versions of Impala disallowed
creating an external Kudu table if the Kudu table did not exist.
After this patch, Impala will check if the Kudu table exists and, if
it does not, it will create a Kudu table based on the schema provided
in the create table statement. The command will error out if the Kudu
table already exists. However, this applies only to synchronized
tables. The previous way of creating a pure external table behaves the
same as before.

The following syntax for creating a synchronized table is now allowed:

CREATE EXTERNAL TABLE foo (
  id int PRIMARY KEY,
  name string)
PARTITION BY HASH PARTITIONS 8
STORED AS KUDU
TBLPROPERTIES ('external.table.purge'='true')

The syntax is very similar to creating a managed table, except for
the EXTERNAL keyword and the additional table property. A synchronized
table will behave similarly to managed Kudu tables (drops and renames
are allowed). The output of show create table on a synchronized
table will display the full column and partition spec, as it does for
managed tables.

Testing:
1. After the CDP version bump all of the existing Kudu tables now
create synchronized tables so there is good coverage there.
2. Added additional tests which create synchronized tables and
compare the show create table output.
3. Ran exhaustive tests with both CDP and CDH builds.

Change-Id: I76f81d41db0cf2269ee1b365857164a43677e14d
Reviewed-on: http://gerrit.cloudera.org:8080/14750
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-12-13 23:02:13 +00:00
norbert.luksa
288c8c41b5 IMPALA-8755: Frontend support for Z-ordering
Extended the SQL grammar with an optional and a default flag for
SORT BY, namely ZORDER and LEXICAL. If set, the new 'sort.algorithm'
table property will be set to ZORDER and the information will sink
down to the backend. The default order is indicated by LEXICAL
and can be omitted. Examples are:

CREATE TABLE t (a INT, b INT) PARTITIONED BY (c INT)
  SORT BY ZORDER (a, b);
CREATE TABLE t SORT BY ZORDER (int_col,id) LIKE u;
CREATE TABLE t LIKE PARQUET '/foo' SORT BY ZORDER (id,zip);

ALTER TABLE t SORT BY ZORDER (int_col,id);

The following two are the same statements:
CREATE TABLE t (a INT, b INT) SORT BY (a, b);
CREATE TABLE t (a INT, b INT) SORT BY LEXICAL (a, b);

For strings, varchars, floats and doubles Z-ordering is currently
not supported. It's not suitable for strings and varchars, but
support can be added for floats and doubles later. The supported
types are: boolean, int types, decimals, date, timestamp, and char.

Currently ZORDER has the same functionality as a simple SORT BY clause
and is therefore hidden behind a feature flag: unlock_zorder. The custom
sorting with Z-ordering will be in a different commit later.
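
For readers unfamiliar with Z-ordering, the sketch below shows the
textbook bit-interleaving idea for two unsigned integers. It is only an
illustration: the backend sorting arrives in a later commit and may be
implemented differently, and the function name is hypothetical.

```python
def interleave_bits(x: int, y: int, bits: int = 32) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i + 1)  # x takes the odd bit positions
        key |= ((y >> i) & 1) << (2 * i)      # y takes the even bit positions
    return key


rows = [(3, 0), (0, 1), (1, 1), (2, 2)]
# Sorting by the interleaved key weighs both columns instead of letting
# the first column dominate, as a lexical sort would.
z_sorted = sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
lexical_sorted = sorted(rows)
```

With these rows, the Z-order result places (3, 0) before (2, 2), whereas
the lexical sort places (2, 2) first.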

Testing:
 * Added tests for the ZORDER option for every SORT BY test.
 * Modified some tests by adding the LEXICAL option.
 * The .test workloads are temporarily put in separate test files
   in order to set up the feature flag. These tests are run from
   tests/custom_cluster/test_zorder.py which is a duplication of
   the relevant tests, but with CustomClusterTestSuite decorator.

Change-Id: Ie122002ca8f52ca2c1e1ec8ff1d476ae1f4f875d
Reviewed-on: http://gerrit.cloudera.org:8080/13955
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-09-26 18:35:06 +00:00
Tianyi Wang
3cb784310f IMPALA-7347: Ignore numFilesErasureCoded in TestShowCreateTable
This table property only exists for HDFS tables. To get the test to work
on local tables, it needs to be ignored.
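
A minimal sketch of ignoring a property when comparing table properties
in a test. The helper and constant names are hypothetical and do not
reflect the actual test code.

```python
# Properties that may be present on some filesystems only (assumed set).
IGNORED_PROPERTIES = {"numFilesErasureCoded"}


def comparable_properties(properties):
    """Drop properties that should not take part in the comparison."""
    return {k: v for k, v in properties.items() if k not in IGNORED_PROPERTIES}


hdfs_props = {"numFiles": "1", "numFilesErasureCoded": "0"}
local_props = {"numFiles": "1"}
# Both property maps compare equal once the ignored key is removed.
assert comparable_properties(hdfs_props) == comparable_properties(local_props)
```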

Change-Id: Icc8494fb91c4777cee662a97f750486aa8e79a8e
Reviewed-on: http://gerrit.cloudera.org:8080/11192
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-13 21:36:16 +00:00
Tianyi Wang
fb3d47d356 IMPALA-7347: Update tests to accommodate HIVE-18118
HIVE-18118 adds 'numFilesErasureCoded' to table properties. This patch
adds it to test_show_create_table to work with the latest Hive.

Change-Id: I6aae402dd38374de90b35c32166a9507e6eb29f9
Reviewed-on: http://gerrit.cloudera.org:8080/11108
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-02 20:25:31 +00:00
Fredy Wijaya
8173e9ab4d IMPALA-6571: NullPointerException in SHOW CREATE TABLE for HBase tables
This patch fixes the NullPointerException in SHOW CREATE TABLE for HBase
tables.

Testing:
- Moved the content of hbase-show-create-table.test back to
  show-create-table.test
- Ran show-create-table end-to-end tests

Change-Id: Ibe018313168fac5dcbd80be9a8f28b71a2c0389b
Reviewed-on: http://gerrit.cloudera.org:8080/9884
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2018-04-04 00:12:30 +00:00
Lars Volker
1ada9dac88 IMPALA-4166: Add SORT BY sql clause
This change adds support for adding SORT BY (...) clauses to CREATE
TABLE and ALTER TABLE statements. Examples are:

CREATE TABLE t (i INT, j INT, k INT) PARTITIONED BY (l INT) SORT BY (i, j);
CREATE TABLE t SORT BY (int_col,id) LIKE u;
CREATE TABLE t LIKE PARQUET '/foo' SORT BY (id,zip);

ALTER TABLE t SORT BY (int_col,id);
ALTER TABLE t SORT BY ();

Sort columns can only be specified for Hdfs tables and effectiveness may
vary based on storage type; for example TEXT tables will not see
improved compression. The SORT BY clause must not contain clustering
columns. The columns in the SORT BY clause are stored in the
'sort.columns' table property and will result in an additional SORT node
being added to the plan before the final table sink. Specifying sort
columns also enables clustering during inserts, so the SORT node will
contain all partitioning columns first, followed by the sort columns. We
do this because sort columns add a SORT node to the plan and adding the
clustering columns to the SORT node is cheap.
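
The column ordering described above can be sketched as follows. This is
an assumption-laden illustration rather than planner code; the function
name is hypothetical.

```python
def sort_node_columns(partition_columns, sort_columns):
    """Order used by the SORT node: partitioning columns, then sort columns."""
    overlap = set(partition_columns) & set(sort_columns)
    if overlap:
        # The SORT BY clause must not contain clustering columns.
        raise ValueError(
            f"SORT BY must not contain partition columns: {sorted(overlap)}")
    return list(partition_columns) + list(sort_columns)


# For: CREATE TABLE t (i INT, j INT, k INT) PARTITIONED BY (l INT) SORT BY (i, j)
order = sort_node_columns(["l"], ["i", "j"])
```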

Sort columns supersede the sortby() hint, which we will remove in a
subsequent change (IMPALA-5144). Until then, it is possible to specify
sort columns using both ways at the same time and the column lists
will be concatenated.

Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
Reviewed-on: http://gerrit.cloudera.org:8080/6495
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins
2017-05-12 15:43:30 +00:00
Joe McDonnell
5755261954 IMPALA-4036: invalid SQL generated for partitioned table with comment
For a table that has both a table comment and a partition specified,
"show create table" incorrectly outputs the comment before the partition.
This is not the correct order, and it results in invalid SQL.

This change fixes the ordering (partition comes before comment) and
adds tests for this case.
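
A minimal sketch of the corrected clause order, assuming a simplified
statement builder. The function is hypothetical, is not the frontend
code, and assumes the comment contains no quote characters.

```python
def create_table_sql(name, columns, partition_columns=(), comment=None):
    """Render CREATE TABLE with PARTITIONED BY ahead of COMMENT."""

    def column_list(cols):
        return ", ".join(f"{col} {typ}" for col, typ in cols)

    lines = [f"CREATE TABLE {name} ({column_list(columns)})"]
    if partition_columns:
        lines.append(f"PARTITIONED BY ({column_list(partition_columns)})")
    if comment is not None:
        # The comment clause is emitted only after the partition clause.
        lines.append(f"COMMENT '{comment}'")
    return "\n".join(lines)


sql = create_table_sql("t", [("a", "INT")], [("p", "INT")], "a table")
```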

Change-Id: I29a33cfd142b473997fdc3acfe3f0966bc7ed784
Reviewed-on: http://gerrit.cloudera.org:8080/5648
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2017-01-12 20:41:35 +00:00
Dimitris Tsirogiannis
1da57019ad IMPALA-4579: SHOW CREATE VIEW fails for view containing a subquery
This commit fixes an issue where a SHOW CREATE VIEW statement throws an
analysis error if the view contains a subquery.

Change-Id: I4a89e46a022f0ccec198b6e3e2b30230103831ce
Reviewed-on: http://gerrit.cloudera.org:8080/5333
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
2016-12-04 08:35:15 +00:00
Tim Armstrong
9894cf6a55 IMPALA-783: add show create view as alias for show create table
SHOW CREATE TABLE already outputs information for views. As a
convenience, this patch adds SHOW CREATE VIEW as an alias for SHOW
CREATE TABLE.

Switch some SHOW CREATE TABLE tests to use SHOW CREATE VIEW and add
an additional test for SHOW CREATE VIEW on a table so that the expected
behaviour is tested.

Change-Id: I9925e0789573e9b097a2ef52b5023964dcf8f32c
Reviewed-on: http://gerrit.cloudera.org:8080/1661
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-01-20 04:32:21 +00:00
Lars Volker
82a1aef91b IMPALA-1687: Expand CTAS to allow partition clauses
This change implements support for PARTITIONED BY clauses in CTAS
statements. The syntax and semantics follow the PARTITION feature of
insert from select statements: inside the PARTITIONED BY (...) column
list the user must specify names of the columns to partition by. These
column names must appear in that particular order at the end of the
select statement. A remapping between columns of the source and
destination tables is not possible, because the destination table does
not yet exist. Specifying static values for the partition columns is
also not possible, as their type needs to be deduced from columns in the
select statement. Example:

CREATE TABLE t (a DOUBLE, b INT);
INSERT INTO t VALUES (1.5, 3);
CREATE TABLE p PARTITIONED BY (b) AS SELECT a, b FROM t;
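
The ordering rule can be sketched as a small check. This is an
illustration under the rules stated above, not the analyzer code; the
function name is hypothetical.

```python
def split_ctas_columns(select_columns, partition_columns):
    """Return the non-partition columns, or raise if the rule is violated.

    The partition columns must be the trailing select columns, in order.
    """
    n = len(partition_columns)
    split_at = len(select_columns) - n
    if split_at < 0 or list(select_columns[split_at:]) != list(partition_columns):
        raise ValueError(
            "partition columns must appear in order at the end of the select list")
    return list(select_columns[:split_at])


# For: CREATE TABLE p PARTITIONED BY (b) AS SELECT a, b FROM t
data_columns = split_ctas_columns(["a", "b"], ["b"])
```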

This change also contains a fix for setting the PYTHONPATH environment
variable correctly, so you can run single python tests from the command
line.

Change-Id: I5f61854d36d1ee30cfcd1c6b2b3eb971f6cf4b2f
Reviewed-on: http://gerrit.cloudera.org:8080/1740
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Internal Jenkins
2016-01-18 16:55:45 +00:00
Tim Armstrong
ab3e9f19bf IMPALA-783: view support for show create table
SHOW CREATE TABLE now supports views. It returns a CREATE VIEW statement
with column names and the original sql statement.

Authorization allows SHOW CREATE TABLE to be run on a view if the user has
VIEW_METADATA privilege on the view and SELECT privilege on all
underlying views and tables.

E.g. "SHOW CREATE TABLE some_view" returns output of form:
CREATE VIEW a_database.some_view (id, bool_col, tinyint_col) AS
SELECT id, bool_col, tinyint_col FROM functional.alltypes

Change-Id: Id633af2f5c1f5b0e01c13ed85c4bf9c045dc0666
Reviewed-on: http://gerrit.cloudera.org:8080/713
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-12-17 03:28:32 +00:00
ishaan
8369c3b13b Remove explicit references to functional_hbase tables from .test files.
Additionally, this patch also disables the hbase/none test dimension if the
TARGET_FILESYSTEM environment variable is set to either s3 or isilon.

Change-Id: I63aecaa478d2ba9eb68de729e9640071359a2eeb
Reviewed-on: http://gerrit.cloudera.org:8080/74
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2015-02-23 23:32:41 +00:00
Alex Behm
19bab59854 Create/alter/describe tables with complex types.
This patch adds parsing of complex types and tests for using complex
types in various exprs and create/alter/describe stmts.

Change-Id: Ibc211a560c889f5ccfb616813700b923c89d8245
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3577
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3594
2014-07-23 17:26:14 -07:00
Lenni Kuff
76fa3b2ded Update DDL to support 'STORED AS PARQUET' and 'STORED AS AVRO' syntax
This change updates our DDL syntax support to allow for using 'STORED AS PARQUET'
as well as 'STORED AS PARQUETFILE'. Moving forward we should prefer the new syntax,
but continue to support the old.  I made the same change for 'AVROFILE', but since
we have not yet documented the 'AVROFILE' syntax I left out support for the old syntax.

Change-Id: I10c73a71a94ee488c9ae205485777b58ab8957c9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1053
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:18 -08:00
Matthew Jacobs
51bfc99c63 IMPALA-395: Impala "show create table" statement
Adds support for "show create table", a DDL statement that outputs a DDL statement that
creates the specified table.

In general, the output DDL works in Impala, so a user can copy the output and execute it
to create the same table. However, there are a few special cases that output Hive DDL
because we do not support creating some tables in Impala: HBase tables and tables with
LZO compressed text. When we do support creating these tables in Impala, users should
be able to execute the DDL in Impala as well.

Change-Id: I8c130297a657810dea5b994bf99d72b0e61b847b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/842
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
2014-01-08 10:53:53 -08:00