This patch changes the syntax of the statement that creates a JDBC table to:
  CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
    (col_name data_type
      [constraint_specification]
      [COMMENT 'col_comment']
      [, ...]
    )
    [COMMENT 'table_comment']
    STORED BY JDBC
    TBLPROPERTIES ('key1'='value1', 'key2'='value2', ...)
Both "STORED BY JDBC" and "STORED AS JDBC" are accepted. The table
property '__IMPALA_DATA_SOURCE_NAME' is added to the JDBC table with
the value 'impalajdbcdatasource' and appears in the output of the
'show create table' command.
The following required JDBC parameters must be specified as table
properties: database.type, jdbc.url, jdbc.driver, driver.url, and table.
Otherwise, an AnalysisException is thrown.
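For illustration, a statement in the new syntax might look like the
sketch below. The table name, the column list, and every property value
(the Postgres URL, the driver class, the jar path, and the
'database.type' value) are hypothetical placeholders rather than values
taken from this patch; only the five property keys come from the list
above.

```sql
-- Sketch only: all names and values are assumed for illustration.
CREATE TABLE IF NOT EXISTS example_db.customers_jdbc (
  id INT,
  name STRING COMMENT 'customer name'
)
COMMENT 'hypothetical table backed by a remote RDBMS table'
STORED BY JDBC
TBLPROPERTIES (
  'database.type'='POSTGRES',                                -- assumed value
  'jdbc.url'='jdbc:postgresql://db-host:5432/example',       -- assumed host and database
  'jdbc.driver'='org.postgresql.Driver',
  'driver.url'='/path/to/jdbc-drivers/postgresql-jdbc.jar',  -- assumed location
  'table'='customers'                                        -- assumed remote table name
);
```

Leaving out any of the five required keys should make the statement
fail analysis with an AnalysisException, as described above.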
Testing:
- Added frontend unit tests for the new syntax for creating JDBC tables.
- Updated end-to-end unit tests to create JDBC tables without a data
  source.
- Passed core tests.
Change-Id: I765aa86b430246786ad85ab6857cefaf4332c920
Reviewed-on: http://gerrit.cloudera.org:8080/21016
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
for external data source table
This patch adds support for using the DATE data type in predicates
for external data sources.
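As a rough sketch, queries of the following shape are the kind this
change targets. The table `sales_jdbc` and its DATE column `sale_date`
are hypothetical names chosen for illustration and are not part of this
patch.

```sql
-- Sketch only: `sales_jdbc` is an assumed external data source table
-- with an assumed DATE column `sale_date`.
SELECT COUNT(*) FROM sales_jdbc WHERE sale_date = DATE '2024-01-15';
SELECT COUNT(*) FROM sales_jdbc WHERE sale_date >= DATE '2024-01-01';
SELECT COUNT(*) FROM sales_jdbc
WHERE sale_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-31';
```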
Testing:
- Added tests for date predicates with operators:
'=', '>', '<', '>=', '<=', '!=', 'BETWEEN'.
Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4
Reviewed-on: http://gerrit.cloudera.org:8080/20915
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch uses the "external data source" mechanism in Impala to
implement a data source for querying RDBMS tables through JDBC.
It has some limitations due to the restrictions of the "external data
source" mechanism:
- It is not distributed, i.e., the fragment is unpartitioned. Queries
  are executed on the coordinator.
- Queries that read the following data types from external JDBC tables
  are not supported:
  BINARY, CHAR, DATETIME, and COMPLEX.
- Only binary predicates with the operators =, !=, <=, >=, <, and >
  can be pushed down to the RDBMS.
- The following data types are not supported in predicates:
  DECIMAL, TIMESTAMP, DATE, and BINARY.
- External tables with columns of complex types are not supported.
- Support is limited to the following databases:
  MySQL, Postgres, Oracle, MSSQL, H2, DB2, and JETHRO_DATA.
- Catalog V2 is not supported (IMPALA-7131).
- DataSource objects are not persistent (IMPALA-12375).
Additional fixes are planned on top of this patch.
Source files under jdbc/conf, jdbc/dao and jdbc/exception are
replicated from the Hive JDBC Storage Handler.
To query the RDBMS tables, follow these steps (note that any existing
data source tables will be rebuilt):
1. Make sure the Impala cluster has been started.
2. Copy the jar files of the JDBC drivers and the data source library
   into HDFS.
     ${IMPALA_HOME}/testdata/bin/copy-ext-data-sources.sh
3. Create an `alltypes` table in the Postgres database.
     ${IMPALA_HOME}/testdata/bin/load-ext-data-sources.sh
4. Create the data source tables (alltypes_jdbc_datasource and
   alltypes_jdbc_datasource_2).
     ${IMPALA_HOME}/bin/impala-shell.sh -f \
       ${IMPALA_HOME}/testdata/bin/create-ext-data-source-table.sql
5. The data source tables created in the previous step are now ready
   to query. There is no need to restart the Impala cluster.
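Once the steps above have completed, a quick sanity check could look
like the sketch below. It assumes that the data source table mirrors
the usual `alltypes` schema and therefore has an integer `id` column;
that column name is an assumption, not something stated in this
message.

```sql
-- Sketch only: verify that the table is reachable.
SELECT COUNT(*) FROM alltypes_jdbc_datasource;

-- A binary predicate with '=' is one of the forms that can be pushed
-- down to the RDBMS; the `id` column is assumed.
SELECT * FROM alltypes_jdbc_datasource WHERE id = 10;
```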
Testing:
- Added unit tests for Postgres and ran them with JDBC driver
  postgresql-42.5.1.jar.
- Manually ran unit tests for MySQL with JDBC driver
  mysql-connector-j-8.1.0.jar.
- Ran core tests successfully.
Change-Id: I8244e978c7717c6f1452f66f1630b6441392e7d2
Reviewed-on: http://gerrit.cloudera.org:8080/17842
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Kurt Deschler <kdeschle@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>