Commit Graph

9 Commits

Author SHA1 Message Date
Alex Behm
e9864d5f78 Introduce type hierarchy and add complex types.
This patch replaces ColumnType with a hierarchy of types that models
the existing scalar types as well as the new complex types ARRAY, MAP,
and STRUCT.

Change-Id: Ia895f41153e99febb0c35412acac12689c3c2064
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3491
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3538
2014-07-21 20:00:46 -07:00
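The type hierarchy described in e9864d5f78 can be pictured with a minimal sketch. This is not the patch's code (the real change is in Impala's own sources). The class names Type, ScalarType, ArrayType, MapType and StructType, and the to_sql() and is_complex() helpers, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


class Type:
    """Hypothetical root of the hierarchy; scalar and complex types extend it."""

    def is_complex(self) -> bool:
        return False

    def to_sql(self) -> str:
        raise NotImplementedError


@dataclass(frozen=True)
class ScalarType(Type):
    name: str  # e.g. "INT" or "STRING"

    def to_sql(self) -> str:
        return self.name


@dataclass(frozen=True)
class ArrayType(Type):
    item: Type

    def is_complex(self) -> bool:
        return True

    def to_sql(self) -> str:
        return f"ARRAY<{self.item.to_sql()}>"


@dataclass(frozen=True)
class MapType(Type):
    key: Type
    value: Type

    def is_complex(self) -> bool:
        return True

    def to_sql(self) -> str:
        return f"MAP<{self.key.to_sql()},{self.value.to_sql()}>"


@dataclass(frozen=True)
class StructType(Type):
    fields: Tuple[Tuple[str, Type], ...]  # (field name, field type) pairs

    def is_complex(self) -> bool:
        return True

    def to_sql(self) -> str:
        inner = ",".join(f"{n}:{t.to_sql()}" for n, t in self.fields)
        return f"STRUCT<{inner}>"


# Complex types nest arbitrarily because every member is itself a Type.
nested = MapType(ScalarType("STRING"), ArrayType(ScalarType("INT")))
print(nested.to_sql())
```

Because each complex type holds other Type instances, a single flat enum-like ColumnType is no longer enough, which is presumably why a hierarchy was introduced.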
Victor Bittorf
2d7f2e19b2 IMPALA-938: Infer schema from Parquet file
Syntax is "CREATE TABLE name LIKE fileformat '/path/to/file'".
Supports all options that CREATE TABLE does. Currently only PARQUET is supported.
Run testdata/bin/create-load-data.sh after pulling this patch.

Change-Id: Ibb9fbb89dbde6acceb850b914c48d12f22b33f55
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2720
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3158
2014-06-20 17:38:01 -07:00
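As a small illustration of the syntax quoted in 2d7f2e19b2, the sketch below assembles such a statement as a string. The helper name create_table_like_file and the table name are hypothetical. Only the statement shape and the PARQUET-only restriction come from the commit message.

```python
def create_table_like_file(table, path, file_format="PARQUET"):
    """Build a "CREATE TABLE name LIKE fileformat '/path/to/file'" statement."""
    fmt = file_format.upper()
    if fmt != "PARQUET":
        # The commit message states that only PARQUET is currently supported.
        raise ValueError("only PARQUET is supported")
    if "'" in path:
        # Assumption of this sketch: no quote escaping is attempted.
        raise ValueError("path must not contain single quotes")
    return f"CREATE TABLE {table} LIKE {fmt} '{path}'"


print(create_table_like_file("inferred_tbl", "/path/to/file"))
```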
Matthew Jacobs
f5da019555 IMPALA-1025: Use converse of data source predicate operators if expr has val before slot
Change-Id: I31790c037e2fa9af7b80c01014f7507ba5053e63
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2925
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-06-09 23:54:09 -07:00
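The idea behind f5da019555 can be sketched as follows: a predicate written with the value first, such as 5 < x, is equivalent to x > 5, so the operator must be replaced by its converse (not its negation) when the operands are swapped. The Slot class, the normalize() helper and the operator set are assumptions for illustration, not Impala's actual code.

```python
from dataclasses import dataclass

# Converse of each comparison operator: a OP b  <=>  b CONVERSE[OP] a.
CONVERSE = {"=": "=", "!=": "!=", "<": ">", "<=": ">=", ">": "<", ">=": "<="}


@dataclass(frozen=True)
class Slot:
    """Hypothetical stand-in for a column (slot) reference."""
    name: str


def normalize(lhs, op, rhs):
    """Return (slot, op, value) with the slot always on the left."""
    if isinstance(lhs, Slot) and not isinstance(rhs, Slot):
        return lhs, op, rhs
    if isinstance(rhs, Slot) and not isinstance(lhs, Slot):
        # Value came before the slot: swap operands and use the converse.
        return rhs, CONVERSE[op], lhs
    raise ValueError("expected exactly one slot and one value")


print(normalize(5, "<", Slot("x")))
```

Note that equality and inequality are their own converses, so only the four ordering operators actually change.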
Matthew Jacobs
0fa2d6db9b External Data Source: Set scan handle parameter in close()
Change-Id: Ibd2d61ba52a4532b0f7b79224f70abbff1b363e4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2519
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 3d4f9f44d512bb5c16c89716dec29dbf1463dfa1)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2535
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-05-12 22:51:08 -07:00
Matthew Jacobs
0c533bb152 External Data Source: Backend changes
Change-Id: Ifa62b4ea231da47facb31c3f8d43e5e3ac73591f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2284
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f1e5db2853135c4346788192e2dbc632d4fe1dfb)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2497
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-05-09 02:24:41 -07:00
Matthew Jacobs
ebc6c5894e External Data Source: Frontend and catalog changes
Initial frontend and catalog changes for external data sources.

Change-Id: Ia0e61ef97cfd7a4e138ef555c17f2e45bbf08c18
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2224
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit dfa14c828957f751db9c89bae0bdc040ce6f648c)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2485
2014-05-08 14:56:19 -07:00
Matthew Jacobs
61b36a42bd External Data Source: Few small API changes
* Renames getStats() to prepare()
* Adds TRowBatch.num_rows to indicate the number of rows when no cols are
  materialized
* Changes the api and sample poms to produce source jars

Change-Id: I02dcc89e27716978708386cfc3f7940ee5dbc023
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2406
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 2d7fcba8b7442b54a388f8b994d0cfa08940bbd7)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2434
2014-05-02 17:10:25 -07:00
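The purpose of TRowBatch.num_rows in 61b36a42bd can be sketched like this: when a query materializes no columns, the batch carries no column data from which a row count could be derived, so an explicit count is needed. The RowBatch class and its cols field below are hypothetical stand-ins for the Thrift structure. Only the num_rows name comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RowBatch:
    """Hypothetical stand-in for the Thrift TRowBatch structure."""
    cols: List[list] = field(default_factory=list)  # one list per column
    num_rows: Optional[int] = None  # needed when cols is empty


def row_count(batch: RowBatch) -> int:
    """Derive the row count from column data, else fall back to num_rows."""
    if batch.cols:
        return len(batch.cols[0])
    if batch.num_rows is None:
        raise ValueError("num_rows must be set when no columns are materialized")
    return batch.num_rows


print(row_count(RowBatch(cols=[[1, 2, 3], ["a", "b", "c"]])))
print(row_count(RowBatch(num_rows=7)))
```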
Matthew Jacobs
1f07f2d7ee External Data Source: Thrift structure changes
A few changes to the external data source thrift types:
* Changes RowBatch to return entire columns. Adds Data.TColumnData to
  represent an entire column.
* Makes all fields in ExternalDataSource (except for status fields on
  the result structures) optional in case fields become deprecated in
  the future.
* Adds a limit parameter to the TOpenParams structure in case the
  data source needs to apply the limit itself.

Change-Id: I62db68bfb64d2190dfdd0c84be5925ad5db031ef
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2345
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit faf220d628359be1368f898493900fc2e2913c53)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2385
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-04-27 12:57:13 -07:00
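Two of the changes in 1f07f2d7ee, returning entire columns and passing a limit to the data source, can be illustrated together with a rough sketch. The to_columns() helper and its parameters are hypothetical. The actual Thrift types (TColumnData, TOpenParams) are not modeled here.

```python
def to_columns(rows, num_cols, limit=None):
    """Turn row tuples into one list per column, applying an optional limit."""
    if limit is not None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        # The data source applies the limit itself before building columns.
        rows = rows[:limit]
    return [[row[i] for row in rows] for i in range(num_cols)]


rows = [(1, "a"), (2, "b"), (3, "c")]
print(to_columns(rows, num_cols=2, limit=2))
```

Passing num_cols explicitly keeps the column count well defined even when no rows are returned.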
Matthew Jacobs
25c0ebf58c External Data Source: Public API
Adds the thrift structures for the public external data source API
and a new maven project containing the Java ExternalDataSource
interface and the generated Java thrift classes.

The ExternalDataSource.thrift structures can evolve in a backward
compatible way. The ExternalDataSource Java interface will always
contain a version number in the namespace (e.g.
com.cloudera.impala.extdatasource.v1 for V1) so we can potentially
make breaking changes to the interface in the future but still
support older versions.

A trivial implementation of the ExternalDataSource API is also
added for testing purposes.
TODO: Make the sample data source implementation realistic.

Change-Id: I827d6420a87ed7a2bce34e050362ca98ddc5dbcc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2241
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f29814e9ede9d4c889f2648606fcf511feeb47ae)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2313
2014-04-22 18:34:48 -07:00