Impala detects the HDFS version by reading the Namenode web UI and runs
the corresponding check.
On 4.1, Impala tries to check the datanode (server side) config by reading
the datanode web UI.
- added PlanNode.numNodes, PlanNode.avgRowSize and PlanNode.computeStats()
- fixing up some cardinality estimates
- Planner now tries to make a cost-based decision between a broadcast join and a join that fully repartitions both inputs
- ExchangeNode now distinguishes between its input and output row descriptor: the output potentially contains more tuples
- fixed a problem related to cancellation and concurrent hash table builds
Not included:
- partitioned joins that take advantage of existing partitions of the inputs; those will have to wait for a follow-on change
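The cost-based choice between the two join strategies can be illustrated with a minimal sketch. It assumes a simple network-transfer cost model (bytes sent), which is a common heuristic and not necessarily the formula this change implements; InputStats, choose_join_strategy and the strategy names are hypothetical and only loosely mirror PlanNode.numNodes and PlanNode.avgRowSize.

```python
from dataclasses import dataclass


@dataclass
class InputStats:
    """Hypothetical per-input estimates, loosely mirroring PlanNode stats."""
    cardinality: int     # estimated number of rows
    avg_row_size: float  # estimated bytes per row

    @property
    def total_bytes(self) -> float:
        return self.cardinality * self.avg_row_size


def choose_join_strategy(lhs: InputStats, rhs: InputStats, num_nodes: int) -> str:
    """Pick the strategy that moves fewer bytes (assumed cost model)."""
    # Broadcast: the build side (rhs) is copied to every node.
    broadcast_cost = rhs.total_bytes * num_nodes
    # Full repartitioning: each input is sent over the network once.
    partition_cost = lhs.total_bytes + rhs.total_bytes
    return "broadcast" if broadcast_cost <= partition_cost else "partition"
```

Under this model a small build side favors broadcast, while two inputs of similar size favor repartitioning once the number of nodes exceeds two.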
Adds support for:
* ALTER TABLE <table> PARTITION (partitionSpec) SET FILEFORMAT
* ALTER TABLE <table> PARTITION (partitionSpec) SET LOCATION
This enables setting the location and fileformat of specific partitions.
This patch adds support for:
- ALTER TABLE ADD|REPLACE COLUMNS
- ALTER TABLE DROP COLUMN
- ALTER TABLE ADD|DROP PARTITION
- ALTER TABLE SET FILEFORMAT
- ALTER TABLE SET LOCATION
- ALTER TABLE RENAME
This includes:
- adding Expr.numDistinctValues and Expr.selectivity
- propagating column stats to SlotDescriptor
- adding TupleDescriptor.avgSerializedSize
- adding PlanNode.selectivity and changing the finalize() implementations in
the subclasses to compute selectivity
Not included:
- some cleanup (still has extra logging output)
- tests (those will show up as part of the planner tests for repartitioning joins)
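As a rough illustration of how distinct-value counts and selectivity can feed a cardinality estimate, here is a minimal sketch. It assumes the textbook heuristics (an equality predicate selects 1/NDV of the rows, conjuncts are independent), which may differ from what Expr.selectivity and PlanNode.selectivity actually compute; the function names and the 0.1 default are hypothetical.

```python
from typing import Iterable, Optional


def equality_selectivity(num_distinct_values: int) -> Optional[float]:
    """Assumed heuristic: 'col = const' selects 1/NDV of the rows.

    Returns None when the distinct-value count is unknown.
    """
    return 1.0 / num_distinct_values if num_distinct_values > 0 else None


def combined_selectivity(selectivities: Iterable[Optional[float]],
                         default: float = 0.1) -> float:
    """Multiply conjunct selectivities, assuming independent predicates.

    Unknown selectivities fall back to a hypothetical default.
    """
    result = 1.0
    for sel in selectivities:
        result *= default if sel is None else sel
    return result


def estimate_cardinality(input_rows: int,
                         selectivities: Iterable[Optional[float]]) -> int:
    """Scale the input cardinality by the combined selectivity."""
    return round(input_rows * combined_selectivity(selectivities))
```

For example, two conjuncts with selectivities 0.25 and 0.5 applied to 1000 input rows give an estimate of 125 rows.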
* Changed frontend analysis for HBase tables
* Changed Thrift messages to allow HBase as a sink type.
* JNI Wrapper around htable
* Create hbase-table-sink
* Create hbase-table-writer
* Statically initialize much of the JNI-related code for HBase.
* Cleaned up some cpplint issues.
* Changed junit analysis tests
* Create a new HBase test table.
* Added functional tests for HBase inserts.
This patch adds support for the following SQL constructs:
- Unary + operator
- The ALL keyword, in SELECT ALL and SELECT aggregate_func(ALL *)
- REAL and INTEGER as type synonyms for DOUBLE and INT respectively
- The AS keyword after a table spec, e.g., SELECT * FROM tbl AS t0
This makes partition pruning more effective by extending it to predicates that are fully bound by the partition column,
e.g., '<col> IN (1, 2, 3)' will also be used to prune partitions, in addition to equality and binary comparisons.
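The idea of pruning with any predicate that depends only on the partition column can be sketched as follows. This is an illustrative model, not the planner's code: partitions are represented as a hypothetical mapping from partition-key value to location, and the predicate is an ordinary Python callable.

```python
from typing import Callable, Dict, List


def prune_partitions(partitions: Dict[int, str],
                     predicate: Callable[[int], bool]) -> List[str]:
    """Keep only partitions whose key value satisfies the predicate.

    The predicate must reference nothing but the partition column,
    so it can be evaluated per partition without reading any rows.
    """
    return [path for key, path in sorted(partitions.items()) if predicate(key)]


# Hypothetical table partitioned by an integer 'month' column.
partitions = {m: f"/tbl/month={m}" for m in range(1, 6)}

# Equivalent of 'month IN (1, 2, 3)'.
in_pred = lambda month: month in (1, 2, 3)

# Equivalent of the binary comparison 'month > 4'.
gt_pred = lambda month: month > 4
```

Both predicate shapes go through the same pruning function, which is the point of extending pruning beyond equality and binary comparisons.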