impala/testdata
Vuk Ercegovac 08ca346f2e IMPALA-3562: support column restriction for compute stats
The 'compute stats' statement currently computes column-level
statistics for all columns of a table.
This adds potentially unneeded work for columns whose stats
are not needed by queries. It can be especially costly for
very wide tables and unneeded large string fields.

This change modifies the 'compute stats' statement (non-incremental
only) to support a user-specified list of columns for which stats
should be computed. An example of the extended syntax is as follows:

  compute stats my_db.my_table(column_a, column_b);

Although the phrase "for columns ..." is commonly used, the
'compute stats' syntax is already fairly unique (vs. 'analyze
table ...'), so this change favors brevity and uses a parenthesized
column list.

Currently, 'compute stats' without a column list is applied only to
the columns that can be analyzed, and the rest are skipped. With an
explicit column list, 'compute stats' instead results in an error
when a specified column cannot be analyzed (e.g., the column does not
exist, the column is of an unsupported type, or the column is a
partitioning column). Moreover, an empty column list can be supplied,
which means that no columns will be analyzed.
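
The difference between the two modes can be sketched in a few lines of
Python. This is only an illustration of the semantics described above,
not Impala's analysis code: the Column class, the UNSUPPORTED_TYPES set,
the function names, and the case-insensitive name lookup are all
assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a table column; not an Impala class."""
    name: str
    type_name: str
    is_partition_col: bool = False


# Assumed set of types without column stats support, for illustration only.
UNSUPPORTED_TYPES = frozenset({"ARRAY", "MAP", "STRUCT"})


def can_analyze(col: Column) -> bool:
    """A column is analyzable if it is not a partition column and has a supported type."""
    return not col.is_partition_col and col.type_name.upper() not in UNSUPPORTED_TYPES


def columns_to_analyze(table_cols: List[Column],
                       requested: Optional[List[str]]) -> List[str]:
    """Return the names of the columns whose stats would be computed.

    requested=None models 'compute stats t' (no column list): columns that
    cannot be analyzed are skipped silently.
    A list models 'compute stats t(...)': any column that cannot be analyzed
    raises an error, and an empty list selects no columns.
    """
    if requested is None:
        return [c.name for c in table_cols if can_analyze(c)]

    # Assumption: column names are matched case-insensitively.
    by_name = {c.name.lower(): c for c in table_cols}
    selected = []
    for name in requested:
        col = by_name.get(name.lower())
        if col is None:
            raise ValueError(f"column does not exist: {name}")
        if col.is_partition_col:
            raise ValueError(f"column is a partitioning column: {name}")
        if col.type_name.upper() in UNSUPPORTED_TYPES:
            raise ValueError(f"column is of an unsupported type: {name}")
        selected.append(col.name)
    return selected
```

With a table that has columns id INT, payload STRING, tags ARRAY and a
partitioning column year INT, passing None selects id and payload,
passing [] selects nothing, and passing ["tags"], ["year"] or
["missing"] raises an error instead of being skipped.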

Testing:
  - analyzing a subset of columns is already supported (since not all
    columns can be analyzed), so the focus of testing is to check
    that the user-specified columns are handled as expected.
  - tests include: parser tests, ddl analysis, end-to-end tests.

Change-Id: If8b25dd248e578dc7ddd35468125cca12d1b9f27
Reviewed-on: http://gerrit.cloudera.org:8080/9133
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2018-02-01 20:27:14 +00:00