mirror of
https://github.com/apache/impala.git
synced 2026-01-06 06:01:03 -05:00
Adds a new command to manually set the table-level column stats.
Syntax:
ALTER TABLE [<db_name>.]<tbl_name> SET COLUMN STATS <col_name>
('statsKey'='val','statsKey2'='val2')
Valid values for 'statsKey': numDVs, numNulls, avgSize, maxSize
The 'val' portion needs to be a number appropriate for the given stats
key (e.g., a long for numDVs, a float for avgSize).
The special value of '-1' is allowed to reset stats to 'unknown'.
The keys as well as the values are specified as string literals to be
consistent with the existing DDL for setting TBLPROPERTIES/SERDEPROPERTIES,
in particular, setting the 'numRows' table/partition property.
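The syntax above can be illustrated with a small sketch that assembles the statement text. The helper build_set_column_stats is hypothetical and not part of Impala. It assumes exact-case key matching and plain Python int/float values, and it only builds the string, without executing it.

```python
# Hypothetical helper, not part of Impala: it only assembles the statement text
# described in the commit message above.
VALID_STATS_KEYS = ("numDVs", "numNulls", "avgSize", "maxSize")


def build_set_column_stats(table, column, stats):
    """Return an ALTER TABLE ... SET COLUMN STATS statement as a string."""
    if not stats:
        raise ValueError("at least one stats key is required")
    pairs = []
    for key, value in stats.items():
        # Assumption: keys are matched exactly as spelled in the commit message.
        if key not in VALID_STATS_KEYS:
            raise ValueError("unknown stats key: %s" % key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool):
            raise TypeError("%s must be a number" % key)
        if key == "avgSize":
            # avgSize takes a float; an int such as -1 is accepted as well.
            if not isinstance(value, (int, float)):
                raise TypeError("avgSize must be a number")
        elif not isinstance(value, int):
            # numDVs, numNulls and maxSize are treated here as integer-valued.
            raise TypeError("%s must be an integer" % key)
        # Keys and values are both emitted as string literals.
        pairs.append("'%s'='%s'" % (key, value))
    return "ALTER TABLE %s SET COLUMN STATS %s (%s)" % (
        table, column, ",".join(pairs))


# Set two stats on one column, then reset numNulls to 'unknown' with -1.
print(build_set_column_stats("db.t", "c", {"numDVs": 10, "avgSize": 4.5}))
print(build_set_column_stats("db.t", "c", {"numNulls": -1}))
```

The table and column names (db.t, c) are placeholders chosen for illustration.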
Testing: Ran the tests locally on exhaustive. Did private runs
on core/hdfs and core/S3.
Change-Id: I45cd8aa7241ea962788ba9ca7d0bbfd864c4304f
Reviewed-on: http://gerrit.cloudera.org:8080/3189
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
This directory contains Impala test workloads. The directory layout for the workloads should follow:

workloads/
   <data set name>/<data set name>_dimensions.csv  <- The test dimension file
   <data set name>/<data set name>_core.csv  <- A test vector file
   <data set name>/<data set name>_pairwise.csv
   <data set name>/<data set name>_exhaustive.csv
   <data set name>/queries/<query test>.test <- The queries for this workload
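As a rough illustration of this layout, the sketch below lists the relative paths a workload would be expected to contain. The function expected_workload_paths and the names "demo" and "q1" are hypothetical and not part of the Impala test framework. Only the file-name patterns come from the layout described here.

```python
from pathlib import PurePosixPath

# Suffixes of the per-data-set test vector files named in the layout.
VECTOR_FILE_SUFFIXES = ("core", "pairwise", "exhaustive")


def expected_workload_paths(data_set, query_tests=()):
    """Return the relative paths the layout implies for one data set.

    Hypothetical helper: it only builds path strings and does not touch disk.
    """
    root = PurePosixPath("workloads") / data_set
    # The test dimension file comes first, then one vector file per suffix.
    paths = [root / ("%s_dimensions.csv" % data_set)]
    paths += [root / ("%s_%s.csv" % (data_set, suffix))
              for suffix in VECTOR_FILE_SUFFIXES]
    # Query files live under queries/ and carry a .test extension.
    paths += [root / "queries" / ("%s.test" % name) for name in query_tests]
    return [str(p) for p in paths]


for path in expected_workload_paths("demo", ["q1"]):
    print(path)
```

A check like this could be compared against a directory listing to spot a missing vector file, though how the test framework itself discovers these files is not covered here.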