impala

mirror of https://github.com/apache/impala.git synced 2026-01-23 03:00:12 -05:00

Files

Alex Behm ee0fc260d1 IMPALA-5309: Adds TABLESAMPLE clause for HDFS table refs.

Syntax:
<tableref> TABLESAMPLE SYSTEM(<number>) [REPEATABLE(<number>)]
The first number specifies the percent of table bytes to sample.
The second number specifies the random seed to use.

The sampling is coarse-grained. Impala keeps randomly adding
files to the sample until at least the desired percentage of
file bytes has been reached.
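
The file-level sampling described above can be illustrated with a minimal
Python sketch. This is not Impala's implementation: the function name
`sample_files`, the `file_sizes` mapping, and the shuffle-then-accumulate
strategy are assumptions made for illustration only. It only shows why the
sample is coarse-grained (whole files are added, so the sampled byte count
can overshoot the requested percentage) and how a fixed seed makes the
selection repeatable.

```python
import random


def sample_files(file_sizes, percent, seed=None):
    """Pick whole files at random until >= percent of total bytes is covered.

    file_sizes: dict mapping a (hypothetical) file name to its size in bytes.
    percent:    desired percentage of total bytes, between 0 and 100.
    seed:       optional random seed; the same seed yields the same sample.
    Returns (list of picked file names, number of bytes picked).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")

    total_bytes = sum(file_sizes.values())
    rng = random.Random(seed)

    # Sort first so the shuffle depends only on the seed, not on dict order.
    names = sorted(file_sizes)
    rng.shuffle(names)

    picked = []
    picked_bytes = 0
    for name in names:
        # Integer comparison avoids floating-point rounding at the boundary.
        if picked_bytes * 100 >= total_bytes * percent:
            break
        picked.append(name)
        picked_bytes += file_sizes[name]
    return picked, picked_bytes


if __name__ == "__main__":
    sizes = {"part-0": 100, "part-1": 200, "part-2": 300, "part-3": 400}
    files, nbytes = sample_files(sizes, 50, seed=1234)
    print(files, nbytes)
```

Because files are added whole, a table made of a few large files can yield a
sample considerably larger than the requested percentage.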

Examples:
SELECT * FROM t TABLESAMPLE SYSTEM(10)
SELECT * FROM t TABLESAMPLE SYSTEM(50) REPEATABLE(1234)

Testing:
- Added parser, analyzer, planner, and end-to-end tests
- Private core/hdfs run passed

Change-Id: Ief112cfb1e4983c5d94c08696dc83da9ccf43f70
Reviewed-on: http://gerrit.cloudera.org:8080/6868
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins

2017-05-24 02:38:08 +00:00

queries/PlannerTest

IMPALA-5309: Adds TABLESAMPLE clause for HDFS table refs.

2017-05-24 02:38:08 +00:00

functional-planner_core.csv

Update benchmark tests to run against generic workload, data loading with scale factor, +more

2014-01-08 10:44:22 -08:00

functional-planner_dimensions.csv

Update benchmark tests to run against generic workload, data loading with scale factor, +more

2014-01-08 10:44:22 -08:00

functional-planner_exhaustive.csv

Update benchmark tests to run against generic workload, data loading with scale factor, +more