Files
impala/testdata/workloads/functional-query
Taras Bobrovytsky 75553165ee IMPALA-4883: Union Codegen
For each non-passthrough child of the Union node, codegen the loop that
does per row tuple materialization.

Testing:
Ran test_queries.py test locally in exchaustive mode.

Benchmark:
Ran a local benchmark on a local 10 GB TPCDS dataset on an unpartitioned
store_sales table.

SELECT
  COUNT(c),
  COUNT(ss_customer_sk),
  COUNT(ss_cdemo_sk),
  COUNT(ss_hdemo_sk),
  COUNT(ss_addr_sk),
  COUNT(ss_store_sk),
  COUNT(ss_promo_sk),
  COUNT(ss_ticket_number),
  COUNT(ss_quantity),
  COUNT(ss_wholesale_cost),
  COUNT(ss_list_price),
  COUNT(ss_sales_price),
  COUNT(ss_ext_discount_amt),
  COUNT(ss_ext_sales_price),
  COUNT(ss_ext_wholesale_cost),
  COUNT(ss_ext_list_price),
  COUNT(ss_ext_tax),
  COUNT(ss_coupon_amt),
  COUNT(ss_net_paid),
  COUNT(ss_net_paid_inc_tax),
  COUNT(ss_net_profit),
  COUNT(ss_sold_date_sk)
FROM (
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
) t

Before: 39s704ms
Operator          #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE           1  194.504us  194.504us        1           1   28.00 KB        -1.00 B  FINALIZE
12:EXCHANGE            1   17.284us   17.284us        3           1          0        -1.00 B  UNPARTITIONED
11:AGGREGATE           3    2s202ms    2s934ms        3           1  115.00 KB       10.00 MB
00:UNION               3   32s514ms   34s926ms  288.01M     288.01M    3.08 MB              0
|--02:SCAN HDFS        3  158.373ms  216.085ms   28.80M      28.80M  489.71 MB        1.88 GB  tpcds_10_parquet.store_sales
|--03:SCAN HDFS        3  167.002ms  171.738ms   28.80M      28.80M  489.74 MB        1.88 GB  tpcds_10_parquet.store_sales
|--04:SCAN HDFS        3  125.331ms  145.496ms   28.80M      28.80M  489.57 MB        1.88 GB  tpcds_10_parquet.store_sales
|--05:SCAN HDFS        3  148.478ms  194.311ms   28.80M      28.80M  489.69 MB        1.88 GB  tpcds_10_parquet.store_sales
|--06:SCAN HDFS        3  143.995ms  162.781ms   28.80M      28.80M  489.57 MB        1.88 GB  tpcds_10_parquet.store_sales
|--07:SCAN HDFS        3  169.731ms  250.201ms   28.80M      28.80M  489.58 MB        1.88 GB  tpcds_10_parquet.store_sales
|--08:SCAN HDFS        3  164.110ms  254.374ms   28.80M      28.80M  489.61 MB        1.88 GB  tpcds_10_parquet.store_sales
|--09:SCAN HDFS        3  135.631ms  162.117ms   28.80M      28.80M  489.63 MB        1.88 GB  tpcds_10_parquet.store_sales
|--10:SCAN HDFS        3  138.736ms  167.778ms   28.80M      28.80M  489.67 MB        1.88 GB  tpcds_10_parquet.store_sales
01:SCAN HDFS           3  202.015ms  248.728ms   28.80M      28.80M  489.68 MB        1.88 GB  tpcds_10_parquet.store_sales

After: 20s177ms
Operator          #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE           1  174.617us  174.617us        1           1   28.00 KB        -1.00 B  FINALIZE
12:EXCHANGE            1   16.693us   16.693us        3           1          0        -1.00 B  UNPARTITIONED
11:AGGREGATE           3    2s830ms    3s615ms        3           1  115.00 KB       10.00 MB
00:UNION               3    4s296ms    5s258ms  288.01M     288.01M    3.08 MB              0
|--02:SCAN HDFS        3    1s212ms    1s340ms   28.80M      28.80M  488.82 MB        1.88 GB  tpcds_10_parquet.store_sales
|--03:SCAN HDFS        3    1s387ms    1s570ms   28.80M      28.80M  489.37 MB        1.88 GB  tpcds_10_parquet.store_sales
|--04:SCAN HDFS        3    1s224ms    1s347ms   28.80M      28.80M  487.22 MB        1.88 GB  tpcds_10_parquet.store_sales
|--05:SCAN HDFS        3    1s245ms    1s321ms   28.80M      28.80M  489.25 MB        1.88 GB  tpcds_10_parquet.store_sales
|--06:SCAN HDFS        3    1s232ms    1s505ms   28.80M      28.80M  484.21 MB        1.88 GB  tpcds_10_parquet.store_sales
|--07:SCAN HDFS        3    1s348ms    1s518ms   28.80M      28.80M  488.20 MB        1.88 GB  tpcds_10_parquet.store_sales
|--08:SCAN HDFS        3    1s231ms    1s335ms   28.80M      28.80M  483.58 MB        1.88 GB  tpcds_10_parquet.store_sales
|--09:SCAN HDFS        3    1s179ms    1s349ms   28.80M      28.80M  482.76 MB        1.88 GB  tpcds_10_parquet.store_sales
|--10:SCAN HDFS        3    1s121ms    1s154ms   28.80M      28.80M  486.59 MB        1.88 GB  tpcds_10_parquet.store_sales
01:SCAN HDFS           3    1s284ms    1s523ms   28.80M      28.80M  486.70 MB        1.88 GB  tpcds_10_parquet.store_sales

Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Reviewed-on: http://gerrit.cloudera.org:8080/6459
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-21 04:53:09 +00:00
..
2017-04-21 04:53:09 +00:00