mirror of
https://github.com/apache/impala.git
synced 2026-02-02 06:00:36 -05:00
The HashTable implementation in Impala consists of a contiguous array
of Buckets. Each Bucket contains either data or a pointer to a
linked list of duplicate entries, whose nodes are of type DuplicateNode.
These are the structures of Bucket and DuplicateNode:
  struct DuplicateNode {
    bool matched;
    DuplicateNode* next;
    HtData htdata;
  };
  struct Bucket {
    bool filled;
    bool matched;
    bool hasDuplicates;
    uint32_t hash;
    union {
      HtData htdata;
      DuplicateNode* duplicates;
    } bucketData;
  };
The size of Bucket is currently 16 bytes and the size of DuplicateNode
is 24 bytes. If we remove the booleans from both structs, the size of
Bucket would drop to 12 bytes and that of DuplicateNode to 16 bytes.
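The sizes quoted above can be checked with a small sketch. It assumes a
typical 64-bit ABI (8-byte pointers, natural alignment) and uses a
hypothetical pointer-sized stand-in for HtData, since the real
definition is not shown here. The two constants are illustrative names,
not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using std::uint32_t;
using std::uint64_t;

// Hypothetical stand-in for HtData: assumed to be pointer-sized.
union HtData {
  void* tuple;
  uint64_t raw;
};

struct DuplicateNode {
  bool matched;         // 1 byte + 7 bytes of padding before 'next'
  DuplicateNode* next;  // 8 bytes
  HtData htdata;        // 8 bytes
};

struct Bucket {
  bool filled;          // three 1-byte flags + 1 byte of padding
  bool matched;
  bool hasDuplicates;
  uint32_t hash;        // 4 bytes
  union {
    HtData htdata;
    DuplicateNode* duplicates;
  } bucketData;         // 8 bytes
};

// Bytes that remain once the booleans are folded away (illustrative names).
constexpr std::size_t kBucketPayloadBytes = sizeof(uint32_t) + sizeof(HtData);
constexpr std::size_t kDuplicateNodePayloadBytes =
    sizeof(DuplicateNode*) + sizeof(HtData);
```

On such an ABI the booleans cost 4 bytes in Bucket and 8 bytes in
DuplicateNode once padding is counted, which is where the 16 -> 12 and
24 -> 16 figures come from.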
One way to remove the booleans is to fold them into the pointers that
are already part of each struct. Pointers store addresses, and on
architectures like x86 and ARM the linear address is only 48 bits
long. With 5-level paging Intel is planning to expand it to 57 bits,
which means we can use the most significant 7 bits, i.e., bits 58 to
64, to store these booleans. This patch reduces the size of Bucket
and DuplicateNode by implementing this folding. However, there is
another requirement: the size of Bucket must be a power of 2, and so
must the number of buckets in the hash table.
These requirements exist for the following reasons:
1. The memory allocator allocates memory in powers of 2 to avoid
   internal fragmentation. Hence, number of buckets * sizeof(Bucket)
   should be a power of 2.
2. A power-of-2 number of buckets enables a faster modulo
   operation, i.e., instead of the slow modulo (hash % N), the faster
   (hash & (N-1)) can be used.
Because of this, the 4-byte 'hash' field is removed from Bucket and
stored separately in a new array, hash_array_, in HashTable.
This ensures sizeof(Bucket) is 8, which is a power of 2.
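The masking trick from reason 2 can be sketched as follows. The helper
names IsPowerOfTwo and BucketIndex are hypothetical and chosen for
illustration only; they are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// True when n is a non-zero power of two.
constexpr bool IsPowerOfTwo(std::uint64_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Maps a hash value to a bucket index. Equivalent to hash % num_buckets,
// but only when num_buckets is a power of two.
inline std::uint64_t BucketIndex(std::uint32_t hash, std::uint64_t num_buckets) {
  assert(IsPowerOfTwo(num_buckets));
  return hash & (num_buckets - 1);
}
```

With an 8-byte Bucket and a power-of-2 bucket count, the product
num_buckets * 8 is itself a power of 2, which satisfies reason 1.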
New Classes:
------------
As a part of this patch, TaggedPointer is introduced, a template
class that stores a pointer and a 7-bit tag together in a 64-bit integer.
This structure owns the pointer and takes care of allocating and
deallocating the object being pointed to.
However, derived classes can opt out of ownership of the object
and let the client manage it. Its derived classes for Bucket and
DuplicateNode, TaggedBucketData and TaggedDuplicateNode, do exactly
that.
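A minimal, non-owning sketch of the pointer-plus-tag idea is shown
below. It is not the TaggedPointer class from this patch: the name
TaggedPtrSketch and all of its members are hypothetical, and it assumes
a 64-bit platform where user-space addresses leave the top 7 bits zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: packs a pointer into the low 57 bits and a 7-bit tag
// into the high 7 bits of one 64-bit word. It never owns the pointee.
template <typename T>
class TaggedPtrSketch {
 public:
  static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers");
  static constexpr int kTagBits = 7;
  static constexpr int kPtrBits = 64 - kTagBits;  // 57
  static constexpr std::uint64_t kPtrMask =
      (std::uint64_t{1} << kPtrBits) - 1;

  TaggedPtrSketch() = default;
  explicit TaggedPtrSketch(T* ptr) { SetPtr(ptr); }

  // Returns the pointer with the tag bits stripped.
  T* GetPtr() const {
    return reinterpret_cast<T*>(
        static_cast<std::uintptr_t>(data_ & kPtrMask));
  }

  // Stores a new pointer while preserving the current tag bits.
  void SetPtr(T* ptr) {
    std::uint64_t bits = reinterpret_cast<std::uintptr_t>(ptr);
    assert((bits & ~kPtrMask) == 0);  // address must fit in 57 bits
    data_ = (data_ & ~kPtrMask) | bits;
  }

  // Tag bit i (0..6) lives at bit position kPtrBits + i.
  bool IsTagBitSet(int i) const {
    assert(i >= 0 && i < kTagBits);
    return ((data_ >> (kPtrBits + i)) & 1) != 0;
  }
  void SetTagBit(int i) {
    assert(i >= 0 && i < kTagBits);
    data_ |= std::uint64_t{1} << (kPtrBits + i);
  }
  void UnsetTagBit(int i) {
    assert(i >= 0 && i < kTagBits);
    data_ &= ~(std::uint64_t{1} << (kPtrBits + i));
  }

 private:
  std::uint64_t data_ = 0;
};
```

In a layout like this, flags such as 'filled', 'matched' and
'hasDuplicates' could each map to one tag bit, so the whole Bucket fits
in a single 8-byte word.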
Benchmark:
----------
As a part of this patch, a new micro benchmark for HashTable has
been introduced, which helps measure:
1. Runtime for building the hash table and probing it.
2. Memory consumed after building the table.
This helps measure the impact of changes to the HashTable's
data structure and algorithm.
The benchmark showed a 25-30% reduction in memory consumed and no
significant difference in performance (0.91X-1.2X).
Other Benchmarks:
1. Billion-row synthetic benchmark on a single node, single daemon:
   a. 2-3% improvement in Join GEOMEAN for the Probe benchmark.
   b. 17% and 21% reduction in PeakMemoryUsage and
      CumulativeBytes allocated, respectively.
2. TPCH-42: 0-1.5% improvement in GEOMEAN runtime.
Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b
Reviewed-on: http://gerrit.cloudera.org:8080/17592
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
1117 lines
60 KiB
Plaintext
# Join with tiny build side - should use smallest possible buffers.
select straight_join *
from tpch_parquet.customer
inner join tpch_parquet.nation on c_nationkey = n_nationkey
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=22.97MB Threads=5
Per-Host Resource Estimates: Memory=100MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.customer INNER
JOIN tpch_parquet.nation ON c_nationkey = n_nationkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=57.09MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.customer.c_custkey, tpch_parquet.customer.c_name, tpch_parquet.customer.c_address, tpch_parquet.customer.c_nationkey, tpch_parquet.customer.c_phone, tpch_parquet.customer.c_acctbal, tpch_parquet.customer.c_mktsegment, tpch_parquet.customer.c_comment, tpch_parquet.nation.n_nationkey, tpch_parquet.nation.n_name, tpch_parquet.nation.n_regionkey, tpch_parquet.nation.n_comment
| mem-estimate=46.77MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.33MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1 row-size=327B cardinality=150.00K
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
Per-Host Resources: mem-estimate=26.95MB mem-reservation=18.94MB thread-reservation=2 runtime-filters-memory=1.00MB
02:HASH JOIN [INNER JOIN, BROADCAST]
| hash predicates: c_nationkey = n_nationkey
| fk/pk conjuncts: c_nationkey = n_nationkey
| runtime filters: RF000[bloom] <- n_nationkey
| mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB thread-reservation=0
| tuple-ids=0,1 row-size=327B cardinality=150.00K
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--03:EXCHANGE [BROADCAST]
| | mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=109B cardinality=25
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| Per-Host Resources: mem-estimate=16.00MB mem-reservation=32.00KB thread-reservation=2
| 01:SCAN HDFS [tpch_parquet.nation, RANDOM]
| HDFS partitions=1/1 files=1 size=3.04KB
| stored statistics:
| table: rows=25 size=3.04KB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=25
| mem-estimate=16.00MB mem-reservation=32.00KB thread-reservation=1
| tuple-ids=1 row-size=109B cardinality=25
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.customer, RANDOM]
HDFS partitions=1/1 files=1 size=12.34MB
runtime filters: RF000[bloom] -> c_nationkey
stored statistics:
table: rows=150.00K size=12.34MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=150.00K
mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=1
tuple-ids=0 row-size=218B cardinality=150.00K
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=25.91MB Threads=4
Per-Host Resource Estimates: Memory=103MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.customer INNER
JOIN tpch_parquet.nation ON c_nationkey = n_nationkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=57.09MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.customer.c_custkey, tpch_parquet.customer.c_name, tpch_parquet.customer.c_address, tpch_parquet.customer.c_nationkey, tpch_parquet.customer.c_phone, tpch_parquet.customer.c_acctbal, tpch_parquet.customer.c_mktsegment, tpch_parquet.customer.c_comment, tpch_parquet.nation.n_nationkey, tpch_parquet.nation.n_name, tpch_parquet.nation.n_regionkey, tpch_parquet.nation.n_comment
| mem-estimate=46.77MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.33MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1 row-size=327B cardinality=150.00K
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
Per-Host Shared Resources: mem-estimate=1.00MB mem-reservation=1.00MB thread-reservation=0 runtime-filters-memory=1.00MB
Per-Instance Resources: mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=1
02:HASH JOIN [INNER JOIN, BROADCAST]
| hash-table-id=00
| hash predicates: c_nationkey = n_nationkey
| fk/pk conjuncts: c_nationkey = n_nationkey
| mem-estimate=0B mem-reservation=0B spill-buffer=64.00KB thread-reservation=0
| tuple-ids=0,1 row-size=327B cardinality=150.00K
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--F03:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| | Per-Instance Resources: mem-estimate=4.89MB mem-reservation=4.88MB thread-reservation=1 runtime-filters-memory=1.00MB
| JOIN BUILD
| | join-table-id=00 plan-id=01 cohort-id=01
| | build expressions: n_nationkey
| | runtime filters: RF000[bloom] <- n_nationkey
| | mem-estimate=3.88MB mem-reservation=3.88MB spill-buffer=64.00KB thread-reservation=0
| |
| 03:EXCHANGE [BROADCAST]
| | mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=109B cardinality=25
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=16.00MB mem-reservation=32.00KB thread-reservation=1
| 01:SCAN HDFS [tpch_parquet.nation, RANDOM]
| HDFS partitions=1/1 files=1 size=3.04KB
| stored statistics:
| table: rows=25 size=3.04KB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=25
| mem-estimate=16.00MB mem-reservation=32.00KB thread-reservation=0
| tuple-ids=1 row-size=109B cardinality=25
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.customer, RANDOM]
HDFS partitions=1/1 files=1 size=12.34MB
runtime filters: RF000[bloom] -> c_nationkey
stored statistics:
table: rows=150.00K size=12.34MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=150.00K
mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=0
tuple-ids=0 row-size=218B cardinality=150.00K
in pipelines: 00(GETNEXT)
====
# Join with large build side - should use default-sized buffers.
select straight_join *
from tpch_parquet.lineitem
left join tpch_parquet.orders on l_orderkey = o_orderkey
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=102.00MB Threads=5
Per-Host Resource Estimates: Memory=534MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.lineitem LEFT
OUTER JOIN tpch_parquet.orders ON l_orderkey = o_orderkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=111.20MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment, tpch_parquet.orders.o_orderkey, tpch_parquet.orders.o_custkey, tpch_parquet.orders.o_orderstatus, tpch_parquet.orders.o_totalprice, tpch_parquet.orders.o_orderdate, tpch_parquet.orders.o_orderpriority, tpch_parquet.orders.o_clerk, tpch_parquet.orders.o_shippriority, tpch_parquet.orders.o_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=11.20MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1N row-size=402B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=382.84MB mem-reservation=74.00MB thread-reservation=2
02:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
| hash predicates: l_orderkey = o_orderkey
| fk/pk conjuncts: l_orderkey = o_orderkey
| mem-estimate=292.49MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1N row-size=402B cardinality=6.00M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--03:EXCHANGE [BROADCAST]
| | mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=171B cardinality=1.50M
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
| Per-Host Resources: mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=2
| 01:SCAN HDFS [tpch_parquet.orders, RANDOM]
| HDFS partitions=1/1 files=2 size=54.21MB
| stored statistics:
| table: rows=1.50M size=54.21MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=1.18M
| mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=1
| tuple-ids=1 row-size=171B cardinality=1.50M
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
HDFS partitions=1/1 files=3 size=193.98MB
stored statistics:
table: rows=6.00M size=193.98MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=2.14M
mem-estimate=80.00MB mem-reservation=40.00MB thread-reservation=1
tuple-ids=0 row-size=231B cardinality=6.00M
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=136.00MB Threads=4
Per-Host Resource Estimates: Memory=534MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.lineitem LEFT
OUTER JOIN tpch_parquet.orders ON l_orderkey = o_orderkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=111.20MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment, tpch_parquet.orders.o_orderkey, tpch_parquet.orders.o_custkey, tpch_parquet.orders.o_orderstatus, tpch_parquet.orders.o_totalprice, tpch_parquet.orders.o_orderdate, tpch_parquet.orders.o_orderpriority, tpch_parquet.orders.o_clerk, tpch_parquet.orders.o_shippriority, tpch_parquet.orders.o_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=11.20MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1N row-size=402B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Instance Resources: mem-estimate=80.00MB mem-reservation=40.00MB thread-reservation=1
02:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
| hash-table-id=00
| hash predicates: l_orderkey = o_orderkey
| fk/pk conjuncts: l_orderkey = o_orderkey
| mem-estimate=0B mem-reservation=0B spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1N row-size=402B cardinality=6.00M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--F03:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
| | Per-Instance Resources: mem-estimate=302.84MB mem-reservation=68.00MB thread-reservation=1
| JOIN BUILD
| | join-table-id=00 plan-id=01 cohort-id=01
| | build expressions: o_orderkey
| | mem-estimate=292.49MB mem-reservation=68.00MB spill-buffer=2.00MB thread-reservation=0
| |
| 03:EXCHANGE [BROADCAST]
| | mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=171B cardinality=1.50M
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
| Per-Instance Resources: mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=1
| 01:SCAN HDFS [tpch_parquet.orders, RANDOM]
| HDFS partitions=1/1 files=2 size=54.21MB
| stored statistics:
| table: rows=1.50M size=54.21MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=1.18M
| mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=0
| tuple-ids=1 row-size=171B cardinality=1.50M
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
HDFS partitions=1/1 files=3 size=193.98MB
stored statistics:
table: rows=6.00M size=193.98MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=2.14M
mem-estimate=80.00MB mem-reservation=40.00MB thread-reservation=0
tuple-ids=0 row-size=231B cardinality=6.00M
in pipelines: 00(GETNEXT)
====
# Shuffle join with mid-sized input.
select straight_join *
from tpch_parquet.orders
join /*+shuffle*/ tpch_parquet.customer on o_custkey = c_custkey
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=80.00MB Threads=6
Per-Host Resource Estimates: Memory=231MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.orders INNER
JOIN /* +shuffle */ tpch_parquet.customer ON o_custkey = c_custkey

F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=110.77MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.orders.o_orderkey, tpch_parquet.orders.o_custkey, tpch_parquet.orders.o_orderstatus, tpch_parquet.orders.o_totalprice, tpch_parquet.orders.o_orderdate, tpch_parquet.orders.o_orderpriority, tpch_parquet.orders.o_clerk, tpch_parquet.orders.o_shippriority, tpch_parquet.orders.o_comment, tpch_parquet.customer.c_custkey, tpch_parquet.customer.c_name, tpch_parquet.customer.c_address, tpch_parquet.customer.c_nationkey, tpch_parquet.customer.c_phone, tpch_parquet.customer.c_acctbal, tpch_parquet.customer.c_mktsegment, tpch_parquet.customer.c_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
05:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.77MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT)
|
F02:PLAN FRAGMENT [HASH(o_custkey)] hosts=2 instances=2
Per-Host Resources: mem-estimate=55.56MB mem-reservation=35.00MB thread-reservation=1 runtime-filters-memory=1.00MB
02:HASH JOIN [INNER JOIN, PARTITIONED]
| hash predicates: o_custkey = c_custkey
| fk/pk conjuncts: o_custkey = c_custkey
| runtime filters: RF000[bloom] <- c_custkey
| mem-estimate=34.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--04:EXCHANGE [HASH(c_custkey)]
| | mem-estimate=10.22MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=218B cardinality=150.00K
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| Per-Host Resources: mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=2
| 01:SCAN HDFS [tpch_parquet.customer, RANDOM]
| HDFS partitions=1/1 files=1 size=12.34MB
| stored statistics:
| table: rows=150.00K size=12.34MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=150.00K
| mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=1
| tuple-ids=1 row-size=218B cardinality=150.00K
| in pipelines: 01(GETNEXT)
|
03:EXCHANGE [HASH(o_custkey)]
| mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
| tuple-ids=0 row-size=171B cardinality=1.50M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
Per-Host Resources: mem-estimate=41.00MB mem-reservation=25.00MB thread-reservation=2 runtime-filters-memory=1.00MB
00:SCAN HDFS [tpch_parquet.orders, RANDOM]
HDFS partitions=1/1 files=2 size=54.21MB
runtime filters: RF000[bloom] -> o_custkey
stored statistics:
table: rows=1.50M size=54.21MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=1.18M
mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=1
tuple-ids=0 row-size=171B cardinality=1.50M
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=80.00MB Threads=5
Per-Host Resource Estimates: Memory=231MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.orders INNER
JOIN /* +shuffle */ tpch_parquet.customer ON o_custkey = c_custkey

F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=110.77MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.orders.o_orderkey, tpch_parquet.orders.o_custkey, tpch_parquet.orders.o_orderstatus, tpch_parquet.orders.o_totalprice, tpch_parquet.orders.o_orderdate, tpch_parquet.orders.o_orderpriority, tpch_parquet.orders.o_clerk, tpch_parquet.orders.o_shippriority, tpch_parquet.orders.o_comment, tpch_parquet.customer.c_custkey, tpch_parquet.customer.c_name, tpch_parquet.customer.c_address, tpch_parquet.customer.c_nationkey, tpch_parquet.customer.c_phone, tpch_parquet.customer.c_acctbal, tpch_parquet.customer.c_mktsegment, tpch_parquet.customer.c_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
05:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.77MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT)
|
F02:PLAN FRAGMENT [HASH(o_custkey)] hosts=2 instances=2
Per-Instance Resources: mem-estimate=10.34MB mem-reservation=0B thread-reservation=1
02:HASH JOIN [INNER JOIN, PARTITIONED]
| hash-table-id=00
| hash predicates: o_custkey = c_custkey
| fk/pk conjuncts: o_custkey = c_custkey
| mem-estimate=0B mem-reservation=0B spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--F04:PLAN FRAGMENT [HASH(o_custkey)] hosts=2 instances=2
| | Per-Instance Resources: mem-estimate=45.22MB mem-reservation=35.00MB thread-reservation=1 runtime-filters-memory=1.00MB
| JOIN BUILD
| | join-table-id=00 plan-id=01 cohort-id=01
| | build expressions: c_custkey
| | runtime filters: RF000[bloom] <- c_custkey
| | mem-estimate=34.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| |
| 04:EXCHANGE [HASH(c_custkey)]
| | mem-estimate=10.22MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=218B cardinality=150.00K
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=1
| 01:SCAN HDFS [tpch_parquet.customer, RANDOM]
| HDFS partitions=1/1 files=1 size=12.34MB
| stored statistics:
| table: rows=150.00K size=12.34MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=150.00K
| mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=0
| tuple-ids=1 row-size=218B cardinality=150.00K
| in pipelines: 01(GETNEXT)
|
03:EXCHANGE [HASH(o_custkey)]
| mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
| tuple-ids=0 row-size=171B cardinality=1.50M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
Per-Host Shared Resources: mem-estimate=1.00MB mem-reservation=1.00MB thread-reservation=0 runtime-filters-memory=1.00MB
Per-Instance Resources: mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=1
00:SCAN HDFS [tpch_parquet.orders, RANDOM]
HDFS partitions=1/1 files=2 size=54.21MB
runtime filters: RF000[bloom] -> o_custkey
stored statistics:
table: rows=1.50M size=54.21MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=1.18M
mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=0
tuple-ids=0 row-size=171B cardinality=1.50M
in pipelines: 00(GETNEXT)
====
# Broadcast join with mid-sized input - should use larger buffers than shuffle join.
select straight_join *
from tpch_parquet.orders
join /*+broadcast*/ tpch_parquet.customer on o_custkey = c_custkey
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=79.00MB Threads=5
Per-Host Resource Estimates: Memory=220MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.orders INNER
JOIN /* +broadcast */ tpch_parquet.customer ON o_custkey = c_custkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=110.77MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.orders.o_orderkey, tpch_parquet.orders.o_custkey, tpch_parquet.orders.o_orderstatus, tpch_parquet.orders.o_totalprice, tpch_parquet.orders.o_orderdate, tpch_parquet.orders.o_orderpriority, tpch_parquet.orders.o_clerk, tpch_parquet.orders.o_shippriority, tpch_parquet.orders.o_comment, tpch_parquet.customer.c_custkey, tpch_parquet.customer.c_name, tpch_parquet.customer.c_address, tpch_parquet.customer.c_nationkey, tpch_parquet.customer.c_phone, tpch_parquet.customer.c_acctbal, tpch_parquet.customer.c_mktsegment, tpch_parquet.customer.c_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.77MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
Per-Host Resources: mem-estimate=85.34MB mem-reservation=59.00MB thread-reservation=2 runtime-filters-memory=1.00MB
02:HASH JOIN [INNER JOIN, BROADCAST]
| hash predicates: o_custkey = c_custkey
| fk/pk conjuncts: o_custkey = c_custkey
| runtime filters: RF000[bloom] <- c_custkey
| mem-estimate=34.12MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--03:EXCHANGE [BROADCAST]
| | mem-estimate=10.22MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=218B cardinality=150.00K
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| Per-Host Resources: mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=2
| 01:SCAN HDFS [tpch_parquet.customer, RANDOM]
| HDFS partitions=1/1 files=1 size=12.34MB
| stored statistics:
| table: rows=150.00K size=12.34MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=150.00K
| mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=1
| tuple-ids=1 row-size=218B cardinality=150.00K
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.orders, RANDOM]
HDFS partitions=1/1 files=2 size=54.21MB
runtime filters: RF000[bloom] -> o_custkey
stored statistics:
table: rows=1.50M size=54.21MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=1.18M
mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=1
tuple-ids=0 row-size=171B cardinality=1.50M
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=114.00MB Threads=4
Per-Host Resource Estimates: Memory=255MB
Analyzed query: SELECT /* +straight_join */ * FROM tpch_parquet.orders INNER
JOIN /* +broadcast */ tpch_parquet.customer ON o_custkey = c_custkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=110.77MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.orders.o_orderkey, tpch_parquet.orders.o_custkey, tpch_parquet.orders.o_orderstatus, tpch_parquet.orders.o_totalprice, tpch_parquet.orders.o_orderdate, tpch_parquet.orders.o_orderpriority, tpch_parquet.orders.o_clerk, tpch_parquet.orders.o_shippriority, tpch_parquet.orders.o_comment, tpch_parquet.customer.c_custkey, tpch_parquet.customer.c_name, tpch_parquet.customer.c_address, tpch_parquet.customer.c_nationkey, tpch_parquet.customer.c_phone, tpch_parquet.customer.c_acctbal, tpch_parquet.customer.c_mktsegment, tpch_parquet.customer.c_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.77MB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
Per-Host Shared Resources: mem-estimate=1.00MB mem-reservation=1.00MB thread-reservation=0 runtime-filters-memory=1.00MB
Per-Instance Resources: mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=1
02:HASH JOIN [INNER JOIN, BROADCAST]
| hash-table-id=00
| hash predicates: o_custkey = c_custkey
| fk/pk conjuncts: o_custkey = c_custkey
| mem-estimate=0B mem-reservation=0B spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1 row-size=388B cardinality=1.50M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--F03:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
| | Per-Instance Resources: mem-estimate=79.22MB mem-reservation=69.00MB thread-reservation=1 runtime-filters-memory=1.00MB
| JOIN BUILD
| | join-table-id=00 plan-id=01 cohort-id=01
| | build expressions: c_custkey
| | runtime filters: RF000[bloom] <- c_custkey
| | mem-estimate=68.00MB mem-reservation=68.00MB spill-buffer=2.00MB thread-reservation=0
| |
| 03:EXCHANGE [BROADCAST]
| | mem-estimate=10.22MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=218B cardinality=150.00K
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=1
| 01:SCAN HDFS [tpch_parquet.customer, RANDOM]
| HDFS partitions=1/1 files=1 size=12.34MB
| stored statistics:
| table: rows=150.00K size=12.34MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=150.00K
| mem-estimate=24.00MB mem-reservation=16.00MB thread-reservation=0
| tuple-ids=1 row-size=218B cardinality=150.00K
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.orders, RANDOM]
HDFS partitions=1/1 files=2 size=54.21MB
runtime filters: RF000[bloom] -> o_custkey
stored statistics:
table: rows=1.50M size=54.21MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=1.18M
mem-estimate=40.00MB mem-reservation=24.00MB thread-reservation=0
tuple-ids=0 row-size=171B cardinality=1.50M
in pipelines: 00(GETNEXT)
====
# Join with no stats for right input - should use default buffers.
select straight_join *
from functional_parquet.alltypes
left join functional_parquet.alltypestiny on alltypes.id = alltypestiny.id
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=38.17MB Threads=5
Per-Host Resource Estimates: Memory=2.04GB
WARNING: The following tables are missing relevant table and/or column statistics.
functional_parquet.alltypes, functional_parquet.alltypestiny
Analyzed query: SELECT /* +straight_join */ * FROM functional_parquet.alltypes
LEFT OUTER JOIN functional_parquet.alltypestiny ON alltypes.id = alltypestiny.id

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=10.49MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: functional_parquet.alltypes.id, functional_parquet.alltypes.bool_col, functional_parquet.alltypes.tinyint_col, functional_parquet.alltypes.smallint_col, functional_parquet.alltypes.int_col, functional_parquet.alltypes.bigint_col, functional_parquet.alltypes.float_col, functional_parquet.alltypes.double_col, functional_parquet.alltypes.date_string_col, functional_parquet.alltypes.string_col, functional_parquet.alltypes.timestamp_col, functional_parquet.alltypes.year, functional_parquet.alltypes.month, functional_parquet.alltypestiny.id, functional_parquet.alltypestiny.bool_col, functional_parquet.alltypestiny.tinyint_col, functional_parquet.alltypestiny.smallint_col, functional_parquet.alltypestiny.int_col, functional_parquet.alltypestiny.bigint_col, functional_parquet.alltypestiny.float_col, functional_parquet.alltypestiny.double_col, functional_parquet.alltypestiny.date_string_col, functional_parquet.alltypestiny.string_col, functional_parquet.alltypestiny.timestamp_col, functional_parquet.alltypestiny.year, functional_parquet.alltypestiny.month
| mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=503.95KB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1N row-size=160B cardinality=unavailable
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=2.02GB mem-reservation=34.09MB thread-reservation=2
02:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
| hash predicates: alltypes.id = alltypestiny.id
| fk/pk conjuncts: assumed fk/pk
| mem-estimate=2.00GB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1N row-size=160B cardinality=unavailable
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--03:EXCHANGE [BROADCAST]
| | mem-estimate=251.92KB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=80B cardinality=unavailable
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
| Per-Host Resources: mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=2
| 01:SCAN HDFS [functional_parquet.alltypestiny, RANDOM]
| HDFS partitions=4/4 files=4 size=11.92KB
| stored statistics:
| table: rows=unavailable size=unavailable
| partitions: 0/4 rows=unavailable
| columns missing stats: id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col
| extrapolated-rows=disabled max-scan-range-rows=unavailable
| mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=1
| tuple-ids=1 row-size=80B cardinality=unavailable
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [functional_parquet.alltypes, RANDOM]
HDFS partitions=24/24 files=24 size=202.07KB
stored statistics:
table: rows=unavailable size=unavailable
partitions: 0/24 rows=unavailable
columns missing stats: id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col
extrapolated-rows=disabled max-scan-range-rows=unavailable
mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=1
tuple-ids=0 row-size=80B cardinality=unavailable
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=72.34MB Threads=6
Per-Host Resource Estimates: Memory=2.07GB
WARNING: The following tables are missing relevant table and/or column statistics.
functional_parquet.alltypestiny
Analyzed query: SELECT /* +straight_join */ * FROM functional_parquet.alltypes
LEFT OUTER JOIN functional_parquet.alltypestiny ON alltypes.id = alltypestiny.id

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=10.98MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: functional_parquet.alltypes.id, functional_parquet.alltypes.bool_col, functional_parquet.alltypes.tinyint_col, functional_parquet.alltypes.smallint_col, functional_parquet.alltypes.int_col, functional_parquet.alltypes.bigint_col, functional_parquet.alltypes.float_col, functional_parquet.alltypes.double_col, functional_parquet.alltypes.date_string_col, functional_parquet.alltypes.string_col, functional_parquet.alltypes.timestamp_col, functional_parquet.alltypes.year, functional_parquet.alltypes.month, functional_parquet.alltypestiny.id, functional_parquet.alltypestiny.bool_col, functional_parquet.alltypestiny.tinyint_col, functional_parquet.alltypestiny.smallint_col, functional_parquet.alltypestiny.int_col, functional_parquet.alltypestiny.bigint_col, functional_parquet.alltypestiny.float_col, functional_parquet.alltypestiny.double_col, functional_parquet.alltypestiny.date_string_col, functional_parquet.alltypestiny.string_col, functional_parquet.alltypestiny.timestamp_col, functional_parquet.alltypestiny.year, functional_parquet.alltypestiny.month
| mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=1007.95KB mem-reservation=0B thread-reservation=0
| tuple-ids=0,1N row-size=160B cardinality=unavailable
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=6
Per-Instance Resources: mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=1
02:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
| hash-table-id=00
| hash predicates: alltypes.id = alltypestiny.id
| fk/pk conjuncts: assumed fk/pk
| mem-estimate=0B mem-reservation=0B spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1N row-size=160B cardinality=unavailable
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--F03:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
| | Per-Instance Resources: mem-estimate=2.00GB mem-reservation=68.00MB thread-reservation=1
| JOIN BUILD
| | join-table-id=00 plan-id=01 cohort-id=01
| | build expressions: alltypestiny.id
| | mem-estimate=2.00GB mem-reservation=68.00MB spill-buffer=2.00MB thread-reservation=0
| |
| 03:EXCHANGE [BROADCAST]
| | mem-estimate=335.92KB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=80B cardinality=unavailable
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=4
| Per-Instance Resources: mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=1
| 01:SCAN HDFS [functional_parquet.alltypestiny, RANDOM]
| HDFS partitions=4/4 files=4 size=11.92KB
| stored statistics:
| table: rows=unavailable size=unavailable
| partitions: 0/4 rows=unavailable
| columns missing stats: id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col
| extrapolated-rows=disabled max-scan-range-rows=unavailable
| mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=0
| tuple-ids=1 row-size=80B cardinality=unavailable
| in pipelines: 01(GETNEXT)
|
00:SCAN HDFS [functional_parquet.alltypes, RANDOM]
HDFS partitions=24/24 files=24 size=202.07KB
stored statistics:
table: rows=unavailable size=unavailable
partitions: 0/24 rows=unavailable
columns missing stats: id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col
extrapolated-rows=disabled max-scan-range-rows=unavailable
mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=0
tuple-ids=0 row-size=80B cardinality=unavailable
in pipelines: 00(GETNEXT)
====
# Low NDV aggregation - should scale down buffers to minimum.
select c_nationkey, avg(c_acctbal)
from tpch_parquet.customer
group by c_nationkey
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=9.94MB Threads=4
Per-Host Resource Estimates: Memory=48MB
Analyzed query: SELECT c_nationkey, avg(c_acctbal) FROM tpch_parquet.customer
GROUP BY c_nationkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: c_nationkey, avg(c_acctbal)
| mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
| tuple-ids=2 row-size=10B cardinality=25
| in pipelines: 03(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(c_nationkey)] hosts=1 instances=1
Per-Host Resources: mem-estimate=10.02MB mem-reservation=1.94MB thread-reservation=1
03:AGGREGATE [FINALIZE]
| output: avg:merge(c_acctbal)
| group by: c_nationkey
| mem-estimate=10.00MB mem-reservation=1.94MB spill-buffer=64.00KB thread-reservation=0
| tuple-ids=2 row-size=10B cardinality=25
| in pipelines: 03(GETNEXT), 00(OPEN)
|
02:EXCHANGE [HASH(c_nationkey)]
| mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=10B cardinality=25
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
Per-Host Resources: mem-estimate=34.00MB mem-reservation=4.00MB thread-reservation=2
01:AGGREGATE [STREAMING]
| output: avg(c_acctbal)
| group by: c_nationkey
| mem-estimate=10.00MB mem-reservation=2.00MB spill-buffer=64.00KB thread-reservation=0
| tuple-ids=1 row-size=10B cardinality=25
| in pipelines: 00(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.customer, RANDOM]
HDFS partitions=1/1 files=1 size=12.34MB
stored statistics:
table: rows=150.00K size=12.34MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=150.00K
mem-estimate=24.00MB mem-reservation=2.00MB thread-reservation=1
tuple-ids=0 row-size=10B cardinality=150.00K
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=9.94MB Threads=3
Per-Host Resource Estimates: Memory=48MB
Analyzed query: SELECT c_nationkey, avg(c_acctbal) FROM tpch_parquet.customer
GROUP BY c_nationkey

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=4.02MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: c_nationkey, avg(c_acctbal)
| mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
| tuple-ids=2 row-size=10B cardinality=25
| in pipelines: 03(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(c_nationkey)] hosts=1 instances=1
Per-Instance Resources: mem-estimate=10.02MB mem-reservation=1.94MB thread-reservation=1
03:AGGREGATE [FINALIZE]
| output: avg:merge(c_acctbal)
| group by: c_nationkey
| mem-estimate=10.00MB mem-reservation=1.94MB spill-buffer=64.00KB thread-reservation=0
| tuple-ids=2 row-size=10B cardinality=25
| in pipelines: 03(GETNEXT), 00(OPEN)
|
02:EXCHANGE [HASH(c_nationkey)]
| mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=10B cardinality=25
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
Per-Instance Resources: mem-estimate=34.00MB mem-reservation=4.00MB thread-reservation=1
01:AGGREGATE [STREAMING]
| output: avg(c_acctbal)
| group by: c_nationkey
| mem-estimate=10.00MB mem-reservation=2.00MB spill-buffer=64.00KB thread-reservation=0
| tuple-ids=1 row-size=10B cardinality=25
| in pipelines: 00(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.customer, RANDOM]
HDFS partitions=1/1 files=1 size=12.34MB
stored statistics:
table: rows=150.00K size=12.34MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=150.00K
mem-estimate=24.00MB mem-reservation=2.00MB thread-reservation=0
tuple-ids=0 row-size=10B cardinality=150.00K
in pipelines: 00(GETNEXT)
====
# Mid NDV aggregation - should scale down buffers to intermediate size.
select straight_join l_orderkey, o_orderstatus, count(*)
from tpch_parquet.lineitem
join tpch_parquet.orders on o_orderkey = l_orderkey
group by 1, 2
having count(*) = 1
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=120.00MB Threads=7
Per-Host Resource Estimates: Memory=414MB
Analyzed query: SELECT /* +straight_join */ l_orderkey, o_orderstatus, count(*)
FROM tpch_parquet.lineitem INNER JOIN tpch_parquet.orders ON o_orderkey =
l_orderkey GROUP BY l_orderkey, o_orderstatus HAVING count(*) = CAST(1 AS
BIGINT)

F04:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=110.10MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: l_orderkey, o_orderstatus, count(*)
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
08:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.10MB mem-reservation=0B thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 07(GETNEXT)
|
F03:PLAN FRAGMENT [HASH(l_orderkey,o_orderstatus)] hosts=3 instances=3
Per-Host Resources: mem-estimate=71.23MB mem-reservation=34.00MB thread-reservation=1
07:AGGREGATE [FINALIZE]
| output: count:merge(*)
| group by: l_orderkey, o_orderstatus
| having: count(*) = CAST(1 AS BIGINT)
| mem-estimate=61.13MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 07(GETNEXT), 00(OPEN)
|
06:EXCHANGE [HASH(l_orderkey,o_orderstatus)]
| mem-estimate=10.10MB mem-reservation=0B thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 00(GETNEXT)
|
F02:PLAN FRAGMENT [HASH(l_orderkey)] hosts=3 instances=3
Per-Host Resources: mem-estimate=111.37MB mem-reservation=69.00MB thread-reservation=1 runtime-filters-memory=1.00MB
03:AGGREGATE [STREAMING]
| output: count(*)
| group by: l_orderkey, o_orderstatus
| mem-estimate=56.28MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 00(GETNEXT)
|
02:HASH JOIN [INNER JOIN, PARTITIONED]
| hash predicates: l_orderkey = o_orderkey
| fk/pk conjuncts: l_orderkey = o_orderkey
| runtime filters: RF000[bloom] <- o_orderkey
| mem-estimate=34.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1 row-size=29B cardinality=5.76M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--05:EXCHANGE [HASH(o_orderkey)]
| | mem-estimate=10.05MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=21B cardinality=1.50M
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
| Per-Host Resources: mem-estimate=40.00MB mem-reservation=8.00MB thread-reservation=2
| 01:SCAN HDFS [tpch_parquet.orders, RANDOM]
| HDFS partitions=1/1 files=2 size=54.21MB
| stored statistics:
| table: rows=1.50M size=54.21MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=1.18M
| mem-estimate=40.00MB mem-reservation=8.00MB thread-reservation=1
| tuple-ids=1 row-size=21B cardinality=1.50M
| in pipelines: 01(GETNEXT)
|
04:EXCHANGE [HASH(l_orderkey)]
| mem-estimate=10.04MB mem-reservation=0B thread-reservation=0
| tuple-ids=0 row-size=8B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=81.00MB mem-reservation=5.00MB thread-reservation=2 runtime-filters-memory=1.00MB
00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
HDFS partitions=1/1 files=3 size=193.98MB
runtime filters: RF000[bloom] -> l_orderkey
stored statistics:
table: rows=6.00M size=193.98MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=2.14M
mem-estimate=80.00MB mem-reservation=4.00MB thread-reservation=1
tuple-ids=0 row-size=8B cardinality=6.00M
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=120.00MB Threads=6
Per-Host Resource Estimates: Memory=414MB
Analyzed query: SELECT /* +straight_join */ l_orderkey, o_orderstatus, count(*)
FROM tpch_parquet.lineitem INNER JOIN tpch_parquet.orders ON o_orderkey =
l_orderkey GROUP BY l_orderkey, o_orderstatus HAVING count(*) = CAST(1 AS
BIGINT)

F04:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=110.10MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: l_orderkey, o_orderstatus, count(*)
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
08:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.10MB mem-reservation=0B thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 07(GETNEXT)
|
F03:PLAN FRAGMENT [HASH(l_orderkey,o_orderstatus)] hosts=3 instances=3
Per-Instance Resources: mem-estimate=71.23MB mem-reservation=34.00MB thread-reservation=1
07:AGGREGATE [FINALIZE]
| output: count:merge(*)
| group by: l_orderkey, o_orderstatus
| having: count(*) = CAST(1 AS BIGINT)
| mem-estimate=61.13MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 07(GETNEXT), 00(OPEN)
|
06:EXCHANGE [HASH(l_orderkey,o_orderstatus)]
| mem-estimate=10.10MB mem-reservation=0B thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 00(GETNEXT)
|
F02:PLAN FRAGMENT [HASH(l_orderkey)] hosts=3 instances=3
Per-Instance Resources: mem-estimate=66.32MB mem-reservation=34.00MB thread-reservation=1
03:AGGREGATE [STREAMING]
| output: count(*)
| group by: l_orderkey, o_orderstatus
| mem-estimate=56.28MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=2 row-size=29B cardinality=4.69M
| in pipelines: 00(GETNEXT)
|
02:HASH JOIN [INNER JOIN, PARTITIONED]
| hash-table-id=00
| hash predicates: l_orderkey = o_orderkey
| fk/pk conjuncts: l_orderkey = o_orderkey
| mem-estimate=0B mem-reservation=0B spill-buffer=2.00MB thread-reservation=0
| tuple-ids=0,1 row-size=29B cardinality=5.76M
| in pipelines: 00(GETNEXT), 01(OPEN)
|
|--F05:PLAN FRAGMENT [HASH(l_orderkey)] hosts=3 instances=3
| | Per-Instance Resources: mem-estimate=45.05MB mem-reservation=35.00MB thread-reservation=1 runtime-filters-memory=1.00MB
| JOIN BUILD
| | join-table-id=00 plan-id=01 cohort-id=01
| | build expressions: o_orderkey
| | runtime filters: RF000[bloom] <- o_orderkey
| | mem-estimate=34.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| |
| 05:EXCHANGE [HASH(o_orderkey)]
| | mem-estimate=10.05MB mem-reservation=0B thread-reservation=0
| | tuple-ids=1 row-size=21B cardinality=1.50M
| | in pipelines: 01(GETNEXT)
| |
| F01:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
| Per-Instance Resources: mem-estimate=40.00MB mem-reservation=8.00MB thread-reservation=1
| 01:SCAN HDFS [tpch_parquet.orders, RANDOM]
| HDFS partitions=1/1 files=2 size=54.21MB
| stored statistics:
| table: rows=1.50M size=54.21MB
| columns: all
| extrapolated-rows=disabled max-scan-range-rows=1.18M
| mem-estimate=40.00MB mem-reservation=8.00MB thread-reservation=0
| tuple-ids=1 row-size=21B cardinality=1.50M
| in pipelines: 01(GETNEXT)
|
04:EXCHANGE [HASH(l_orderkey)]
| mem-estimate=10.04MB mem-reservation=0B thread-reservation=0
| tuple-ids=0 row-size=8B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Shared Resources: mem-estimate=1.00MB mem-reservation=1.00MB thread-reservation=0 runtime-filters-memory=1.00MB
Per-Instance Resources: mem-estimate=80.00MB mem-reservation=4.00MB thread-reservation=1
00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
HDFS partitions=1/1 files=3 size=193.98MB
runtime filters: RF000[bloom] -> l_orderkey
stored statistics:
table: rows=6.00M size=193.98MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=2.14M
mem-estimate=80.00MB mem-reservation=4.00MB thread-reservation=0
tuple-ids=0 row-size=8B cardinality=6.00M
in pipelines: 00(GETNEXT)
====
# High NDV aggregation - should use default buffer size.
select distinct *
from tpch_parquet.lineitem
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=112.00MB Threads=4
Per-Host Resource Estimates: Memory=1012MB
Analyzed query: SELECT DISTINCT * FROM tpch_parquet.lineitem

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=110.69MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.69MB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 03(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(tpch_parquet.lineitem.l_orderkey,tpch_parquet.lineitem.l_partkey,tpch_parquet.lineitem.l_suppkey,tpch_parquet.lineitem.l_linenumber,tpch_parquet.lineitem.l_quantity,tpch_parquet.lineitem.l_extendedprice,tpch_parquet.lineitem.l_discount,tpch_parquet.lineitem.l_tax,tpch_parquet.lineitem.l_returnflag,tpch_parquet.lineitem.l_linestatus,tpch_parquet.lineitem.l_shipdate,tpch_parquet.lineitem.l_commitdate,tpch_parquet.lineitem.l_receiptdate,tpch_parquet.lineitem.l_shipinstruct,tpch_parquet.lineitem.l_shipmode,tpch_parquet.lineitem.l_comment)] hosts=3 instances=3
Per-Host Resources: mem-estimate=473.84MB mem-reservation=34.00MB thread-reservation=1
03:AGGREGATE [FINALIZE]
| group by: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment
| mem-estimate=463.16MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 03(GETNEXT), 00(OPEN)
|
02:EXCHANGE [HASH(tpch_parquet.lineitem.l_orderkey,tpch_parquet.lineitem.l_partkey,tpch_parquet.lineitem.l_suppkey,tpch_parquet.lineitem.l_linenumber,tpch_parquet.lineitem.l_quantity,tpch_parquet.lineitem.l_extendedprice,tpch_parquet.lineitem.l_discount,tpch_parquet.lineitem.l_tax,tpch_parquet.lineitem.l_returnflag,tpch_parquet.lineitem.l_linestatus,tpch_parquet.lineitem.l_shipdate,tpch_parquet.lineitem.l_commitdate,tpch_parquet.lineitem.l_receiptdate,tpch_parquet.lineitem.l_shipinstruct,tpch_parquet.lineitem.l_shipmode,tpch_parquet.lineitem.l_comment)]
| mem-estimate=10.69MB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=427.37MB mem-reservation=74.00MB thread-reservation=2
01:AGGREGATE [STREAMING]
| group by: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment
| mem-estimate=347.37MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
HDFS partitions=1/1 files=3 size=193.98MB
stored statistics:
table: rows=6.00M size=193.98MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=2.14M
mem-estimate=80.00MB mem-reservation=40.00MB thread-reservation=1
tuple-ids=0 row-size=231B cardinality=6.00M
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=112.00MB Threads=3
Per-Host Resource Estimates: Memory=1012MB
Analyzed query: SELECT DISTINCT * FROM tpch_parquet.lineitem

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=110.69MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment
| mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=10.69MB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 03(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(tpch_parquet.lineitem.l_orderkey,tpch_parquet.lineitem.l_partkey,tpch_parquet.lineitem.l_suppkey,tpch_parquet.lineitem.l_linenumber,tpch_parquet.lineitem.l_quantity,tpch_parquet.lineitem.l_extendedprice,tpch_parquet.lineitem.l_discount,tpch_parquet.lineitem.l_tax,tpch_parquet.lineitem.l_returnflag,tpch_parquet.lineitem.l_linestatus,tpch_parquet.lineitem.l_shipdate,tpch_parquet.lineitem.l_commitdate,tpch_parquet.lineitem.l_receiptdate,tpch_parquet.lineitem.l_shipinstruct,tpch_parquet.lineitem.l_shipmode,tpch_parquet.lineitem.l_comment)] hosts=3 instances=3
Per-Instance Resources: mem-estimate=473.84MB mem-reservation=34.00MB thread-reservation=1
03:AGGREGATE [FINALIZE]
| group by: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment
| mem-estimate=463.16MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 03(GETNEXT), 00(OPEN)
|
02:EXCHANGE [HASH(tpch_parquet.lineitem.l_orderkey,tpch_parquet.lineitem.l_partkey,tpch_parquet.lineitem.l_suppkey,tpch_parquet.lineitem.l_linenumber,tpch_parquet.lineitem.l_quantity,tpch_parquet.lineitem.l_extendedprice,tpch_parquet.lineitem.l_discount,tpch_parquet.lineitem.l_tax,tpch_parquet.lineitem.l_returnflag,tpch_parquet.lineitem.l_linestatus,tpch_parquet.lineitem.l_shipdate,tpch_parquet.lineitem.l_commitdate,tpch_parquet.lineitem.l_receiptdate,tpch_parquet.lineitem.l_shipinstruct,tpch_parquet.lineitem.l_shipmode,tpch_parquet.lineitem.l_comment)]
| mem-estimate=10.69MB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Instance Resources: mem-estimate=427.37MB mem-reservation=74.00MB thread-reservation=1
01:AGGREGATE [STREAMING]
| group by: tpch_parquet.lineitem.l_orderkey, tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, tpch_parquet.lineitem.l_comment
| mem-estimate=347.37MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=231B cardinality=6.00M
| in pipelines: 00(GETNEXT)
|
00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
HDFS partitions=1/1 files=3 size=193.98MB
stored statistics:
table: rows=6.00M size=193.98MB
columns: all
extrapolated-rows=disabled max-scan-range-rows=2.14M
mem-estimate=80.00MB mem-reservation=40.00MB thread-reservation=0
tuple-ids=0 row-size=231B cardinality=6.00M
in pipelines: 00(GETNEXT)
====
# Aggregation with unknown input - should use default buffer size.
select string_col, count(*)
from functional_parquet.alltypestiny
group by string_col
---- DISTRIBUTEDPLAN
Max Per-Host Resource Reservation: Memory=72.01MB Threads=4
Per-Host Resource Estimates: Memory=282MB
WARNING: The following tables are missing relevant table and/or column statistics.
functional_parquet.alltypestiny
Analyzed query: SELECT string_col, count(*) FROM functional_parquet.alltypestiny
GROUP BY string_col

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=10.07MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: string_col, count(*)
| mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=71.99KB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 03(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(string_col)] hosts=3 instances=3
Per-Host Resources: mem-estimate=128.07MB mem-reservation=34.00MB thread-reservation=1
03:AGGREGATE [FINALIZE]
| output: count:merge(*)
| group by: string_col
| mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 03(GETNEXT), 00(OPEN)
|
02:EXCHANGE [HASH(string_col)]
| mem-estimate=71.99KB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=144.00MB mem-reservation=34.01MB thread-reservation=2
01:AGGREGATE [STREAMING]
| output: count(*)
| group by: string_col
| mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 00(GETNEXT)
|
00:SCAN HDFS [functional_parquet.alltypestiny, RANDOM]
HDFS partitions=4/4 files=4 size=11.92KB
stored statistics:
table: rows=unavailable size=unavailable
partitions: 0/4 rows=unavailable
columns: unavailable
extrapolated-rows=disabled max-scan-range-rows=unavailable
mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1
tuple-ids=0 row-size=12B cardinality=unavailable
in pipelines: 00(GETNEXT)
---- PARALLELPLANS
Max Per-Host Resource Reservation: Memory=140.02MB Threads=5
Per-Host Resource Estimates: Memory=554MB
WARNING: The following tables are missing relevant table and/or column statistics.
functional_parquet.alltypestiny
Analyzed query: SELECT string_col, count(*) FROM functional_parquet.alltypestiny
GROUP BY string_col

F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Instance Resources: mem-estimate=10.09MB mem-reservation=4.00MB thread-reservation=1
PLAN-ROOT SINK
| output exprs: string_col, count(*)
| mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
04:EXCHANGE [UNPARTITIONED]
| mem-estimate=95.99KB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 03(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(string_col)] hosts=3 instances=4
Per-Instance Resources: mem-estimate=128.09MB mem-reservation=34.00MB thread-reservation=1
03:AGGREGATE [FINALIZE]
| output: count:merge(*)
| group by: string_col
| mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 03(GETNEXT), 00(OPEN)
|
02:EXCHANGE [HASH(string_col)]
| mem-estimate=95.99KB mem-reservation=0B thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 00(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=4
Per-Instance Resources: mem-estimate=144.00MB mem-reservation=34.01MB thread-reservation=1
01:AGGREGATE [STREAMING]
| output: count(*)
| group by: string_col
| mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB thread-reservation=0
| tuple-ids=1 row-size=20B cardinality=unavailable
| in pipelines: 00(GETNEXT)
|
00:SCAN HDFS [functional_parquet.alltypestiny, RANDOM]
HDFS partitions=4/4 files=4 size=11.92KB
stored statistics:
table: rows=unavailable size=unavailable
partitions: 0/4 rows=unavailable
columns: unavailable
extrapolated-rows=disabled max-scan-range-rows=unavailable
mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=0
tuple-ids=0 row-size=12B cardinality=unavailable
in pipelines: 00(GETNEXT)
====