impala

mirror of https://github.com/apache/impala.git synced 2026-01-07 00:02:28 -05:00

Alex Behm 7067a5d94d IMPALA-1519: Fix wrapping of exprs via a TupleIsNullPredicate with analytics.

The bug:
Analytic functions introduced a few challenges in properly wrapping
exprs with TupleIsNullPredicates when substituting exprs from outer-joined
inline views.

1. The logical to physical tuple mapping during the plan generation of analytics
invalidated the tuple ids originally set in upstream TupleIsNullPredicates
introduced during analysis (e.g., in the result exprs).

2. TupleIsNullPredicates require specific tuple ids for evaluation.
Since a sort node materializes a new tuple, it's impossible to evaluate
TupleIsNullPredicates referring to a sort's input after the sort.
Non-analytic sorts handle this case during analysis by materializing
the result of that select block. However, analytic sorts used to only materialize
the slots of materialized tuple ids of the input plan node.
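
As an illustration of why the wrapping and its tuple ids matter, here is a minimal toy model in Python. It is not Impala's frontend code: the row representation and the names `tuple_is_null` and `wrap_expr` are hypothetical, and the predicate is assumed to be true when all of its tuples are NULL. A constant expr coming from an outer-joined inline view must evaluate to NULL for rows without a join partner, and the wrapper can only do that if its tuple ids are still valid for the row it sees.

```python
# Toy model, not Impala code: a row maps tuple ids to a tuple (dict) or None.
# None stands for a tuple that was NULL-padded by an outer join.

def tuple_is_null(tuple_ids):
    """Return a predicate that is True when all given tuples are NULL in the row."""
    def pred(row):
        # A stale tuple id (absent from the row) raises KeyError, which mimics
        # a predicate whose tuple ids were invalidated by a later remapping.
        return all(row[tid] is None for tid in tuple_ids)
    return pred

def wrap_expr(expr, tuple_ids):
    """Wrap expr so that it yields None (NULL) when the given tuples are NULL."""
    is_null = tuple_is_null(tuple_ids)
    return lambda row: None if is_null(row) else expr(row)

# Hypothetical inline view "SELECT 1 AS c FROM t2" that materializes tuple id 1.
const_one = lambda row: 1
wrapped = wrap_expr(const_one, [1])

matched_row = {0: {"id": 7}, 1: {"id": 7}}
unmatched_row = {0: {"id": 8}, 1: None}  # no join partner on the outer-joined side
```

In this sketch, `wrapped(unmatched_row)` returns `None` while the unwrapped `const_one(unmatched_row)` would incorrectly return `1`. Wrapping with a tuple id that no longer exists in the row fails outright.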

The fixes:

1. Move the TupleIsNullPredicate wrapping from the inline-view analysis into
the inline-view planning. This avoids the original problem because all physical
output tuples are known during plan generation. This simple change has a few
subtle consequences: First, we must rely on the plan root's output smap for
substituting the final result exprs, and *not* use the top-level base table smap
generated during analysis. Second, during plan generation we must use an inline
view's smap (and *not* its base table smap) for generating the output smap of its
plan such that we can properly wrap the rhs exprs in TupleIsNullPredicates
at every level.
This change also fixes IMPALA-1946 by deferring the TupleIsNullPredicate wrapping
to planning time.

2. To preserve the information whether an input tuple was null or not at an
analytic sort, we materialize TupleIsNullPredicates, which are then substituted
by a SlotRef into the sort's tuple in ancestor nodes.
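
The idea behind fix 2 can be sketched with a toy model in Python. It is not Impala's frontend code: the row layout, the tuple ids, and the names `analytic_sort`, `IS_NULL_SLOT`, and `const_after_sort` are all hypothetical. The sort evaluates the is-null check while its input tuple is still visible and stores the result as an ordinary slot of the new tuple, so exprs above the sort read that slot instead of referring to the input tuple.

```python
# Toy model, not Impala code. Input rows map tuple ids to a tuple (dict) or None.
SORT_TUPLE_ID = 2            # hypothetical id of the tuple produced by the sort
IS_NULL_SLOT = "t1_is_null"  # hypothetical slot holding the materialized predicate

def analytic_sort(rows, sort_key):
    """Sort rows, emitting a single new tuple per row as a sort node would."""
    out = []
    for row in rows:
        view = row[1]  # tuple 1 comes from the outer-joined inline view
        sort_tuple = {
            "id": row[0]["id"],
            "v": None if view is None else view["v"],
            # Evaluate the is-null check while tuple 1 is still visible and
            # keep the result as a regular slot of the sort tuple.
            IS_NULL_SLOT: view is None,
        }
        out.append({SORT_TUPLE_ID: sort_tuple})
    return sorted(out, key=lambda r: sort_key(r[SORT_TUPLE_ID]))

def const_after_sort(row):
    """Expr above the sort: a slot reference replaces the is-null predicate."""
    return None if row[SORT_TUPLE_ID][IS_NULL_SLOT] else 1

rows = [
    {0: {"id": 2}, 1: None},       # NULL-padded by the outer join
    {0: {"id": 1}, 1: {"v": 10}},
]
result = [const_after_sort(r) for r in analytic_sort(rows, lambda t: t["id"])]
```

After the sort, tuple ids 0 and 1 no longer exist in the rows, yet the constant is still nulled out correctly for the row that had no join partner.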

This patch also cleans up and consolidates the code used for wrapping exprs into
TupleIsNullPredicate itself.

Change-Id: I5c6d142bdf9c99ece2a564e557d4ffe22ac90865
Reviewed-on: http://gerrit.cloudera.org:8080/317
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins

2015-04-14 23:33:20 +00:00

functional-planner

IMPALA-1519: Fix wrapping of exprs via a TupleIsNullPredicate with analytics.

2015-04-14 23:33:20 +00:00

functional-query

IMPALA-1519: Fix wrapping of exprs via a TupleIsNullPredicate with analytics.

2015-04-14 23:33:20 +00:00

hive-benchmark

Refactor testing framework to generate Avro tables.

2014-01-08 10:48:45 -08:00

targeted-perf

Add targeted-perf query that makes local expr allocations

2014-10-07 15:48:32 -07:00

targeted-stress

BufferedBlockMgr: bug fixes for stress.

2014-10-06 15:09:13 -07:00

tpcds

Nested types: BE changes for Parquet struct support

2015-02-26 00:19:25 +00:00

tpcds-insert

[CDH5] Modified TPCDS schema and queries to match Impala TPCDS kit

2014-08-08 02:20:40 -07:00

tpch

IMPALA-1705: Support writing values larger than 64KB to Parquet files

2015-03-03 05:44:55 +00:00

README

Move functional data loading to new framework + initial changes for workload directory structure

2014-01-08 10:44:18 -08:00

README

This directory contains Impala test workloads. Each workload directory should follow this layout:

workloads/
   <data set name>/<data set name>_dimensions.csv  <- The test dimension file
   <data set name>/<data set name>_core.csv  <- A test vector file
   <data set name>/<data set name>_pairwise.csv
   <data set name>/<data set name>_exhaustive.csv
   <data set name>/queries/<query test>.test <- The queries for this workload
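
As a rough sketch of the layout above, the following Python helper lists the paths a workload would be expected to provide and reports which ones are absent. The function names and the example query test name `tpch-q1` are hypothetical and not part of the Impala test framework.

```python
import os
import posixpath

def expected_workload_files(data_set, query_tests=()):
    """Return the relative paths implied by the layout for one data set."""
    files = [
        posixpath.join(data_set, f"{data_set}_{suffix}.csv")
        for suffix in ("dimensions", "core", "pairwise", "exhaustive")
    ]
    # Query files live under <data set name>/queries/<query test>.test
    files += [posixpath.join(data_set, "queries", f"{name}.test")
              for name in query_tests]
    return files

def missing_files(workloads_dir, data_set, query_tests=()):
    """Return the expected paths that do not exist under workloads_dir."""
    return [p for p in expected_workload_files(data_set, query_tests)
            if not os.path.isfile(os.path.join(workloads_dir, p))]
```

For example, `expected_workload_files("tpch", ["tpch-q1"])` yields `tpch/tpch_dimensions.csv`, the three test vector files, and `tpch/queries/tpch-q1.test`.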