impala

mirror of https://github.com/apache/impala.git synced 2025-12-30 03:01:44 -05:00

poojanilangekar c6f9b61ec2 IMPALA-6625: Skip computing parquet conjuncts for non-Parquet scans

This change ensures that the planner computes parquet conjuncts
only for scans containing parquet files. It also handles the
PARQUET_DICTIONARY_FILTERING and PARQUET_READ_STATISTICS query
options in the planner.

Testing was carried out independently on parquet and non-parquet
scans:
  1. Parquet scans were tested via the existing parquet-filtering
     planner test. Additionally, a new test
     [parquet-filtering-disabled] was added to ensure that the
     explain plan generated skips parquet predicates based on the
     query options.
  2. Non-parquet scans were tested manually to ensure that the
     functions to compute parquet conjuncts were not invoked.
     Additional test cases were added to the parquet-filtering
     planner test to scan non-parquet tables and ensure that the
     plans do not contain conjuncts based on parquet statistics.
  3. A parquet partition was added to the alltypesmixedformat
     table in the functional database. Planner tests were added
     to ensure that Parquet conjuncts are constructed only when
     the Parquet partition is included in the query.

Change-Id: I9d6c26d42db090c8a15c602f6419ad6399c329e7
Reviewed-on: http://gerrit.cloudera.org:8080/10704
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

2018-07-06 02:06:50 +00:00

functional-planner

IMPALA-6625: Skip computing parquet conjuncts for non-Parquet scans

2018-07-06 02:06:50 +00:00

functional-query

IMPALA-6625: Skip computing parquet conjuncts for non-Parquet scans

2018-07-06 02:06:50 +00:00

hive-benchmark

Refactor testing framework to generate Avro tables.

2014-01-08 10:48:45 -08:00

perf-regression

IMPALA-3311: fix string data coming out of aggs in subplans

2016-05-12 23:06:36 -07:00

targeted-perf

IMPALA-6819: Add new queries to targeted-perf workload

2018-05-09 23:08:44 +00:00

targeted-stress

IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-05 01:03:02 +00:00

tpcds

IMPALA-5717: Support for reading ORC data files

2018-04-11 05:13:02 +00:00

tpcds-insert

[CDH5] Modified TPCDS schema and queries to match Impala TPCDS kit

2014-08-08 02:20:40 -07:00

tpcds-unmodified

IMPALA-6819: Add new performance test workload - tpcds-unmodified used by Impala Performance Tests

2018-05-13 09:06:06 +00:00

tpch

IMPALA-6781: expand ORDER BY in some TPCH queries

2018-05-09 19:07:00 +00:00

tpch_nested

IMPALA-4924: Enable Decimal V2 by default

2018-01-25 04:33:11 +00:00

README

Move functional data loading to new framework + initial changes for workload directory structure

2014-01-08 10:44:18 -08:00

README

This directory contains Impala test workloads. The directory layout for the workloads should follow:

workloads/
   <data set name>/<data set name>_dimensions.csv  <- The test dimension file
   <data set name>/<data set name>_core.csv  <- A test vector file
   <data set name>/<data set name>_pairwise.csv
   <data set name>/<data set name>_exhaustive.csv
   <data set name>/queries/<query test>.test <- The queries for this workload
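
The layout above can be expressed as a small path-building helper. This is only an illustrative sketch, not part of the Impala test framework: the function name expected_workload_files and its parameters are hypothetical, and the only things taken from this README are the four CSV suffixes and the queries/<query test>.test pattern.

```python
from pathlib import PurePosixPath

# CSV suffixes as listed in the layout above.
CSV_SUFFIXES = ("dimensions", "core", "pairwise", "exhaustive")


def expected_workload_files(data_set, query_tests=(), root="workloads"):
    """Return the relative paths the layout implies for one data set.

    Hypothetical helper: `data_set` fills in <data set name>, and each
    entry of `query_tests` fills in <query test>.
    """
    base = PurePosixPath(root) / data_set
    # One CSV per suffix: <data set name>/<data set name>_<suffix>.csv
    paths = [base / f"{data_set}_{suffix}.csv" for suffix in CSV_SUFFIXES]
    # One .test file per query test under queries/
    paths += [base / "queries" / f"{name}.test" for name in query_tests]
    return [str(p) for p in paths]


if __name__ == "__main__":
    # "my-queries" is a made-up query test name used only for illustration.
    for path in expected_workload_files("tpch", ["my-queries"]):
        print(path)
```

Such a list could be compared against the files actually present in a workload directory to spot a missing vector file or a misnamed CSV.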