impala

mirror of https://github.com/apache/impala.git synced 2026-01-25 09:01:08 -05:00

Tim Armstrong 588e1d46e9 IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner

Impala already supported RLE encoding for levels and dictionary pages, so
the only task was to integrate it into BoolColumnReader.

A new benchmark, rle-benchmark.cc, is added to test the speed of RLE
decoding for different bit widths and run lengths.

There might be a small performance impact on PLAIN encoded booleans,
because of the additional branch when the cache of BoolColumnReader is
filled. As the cache size is 128, I considered this to be outside the
"hot loop".

Testing:

As Impala cannot write RLE encoded bool columns at the moment, parquet-mr
was used to create a test file, testdata/data/rle_encoded_bool.parquet.

tests/query_test/test_scanners.py#test_rle_encoded_bools creates a table
that uses this file and queries it.

Change-Id: I4644bf8cf5d2b7238b05076407fbf78ab5d2c14f
Reviewed-on: http://gerrit.cloudera.org:8080/9403
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins

2018-03-22 02:47:33 +00:00

functional-planner

IMPALA-5270: Pass resolved exprs into analytic SortInfo.

2018-03-15 02:00:46 +00:00

functional-query

IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner

2018-03-22 02:47:33 +00:00

hive-benchmark

Refactor testing framework to generate Avro tables.

2014-01-08 10:48:45 -08:00

perf-regression

IMPALA-3311: fix string data coming out of aggs in subplans

2016-05-12 23:06:36 -07:00

targeted-perf

IMPALA-6621: Improve set lookup performance for in-predicate evaluation

2018-03-21 00:40:10 +00:00

targeted-stress

IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-05 01:03:02 +00:00

tpcds

IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-14 21:38:06 +00:00

tpcds-insert

[CDH5] Modified TPCDS schema and queries to match Impala TPCDS kit

2014-08-08 02:20:40 -07:00

tpch

IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-14 21:38:06 +00:00

tpch_nested

IMPALA-4924: Enable Decimal V2 by default

2018-01-25 04:33:11 +00:00

README

Move functional data loading to new framework + initial changes for workload directory structure

2014-01-08 10:44:18 -08:00

README

This directory contains Impala test workloads. Each workload should use the following directory layout:

workloads/
   <data set name>/<data set name>_dimensions.csv  <- The test dimension file
   <data set name>/<data set name>_core.csv  <- A test vector file
   <data set name>/<data set name>_pairwise.csv
   <data set name>/<data set name>_exhaustive.csv
   <data set name>/queries/<query test>.test <- The queries for this workload
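
As a rough illustration of the layout above, the following sketch lists the paths a workload directory would be expected to contain and reports which ones are missing. It is not part of the Impala test framework: the function names `expected_paths` and `missing_paths` are hypothetical, and the check only assumes the four CSV suffixes and the `queries/` directory shown in the layout.

```python
import os

# CSV suffixes copied from the layout above.
EXPECTED_CSV_SUFFIXES = ("dimensions", "core", "pairwise", "exhaustive")


def expected_paths(workloads_dir, data_set):
    """Return the paths the layout calls for (hypothetical helper)."""
    base = os.path.join(workloads_dir, data_set)
    paths = [
        os.path.join(base, "%s_%s.csv" % (data_set, suffix))
        for suffix in EXPECTED_CSV_SUFFIXES
    ]
    # The .test query files live under <data set name>/queries/.
    paths.append(os.path.join(base, "queries"))
    return paths


def missing_paths(workloads_dir, data_set):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_paths(workloads_dir, data_set)
            if not os.path.exists(p)]
```

For example, `missing_paths("workloads", "tpch")` would return an empty list if the `tpch` workload has all four CSV files and a `queries` directory.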