impala

mirror of https://github.com/apache/impala.git synced 2025-12-30 12:02:10 -05:00

Files

Lars Volker 8ea21d099f IMPALA-2523: Make HdfsTableSink aware of clustered input

IMPALA-2521 introduced clustering for insert statements. This change
makes the HdfsTableSink aware of clustered inputs, so that partitions
are opened, written, and closed one by one.
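
The one-partition-at-a-time behavior described above can be illustrated
with a minimal sketch. This is not Impala's HdfsTableSink (which is part
of the C++ backend); the class ClusteredSink, its methods, and the
(partition_key, payload) row shape are hypothetical names chosen for
illustration. It assumes rows arrive already grouped by partition key,
so at most one partition is open at any time.

```python
class ClusteredSink:
    """Toy sink (hypothetical) that relies on input clustered by partition key."""

    def __init__(self):
        self.files = {}        # closed partitions: key -> list of written rows
        self.events = []       # open/close log, for inspection only
        self._key = None       # key of the partition that is currently open
        self._rows = []        # rows written to the current partition
        self._is_open = False  # whether a partition is currently open

    def send(self, batch):
        """Consume one row batch of (partition_key, payload) tuples."""
        for key, payload in batch:
            if not self._is_open or key != self._key:
                # The key changed: finish the previous partition first.
                self._close_current()
                if key in self.files:
                    # A closed partition came back, so the input was not clustered.
                    raise ValueError(f"input is not clustered: {key!r} reappeared")
                self._key, self._rows, self._is_open = key, [], True
                self.events.append(f"open {key}")
            self._rows.append(payload)

    def _close_current(self):
        if self._is_open:
            self.files[self._key] = self._rows
            self.events.append(f"close {self._key}")
            self._is_open = False

    def close(self):
        """Flush the last open partition at end of input."""
        self._close_current()
```

A partition may span several row batches; it is closed only when the key
changes or the input ends, which is why the sketch produces a single
"file" per partition regardless of batch boundaries.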

This change also adds/modifies tests in several ways:

- clustered insert tests switch from selecting all rows from
  alltypessmall to alltypes. Together with varying settings for
  batch_size, this results in a larger number of row batches being
  written.
- clustered insert tests select from alltypes instead of
  functional.alltypes to make sure we also select from various input
  formats.
- clustered insert tests have been added to select from alltypestiny to
  create inserts with 1 and 2 rows per partition respectively.
- exhaustive insert tests now use different values for batch_size: 1,
  16, 0 (meaning default, 1024). This is limited to uncompressed parquet
  files, to maintain a reasonable runtime. On my machine, execution of
  test.insert took 1778 seconds, compared to 1002 seconds with just the
  default row batch size.
- There is additional testing in test_insert_behaviour.py to make sure
  that insertion over several row batches only creates one file per
  partition.
- It renames the test_insert method to make it unique in the file and
  allow for effective filtering with -k.
- It adds tests to the Analyzer test suite.
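
As a rough illustration of why smaller batch_size values exercise more
row batches, the following sketch computes an idealized batch count. The
helper names and the row count used in the usage note are hypothetical;
the only facts taken from the text above are the batch_size values 1, 16
and 0, and that 0 means the default of 1024. Real execution may split
rows differently (for example across scan ranges or fragments), so treat
the result as an approximation.

```python
import math

# Per the commit message, batch_size=0 selects the default of 1024 rows.
DEFAULT_BATCH_SIZE = 1024


def effective_batch_size(batch_size):
    """Map the batch_size query option to the number of rows per batch."""
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    return DEFAULT_BATCH_SIZE if batch_size == 0 else batch_size


def num_row_batches(num_rows, batch_size):
    """Idealized number of row batches for a single stream of num_rows rows."""
    return math.ceil(num_rows / effective_batch_size(batch_size))
```

For a hypothetical input of 2000 rows, num_row_batches(2000, 1) is 2000,
num_row_batches(2000, 16) is 125, and num_row_batches(2000, 0) is 2.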

Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Reviewed-on: http://gerrit.cloudera.org:8080/4863
Reviewed-by: Lars Volker <lv@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins

2016-11-22 02:51:20 +00:00

queries

IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-22 02:51:20 +00:00

functional-query_core.csv

IMPALA-3718: Support subset of functional-query for Kudu

2016-09-14 22:11:04 +00:00

functional-query_dimensions.csv

Starting Kudu as part of the run-all.sh command / data loading

2015-06-01 15:53:34 -07:00

functional-query_exhaustive.csv

IMPALA-3718: Support subset of functional-query for Kudu