Commit Graph

44 Commits

Author SHA1 Message Date
Ippokratis Pandis
48699de6e3 IMPALA-1621,2241,2271,2330,2352: Lazy switch to IO buffers to reduce min mem needed for PAGG/PHJ
PAGG and PHJ were using an all-or-nothing approach with respect to
spilling. In particular, they were trying to switch to IO-sized buffers
for both streams (aggregated and unaggregated in PAGG; build and probe
in PHJ) of every partition (currently 16 partitions for a total of 32
streams), even if some of the streams had very few rows, were empty, or
simply would never spill, so there was no need to allocate IO buffers
for them. That increased the min mem needed by those operators in many
queries.

This patch decouples the decision to switch to IO buffers for each
stream of each partition. A stream switches to IO-sized buffers only
when the rows it contains no longer fit in the first two small buffers
(64KB and 512KB respectively). When we decide to spill a partition, we
switch both of its streams to IO buffers.
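
As an illustration only, here is a minimal C++ sketch of that
per-stream decision; the names are hypothetical, not the actual
BufferedTupleStream code:

  #include <cstdint>

  // Sketch: each stream decides on its own when to leave small buffers.
  constexpr int64_t kSmallBufferBytes = 64 * 1024 + 512 * 1024;  // the two small buffers

  struct Stream {
    bool using_small_buffers = true;
    int64_t bytes_written = 0;
  };

  // Stand-in for asking the block manager for an IO-sized buffer.
  bool AcquireIoBuffer(Stream* s) { (void)s; return true; }

  // Called before appending a row; only a stream that has outgrown its two
  // small buffers pays for an IO-sized buffer. The other streams are untouched.
  bool EnsureCapacity(Stream* s, int64_t row_size) {
    if (!s->using_small_buffers) return true;
    if (s->bytes_written + row_size <= kSmallBufferBytes) return true;
    s->using_small_buffers = false;
    return AcquireIoBuffer(s);
  }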

With this change, many streams of PAGG and PHJ nodes no longer need
IO-sized buffers, reducing the min mem requirement. For example, below
is the min mem needed (in MBs) for some of the TPC-H queries. Some now
need half or less of the memory they needed before:

  TPC-H Q3: 645 -> 240
  TPC-H Q5: 375 -> 245
  TPC-H Q7: 685 -> 265
  TPC-H Q8: 740 -> 250
  TPC-H Q9: 650 -> 400
  TPC-H Q18: 1100 -> 425
  TPC-H Q20: 420 -> 250
  TPC-H Q21: 975 -> 620

To make this small-buffer optimization work, we had to fix
IMPALA-2352. That is, the AllocateRow() call in
PAGG::ConstructIntermediateTuple() could return unsuccessfully just
because the small buffers of the stream were exhausted. Previously we
treated that as an indication that there was no memory left, started
spilling a partition, and switched all streams to IO buffers. Now we
make a best effort: we first try SwitchToIoBuffers() and, if that is
successful, we re-attempt the AllocateRow() call. See IMPALA-2352 for
more details.
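
A hedged sketch of that best-effort retry; the names below are
illustrative stand-ins, not the actual PartitionedAggregationNode code:

  #include <cstdint>

  struct Stream { bool using_small_buffers = true; };

  // Stand-ins for the stream calls referenced above.
  uint8_t* AllocateRow(Stream* s, int64_t size) { (void)s; (void)size; return nullptr; }
  bool SwitchToIoBuffers(Stream* s) { s->using_small_buffers = false; return true; }

  // Returns the allocated row, or nullptr only when memory is genuinely
  // exhausted and the caller should start spilling a partition.
  uint8_t* AllocateRowBestEffort(Stream* s, int64_t size) {
    uint8_t* row = AllocateRow(s, size);
    if (row != nullptr) return row;
    // The failure may only mean the small buffers are used up: switch this
    // one stream to IO-sized buffers and re-attempt the allocation.
    if (s->using_small_buffers && SwitchToIoBuffers(s)) {
      row = AllocateRow(s, size);
    }
    return row;  // still nullptr => treat as out of memory and spill
  }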

Another change is that SwitchToIoBuffers() now resets the
using_small_buffers_ flag back to false if we are in a very low memory
situation and it fails to get a buffer. That allows us to retry
SwitchToIoBuffers() once we free up some space. See IMPALA-2330 for
more details.

With the above fixes we should also have fixed IMPALA-2241 and
IMPALA-2271, which are essentially DCHECK failures related to
stream::using_small_buffers_.

This patch adds all 22 TPC-H queries to the test_mem_usage_scaling
test and updates its per-query min mem limits. Additionally, it adds a
new aggregation test that uses the TPC-H dataset for larger
aggregations (TestTPCHAggregationQueries). It also removes some dead
test code.

Change-Id: Ia8ccd0b76f6d37562be21fd4539aedbc2a864d38
Reviewed-on: http://gerrit.cloudera.org:8080/818
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: Internal Jenkins

Conflicts:

	tests/query_test/test_aggregation.py
2015-09-23 11:07:42 -07:00
ishaan
dbc78aaa2c Enable Isilon end-to-end tests for Impala.
This patch introduces changes to run tests against Isilon, combined with minor cleanup of
the test and client code.
For Isilon, it:
  - Populates the SkipIfIsilon class with appropriate pytest markers.
  - Introduces a new default for the hdfs client in order to connect to Isilon.
  - Cleans up a few test files to take the underlying filesystem into account.
  - Cleans up the interface for metadata/test_insert_behaviour and query_test/test_ddl.

On the client side, we introduce a wrapper around a few of pywebhdfs's methods. Specifically:
  - delete_file_dir does not throw an error if the file does not exist.
  - get_file_dir_status automatically strips the leading '/'.

Change-Id: Ic630886e253e43b2daaf5adc8dedc0a271b0391f
Reviewed-on: http://gerrit.cloudera.org:8080/370
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-05-27 22:25:12 +00:00
Ippokratis Pandis
4d428440d8 IMPALA-1919: Avoid calling ProcessBatch with out_batch->AtCapacity in right joins
PHJ::GetNext() for RIGHT_OUTER, RIGHT_ANTI and FULL_OUTER joins that
had repartitioned was not checking whether the output batch had reached
capacity after the OutputUnmatchedBuild() call. In the case of
repartitioned joins where the list of build partitions was exhausted
and the output batch had already reached capacity, we would call
ProcessProbeBatch() with a full output batch, resulting in a DCHECK.
This patch adds the missing AtCapacity() check.
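
The shape of the fix, as a hedged sketch (hypothetical types; the real
PartitionedHashJoinNode::GetNext() is considerably more involved):

  #include <vector>

  struct RowBatch {
    int rows = 0;
    int capacity = 1024;
    bool AtCapacity() const { return rows >= capacity; }
  };
  struct Partition {};

  // Stand-ins for the real output routines.
  void OutputUnmatchedBuild(Partition* p, RowBatch* out) { (void)p; out->rows = out->capacity; }
  void ProcessProbeBatch(RowBatch* out) { (void)out; }

  void GetNextSketch(std::vector<Partition*>* build_partitions, RowBatch* out) {
    // Emit unmatched build rows (RIGHT_OUTER / RIGHT_ANTI / FULL_OUTER).
    while (!build_partitions->empty()) {
      OutputUnmatchedBuild(build_partitions->back(), out);
      // The missing check: if the batch filled up here, hand it up to the
      // caller instead of falling through to ProcessProbeBatch() while full.
      if (out->AtCapacity()) return;
      build_partitions->pop_back();
    }
    ProcessProbeBatch(out);  // only reached when 'out' still has room
  }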

It also adds a new join test (tpch-out-joins) that uses the TPC-H
dataset and moves some of the join tests that already used it there.
Running join tests with the larger TPC-H dataset is needed, for
example, in order to trigger repartitions.

Change-Id: I4434ad0683e1b09f75a25b3eb870a817d4988370
Reviewed-on: http://gerrit.cloudera.org:8080/314
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: Internal Jenkins
2015-05-04 19:49:56 +00:00
Matthew Jacobs
99219488d7 IMPALA-1705: Support writing values larger than 64KB to Parquet files
Allow values larger than 64KB to be written to Parquet files. This was
previously limited by a fixed data page size. This commit removes that
limitation by allowing the page size to grow when necessary. This occurs
when there are enough unique values to switch from dictionary encoding
to plain encoding, and then there are huge values larger than the
default 64KB page size. In this case, it may be possible to write files
larger than one HDFS block, but this is an edge case and not worth
introducing additional complexity to handle.
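
Conceptually, the change amounts to letting the data page buffer grow
past the default size; a hedged sketch with illustrative names (not the
actual HdfsParquetTableWriter API):

  #include <cstdint>
  #include <vector>

  constexpr int64_t kDefaultDataPageSize = 64 * 1024;

  struct DataPage {
    std::vector<uint8_t> buffer = std::vector<uint8_t>(kDefaultDataPageSize);
    int64_t bytes_used = 0;
  };

  // Grow the current page when a single plain-encoded value would not fit,
  // instead of treating the default page size as a hard limit.
  void EnsureRoomForValue(DataPage* page, int64_t encoded_size) {
    int64_t needed = page->bytes_used + encoded_size;
    if (needed > static_cast<int64_t>(page->buffer.size())) {
      page->buffer.resize(needed);  // rare: value larger than the 64KB default
    }
  }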

Change-Id: I165ef44ba48ff0c3c3203860157a61c45f77df8b
Reviewed-on: http://gerrit.cloudera.org:8080/120
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
2015-03-03 05:44:55 +00:00
ishaan
dee6911b20 Enable loading metadata from the hive metastore snapshot and clean up build scripts.
This patch contains the following changes:
  - Add a metastore_snapshot_file parameter to build.sh
  - Enable skipping metadata loading.
  - create-load-data.sh is refactored into functions.
  - A lot of scripts source impala-config, which creates a lot of log spew. This has now
    been muted.
  - Unnecessary log spew from compute-table-stats has been muted.
  - build_thirdparty.sh determines its parallelism from the system; it was previously
    hard-coded to 4.
  - Only force-load data for a particular dataset if a schema change is detected.

Change-Id: I909336451e5c1ca57d21f040eb94c0e831546837
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5540
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-12-19 13:41:00 -08:00
Matthew Jacobs
25428fdb21 Add support for streaming decompression of gzip text
Compressed text formats currently require entire compressed files to
be read into memory and decompressed in a single call to the
decompression codec. This changes the HdfsTextScanner to drive gzip in
a streaming mode, i.e. to produce partial output as input is consumed.
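
For illustration, the general streaming pattern with plain zlib looks
roughly like the sketch below; Impala wraps this in its own
decompressor classes, so treat it as a sketch, not the scanner code:

  #include <zlib.h>
  #include <cstdio>
  #include <vector>

  // Consume compressed bytes as they arrive and emit partial output,
  // instead of buffering the whole file before decompressing.
  bool StreamingGunzip(FILE* in, FILE* out) {
    z_stream strm{};
    if (inflateInit2(&strm, 16 + MAX_WBITS) != Z_OK) return false;  // 16+: expect gzip header

    std::vector<unsigned char> inbuf(64 * 1024), outbuf(64 * 1024);
    int ret = Z_OK;
    while (ret != Z_STREAM_END) {
      strm.avail_in = static_cast<uInt>(fread(inbuf.data(), 1, inbuf.size(), in));
      if (strm.avail_in == 0) break;
      strm.next_in = inbuf.data();
      do {
        strm.avail_out = static_cast<uInt>(outbuf.size());
        strm.next_out = outbuf.data();
        ret = inflate(&strm, Z_NO_FLUSH);  // produces partial output per call
        if (ret != Z_OK && ret != Z_STREAM_END) { inflateEnd(&strm); return false; }
        fwrite(outbuf.data(), 1, outbuf.size() - strm.avail_out, out);
      } while (strm.avail_out == 0);
    }
    inflateEnd(&strm);
    return ret == Z_STREAM_END;
  }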

Change-Id: Id5c0805e18cf6b606bcf27a5df4b5f58895809fd
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5233
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 05c3cc55e7a601d97adc4eebe03f878c68a33e56)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5385
2014-11-23 01:55:55 -08:00
Taras Bobrovytsky
e5e06c307b [CDH5] Modified TPCH queries to match the specification
Change-Id: Ife2c1fae4d774cd8fe188dfe9c98042ff7e45368
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4997
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-10-29 22:07:33 -07:00
ishaan
c4b4e010ff Buffered Tuple Stream fixes.
This patch fixes two issues:
  - Add an API to the buffered block mgr to allow an atomic Unpin and GetNewBlock. This
    has the semantics of unpinning a block and giving its buffer to the new block. It
    is necessary so the tuple stream can make sure another thread does not grab the
    unpinned block in between (see the sketch after this list).
  - Buffer management when reading an unpinned stream. Before moving on to a new block
    (and unpinning the current one), we need to make sure all the tuples returned from
    the current block have been returned up the operator tree.
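
A rough sketch of the first point, with hypothetical types (the real
BufferedBlockMgr API and locking are more involved):

  #include <memory>
  #include <mutex>

  struct Buffer {};
  struct Block {
    Buffer* buffer = nullptr;
    bool pinned = false;
  };

  class BlockMgrSketch {
   public:
    // Unpin 'old_block' and hand its buffer to a fresh block inside a single
    // critical section, so no other thread can grab the just-unpinned buffer
    // in between the two steps.
    std::unique_ptr<Block> UnpinAndGetNewBlock(Block* old_block) {
      std::lock_guard<std::mutex> l(lock_);
      auto new_block = std::make_unique<Block>();
      new_block->buffer = old_block->buffer;
      new_block->pinned = true;
      old_block->buffer = nullptr;
      old_block->pinned = false;
      return new_block;
    }

   private:
    std::mutex lock_;
  };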

Change-Id: I95ee58d1019dd971f6a7dc19ecafdfa54cdbf942
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4333
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-09-20 16:05:11 -07:00
Nong Li
a4e2f97845 Fix and add spilling test.
More tests coming.

Change-Id: I09e98adb6b011575572051eff1cd52e7be689fe8
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4311
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-09-13 00:19:21 -07:00
Ippokratis Pandis
8e193abff3 IMPALA-1157: Crash in HdfsParquetWriter when value size larger than data page size.
The HdfsParquetWriter was not checking whether the size needed to encode a single
value is larger than the data page size. In this (rare) case, it would get into an
infinite loop of allocating new pages, eventually crashing when it ran out of
memory.
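
In essence the guard looks like the following hedged sketch
(illustrative names, not the actual writer code):

  #include <cstdint>

  constexpr int64_t kDataPageSize = 64 * 1024;

  struct Page { int64_t bytes_remaining = kDataPageSize; };

  // Before the fix, a value that could never fit triggered page allocation in
  // a loop until the process ran out of memory. The first check is the fix.
  bool AppendValue(Page* page, int64_t encoded_size) {
    if (encoded_size > kDataPageSize) return false;  // error: value larger than a data page
    if (encoded_size > page->bytes_remaining) {
      *page = Page{};  // roll over (the real writer flushes the old page first)
    }
    page->bytes_remaining -= encoded_size;
    return true;
  }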

Change-Id: I1375c047406ebd863fd8fb826ff1ecc0cb6335bc
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3870
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins
2014-08-17 12:46:17 -07:00
Alex Behm
5798cf7c6f CDH-18432: Fix assignment of On-clause predicates of semi-/anti-joins.
Change-Id: Iad4b772f170ea70ded0747ce55972b0a5194f88a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3852
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-08-17 12:24:51 -07:00
ishaan
2b5df0c6ff [CDH5] Convert tpch schemas to decimal and change the queries where possible.
I used the following document for reference: http://www.tpc.org/tpch/spec/tpch2.1.0.pdf

Change-Id: Ic84db0628323c90e89552707f214bbb9fa2f2ae0
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3132
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-07-08 14:51:43 -07:00
Victor Bittorf
808f9a661a IMPALA-939: Regex should match anywhere in string.
Change-Id: I8dcd337c3b06b632017270670a4f199ec7ada648
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2296
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit c97f82eaaf0efe9bd4c3da3d005464f425696a62)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2371
2014-04-25 16:16:15 -07:00
Nong Li
d401f746d4 IMPALA-692: Fix data corruption with dictionary encoded values.
We weren't clearing the state in the dictionary when rolling over to a new
page. The memory for the dictionary (built from the first file) was cleared
but the dictionary entries were not.

This also had a minor side effect that unused dictionary entries from the first
page were still being written out for subsequent pages, although in practice,
this is unlikely to affect the file size much.
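
A hedged sketch of the shape of the bug (illustrative; not Impala's
actual dictionary encoder):

  #include <string>
  #include <unordered_map>

  class DictEncoderSketch {
   public:
    // Returns the dictionary index for 'value', adding it if not present.
    int GetOrAddIndex(const std::string& value) {
      auto it = index_.find(value);
      if (it != index_.end()) return it->second;
      int idx = static_cast<int>(index_.size());
      index_.emplace(value, idx);
      return idx;
    }

    // The bug: only the dictionary's memory was freed on rollover, while this
    // lookup state kept stale entries, so later values could be encoded
    // against indices that no longer matched what was actually written.
    void ClearForRollover() { index_.clear(); }

   private:
    std::unordered_map<std::string, int> index_;
  };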

Change-Id: I8e11fc4723dc23d21c5de8a42def13d8238c137b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1072
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:24 -08:00
Lenni Kuff
a2cbd2820e Add Catalog Service and support for automatic metadata refresh
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles executing metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface to allow impalads to
directly connect and execute their DDL operations.
The CatalogService has two main components. One is a C++ server that implements
StateStore integration, the Thrift service implementation, and exporting of the debug
webpage/metrics. The other main component is the Java Catalog that manages caching and
updating of all the metadata. For each StateStore heartbeat, a delta of all metadata
updates is broadcast to the rest of the cluster.
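
The delta mechanism can be pictured with this hedged sketch (written in
C++ for brevity and assuming a per-object catalog version; the real
Java Catalog and Thrift types differ):

  #include <cstdint>
  #include <string>
  #include <vector>

  // Each catalog object records the catalog version at which it last changed;
  // a heartbeat ships only objects newer than what was already broadcast.
  struct CatalogObjectSketch {
    std::string name;
    int64_t catalog_version = 0;
  };

  std::vector<CatalogObjectSketch> ComputeDelta(
      const std::vector<CatalogObjectSketch>& all_objects,
      int64_t last_sent_version) {
    std::vector<CatalogObjectSketch> delta;
    for (const auto& obj : all_objects) {
      if (obj.catalog_version > last_sent_version) delta.push_back(obj);
    }
    return delta;  // broadcast via the StateStore with the new max version
  }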

Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this, all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two separate sub-classes: an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.

What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
  contains the change. An impalad will wait for the statestore heartbeat that contains this
  version before returning from the DDL command.
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing

Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
  same JAR.

Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:11 -08:00
ishaan
2f7d24b35b Fix tpch-q18, to not use qualified table names. 2014-01-08 10:51:49 -08:00
ishaan
ece902d953 Fix tpch-q18 to insert into the database associated with its scale-factor. 2014-01-08 10:51:45 -08:00
Nong Li
58631d9ce0 Fix parquet insert .test files. 2014-01-08 10:49:46 -08:00
Skye Wanderman-Milne
a7e15b1417 Update Parquet scanner to only scan a file if assigned the first split.
Also re-enable Parquet tests.
2014-01-08 10:49:25 -08:00
Nong Li
329763e5ab Disable parquet tests. 2014-01-08 10:49:20 -08:00
Nong Li
20fc700002 Fix precision issue in text table writer. 2014-01-08 10:49:19 -08:00
Lenni Kuff
5f81becd84 Create tables used by insert tests in a supported insert format 2014-01-08 10:49:00 -08:00
Nong Li
0df9476be1 Parquet data loading. 2014-01-08 10:48:48 -08:00
Skye Wanderman-Milne
461a48df2b Refactor testing framework to generate Avro tables. 2014-01-08 10:48:45 -08:00
Nong Li
6e293090e6 Parquet writer.
Change-Id: I7117b545e3d3a7803a219234ad992040a6c7c4ec
2014-01-08 10:48:44 -08:00
Lenni Kuff
328ceed4e7 Add support for generating lzo compressed text files and running tests against lzo 2014-01-08 10:48:38 -08:00
ishaan
5138a720bb IMP-768: Enable the python test framework to check for insert results. 2014-01-08 10:48:22 -08:00
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Nong Li
a0229cd12e Update tpch schema to use bigint for keys. 2014-01-08 10:47:54 -08:00
Nong Li
02c329b97a Update RC files to use io mgr and remove scanner support for non-io mgr. 2014-01-08 10:47:11 -08:00
Nong Li
f46c654e01 Enable tpch-q21 and tpch-q22 in tests. 2014-01-08 10:47:03 -08:00
Lenni Kuff
837f35eab3 Updated results for more query tests to reflect proper ordering + improved result updating 2014-01-08 10:46:53 -08:00
Lenni Kuff
a035cf4e73 Update results of a few TPC-H queries to reflect proper ordering
Change-Id: I41156b506155c846220cfb097f5e8120503f8da8
2014-01-08 10:46:52 -08:00
Marcel Kornacker
f6af9316d9 Fix for IMP-137: incorrect predicate placement for outer joins
Fixing predicate assignment for outer joins:
- On clause predicates for outer joins are now assigned to the join node
- the exception is On clause predicates that can be evaluated directly
  by the outer-joined tables themselves; those are "pushed down"
- Where clause predicates for outer-joined tables are assigned to the join node
  that materializes the outer join
2014-01-08 10:46:50 -08:00
Lenni Kuff
ef48f65e76 Add test framework for running Impala query tests via Python
This is the first set of changes required to start getting our functional test
infrastructure moved from JUnit to Python. After investigating a number of
options, I decided to go with a python test executor named py.test
(http://pytest.org/). It is very flexible, open source (MIT licensed), and will
enable us to do some cool things like parallel test execution.

As part of this change, we now use our "test vectors" for query test execution.
This will be very nice because it means that if you load the "core" dataset you
know you will be able to run the "core" query tests (specified by --exploration_strategy
when running the tests).

You will see that now each combination of table format + query exec options is
treated like an individual test case. This will make it much easier to debug
exactly where something failed.

These new tests can be run using the script at tests/run-tests.sh
2014-01-08 10:46:50 -08:00
Lenni Kuff
1451650055 Bring online all TPCH planner tests (updated for new planner) and supported query tests 2014-01-08 10:46:21 -08:00
Lenni Kuff
9f91081183 Modify TPCH tests to always insert into text table so workload can run on all file formats 2014-01-08 10:46:21 -08:00
ishaan
42231b7d86 Annotate queries for better benchmark reporting. 2014-01-08 10:45:05 -08:00
Henry Robinson
e7348a209b IMP-232: Parallel INSERT OVERWRITE 2014-01-08 10:45:04 -08:00
Nong Li
4fd7bd9606 Updated tpch core workload to include seq/snappy and seq/gzip.
Change-Id: Ifb01ee95542fced2ae8cfa4928ffbc7e357df3a8
2014-01-08 10:44:34 -08:00
Alan Choi
cbadb4eac4 When a scan range begins at the starting point of a tuple, we would miss that
tuple. This patch fixes this problem.
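
One hedged way to express the boundary rule, assuming for clarity that
the scanner can see the byte just before its range (the actual fix may
differ in detail):

  #include <cstdint>

  // A scanner owns every tuple that *starts* inside its scan range. It
  // normally skips forward to the first delimiter to find that tuple -- but
  // if the range already begins at a tuple start, skipping would drop it.
  int64_t FirstTupleOffset(const char* data, int64_t range_start,
                           int64_t range_len, char delim) {
    if (range_start == 0 || data[range_start - 1] == delim) {
      return range_start;  // the range begins exactly at a tuple start
    }
    for (int64_t i = range_start; i < range_start + range_len; ++i) {
      if (data[i] == delim) return i + 1;  // first tuple starting in this range
    }
    return -1;  // no tuple starts in this range
  }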

review: 162
2014-01-08 10:44:24 -08:00
Lenni Kuff
04edc8f534 Update benchmark tests to run against generic workload, data loading with scale factor, +more
This change updates the run-benchmark script to enable it to target one or more
workloads. Now benchmarks can be run like:

./run-benchmark --workloads=hive-benchmark,tpch

We look up the workload in the workloads directory, then read the associated
query .test files and start executing them.

To ensure the queries are not duplicated between benchmark and query tests, I
moved all existing queries (under fe/src/test/resources/*) to the workloads
directory. You do NOT need to look through all the .test files, I've just moved
them. The one new file is 'hive-benchmark.test', which contains the hive
benchmark queries.

Also added support for generating schemas for different scale factors as well as
executing against these scale factors. For example, let's say we have a dataset
with a scale factor called "SF3". We would first generate the schema using:

./generate_schema_statements --workload=<workload> --scale_factor="SF3"
This will create tables with names that are unique across scale factors.

Run the generated .sql file to load the data. Alternatively, the data can be loaded
by running a new python script:
./bin/load-data.py -w <workload1>,<workload2> -e <exploration strategy> -s [scale factor]
For example: ./bin/load-data.py -w tpch -e core -s SF3

Then run against this:
./run-benchmark --workloads=<workload> --scale_factor=SF3

This changeset also includes a few other minor tweaks to some of the test
scripts.

Change-Id: Ife8a8d91567d75c9612be37bec96c1e7780f50d6
2014-01-08 10:44:22 -08:00
Michael Ubell
02d63d8dc3 Trevni file support 2014-01-08 10:44:19 -08:00
Lenni Kuff
bf27a31f98 Move functional data loading to new framework + initial changes for workload directory structure
This change moves (almost) all the functional data loading to the new data
loading framework. This removes the need for the create.sql, load.sql, and
load-raw-data.sql file. Instead we just have the single schema template file:
testdata/datasets/functional/functional_schema_template.sql

This template can be used to generate the schema for all file formats and
compression variations. It also should help make loading data easier. Now you
can run:

bin/load-impala-data.sh "query-test" "exhaustive"

And get all data needed for running the query tests.

This change also includes the initial changes for the new dataset/workload
directory structure. The new structure looks like:

testdata/workload  <- Will contain query files and test vectors/dimensions

testdata/datasets <- Will contain the data files and schema templates

Note: This is the first part of the change to this directory structure - it's
not yet complete.
2014-01-08 10:44:18 -08:00