There are cases of Parquet files where the metadata indicates a wrong number of rows.
Until now, the parquet-scanner did not report any problem in this case; instead, it
kept reading as long as there were values for the read columns. With IMPALA-1016,
however, we now read at most as many rows as the metadata specifies.
With this patch, the parquet-scanner checks, right before it finishes scanning, whether
it read the expected number of rows (taken from the metadata). If the actual number of
rows read is less than or greater than the expected number, it either aborts or logs
an error.
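
For illustration, here is a minimal Python sketch of the shape of that check; the
names (validate_row_count, abort_on_error) are hypothetical, and this is not the
scanner's actual C++ code:

    import logging

    def validate_row_count(rows_read, metadata_num_rows, abort_on_error):
        """Compare the rows actually read against the file metadata's row count."""
        if rows_read == metadata_num_rows:
            return  # Metadata is consistent with what was read.
        msg = ("Parquet metadata claims %d rows, but %d rows were read"
               % (metadata_num_rows, rows_read))
        if abort_on_error:
            raise RuntimeError(msg)  # Abort the scan.
        logging.error(msg)           # Otherwise log the mismatch and continue.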
Change-Id: Ie6a66a38e8912730bf04762e6526ec1cadb2bcdc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2755
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2944

This is the first set of changes required to start moving our functional test
infrastructure from JUnit to Python. After investigating a number of options, I
decided to go with a Python test executor named py.test (http://pytest.org/). It is
very flexible, open source (MIT licensed), and will enable us to do some cool things
like parallel test execution.
As part of this change, we now use our "test vectors" for query test execution.
This is very useful because it means that if you load the "core" dataset, you know
you will be able to run the "core" query tests (selected via --exploration_strategy
when running the tests).
You will see that each combination of table format + query exec options is now
treated as an individual test case. This will make it much easier to pinpoint
exactly where something failed.
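
As an illustration of that model, here is a hypothetical py.test sketch; the vector
values and the run_query helper are made up for this example and are not the real
test harness:

    import pytest

    # Hypothetical stand-in for the real query harness.
    def run_query(sql, table_format, exec_options):
        return "%s @ %s %s" % (sql, table_format, exec_options)

    # Each (table format, exec options) pair in the test vector becomes its
    # own test case, so a failure report names the exact combination.
    TEST_VECTORS = [
        ("text/none", {"batch_size": 0}),
        ("parquet/none", {"batch_size": 0}),
        ("text/gzip", {"batch_size": 1}),
    ]

    @pytest.mark.parametrize("table_format,exec_options", TEST_VECTORS)
    def test_query(table_format, exec_options):
        result = run_query("select count(*) from alltypes",
                           table_format, exec_options)
        assert result is not None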
These new tests can be run using the script at tests/run-tests.sh.