mirror of
https://github.com/apache/impala.git
synced 2026-01-04 00:00:56 -05:00
Main source for TPCDS query and result definitions: https://github.com/gregrahn/tpcds-kit. TPC-DS v2.5.0 qualification queries from G. Rahn, Cloudera, Inc. Data set constructed in mini-cluster using $IMPALA_HOME/buildall.sh -testdata.... This commit continues previous work on IMPALA-5376 in the ASF Impala repo and the Cloudera Gerrit service. This commit splits multi-query tests in the TPC-DS suite definition into one query and result set per test file, as the test framework requires. Names for such files have -1, -2... inner suffixes. The portion of the TPC-DS test suite in this commit passes. It contains no failures, as reflected by runs of $IMPALA_HOME/tests/run-tests.py query_test/test_tpcds_queries.py ... IMPALA-6007 addresses the TPC-DS cases that require skipping (because we don't support them or they flap) or expected-failure (xfail, because we support them but they fail due to bugs.) These require some added tooling for non-Pytest frameworks like the stress test to avoid attempting them until they work. Tests that flap are marked to skip, with a bug ID, since they don't reliably pass or xfail. Expected result sets come from the TPC-DS kit. Some TPC-DS test cases in this commit have been modified in sematically-neutral ways so as to pass on Impala. The tests/query_test/test_tpcds_queries.py driver file is authoritative for the active/skip/xfail status for each case and a brief reason. The following list describes the current status as: --- test-name deviance from TPC-DS spec changes made --- tpcds-q22a.test RESULT MISMATCH in LSD of AVG() values FIXED, HAND_ROUNDED AVG() VALUES IN RESULT SET --- tpcds-q26.test RESULT MISMATCH in LSD of AVG() values ABSENT, IMPALA-6087 --- tpcds-q28.test RESULT MISMATCH in LSD of AVG() values ABSENT, IMPALA-6087 --- tpcds-q30.test UNRECOGNIZED CHARACTER ABSENT, IMPALA-5961. --- tpcds-q31.test RESULT MISMATCH in LSD of DECIMAL values ABSENT, IMPALA-5956. --- tpcds-q35a.test RESULT MISMATCH ABSENT, IMPALA-5950. --- tpcds-q36a.test RESULT MISMATCH ABSENT, IMPALA-4741 --- tpcds-q47.test RESULT MISMATCH in LSD of DECIMAL values ABSENT, IMPALA-6087 --- tpcds-q48.test RESULT MISMATCH in scalar value ABSENT, IMPALA-5950. --- tpcds-q49.test RESULT MISMATCH in LSD of DECIMAL values ABSENT, IMPALA-5945 --- tpcds-q57.test RESULT MISMATCH, excess scale in DECIMAL values ABSENT, IMPALA-6087 --- tpcds-q58.test RESULT MISMATCH in DECIMAL values ABSENT, IMPALA-5946 --- tpcds-q59.test RESULT MISMATCH, excess scale in DECIMAL values ABSENT, IMPALA-6087 --- tpcds-q61.test RESULT MISMATCH in DECIMAL value FIXED. CAST RESULT QUOTIENT TO DECIMAL(15, 4), TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q63.test RESULT MISMATCH, excess scale in DECIMAL values ABSENT, IMPALA-6087 --- tpcds-q64.test RESULT MISMATCH ADDED ORDER BY COLUMNS. --- tpcds-q66.test RESULT MISMATCH ABSENT, IMPALA-4741 --- tpcds-q77a.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q78.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q83.test RESULT MISMATCH ABSENT, IMPALA-5945. --- tpcds-q85.test MISSING TABLE "reason" ABSENT, IMPALA-5960 --- tpcds-q86a.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q89.test RESULT MISMATCH, DECIMAL values flap ABSENT, ADDED ROUND(2) TO 8th COLUMN, TAKE ACTUAL RESULTS AS EXPECTED, IMPALA-5956. --- tpcds-q90.test RESULT MISMATCH ABSENT, IMPALA-5945. --- tpcds-q93.test MISSING TABLE "reason" ABSENT, IMPALA-5960 --- tpcds-q98.test RESULT MISMATCH FIXED, ADDED ROUND() TO LAST COLUMN Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Reviewed-on: http://gerrit.cloudera.org:8080/8102 Reviewed-by: Michael Brown <mikeb@cloudera.com> Tested-by: Impala Public Jenkins
This directory contains Impala test workloads. The directory layout for the workloads should follow: workloads/ <data set name>/<data set name>_dimensions.csv <- The test dimension file <data set name>/<data set name>_core.csv <- A test vector file <data set name>/<data set name>_pairwise.csv <data set name>/<data set name>_exhaustive.csv <data set name>/queries/<query test>.test <- The queries for this workload