impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Michael Smith	8935c75904	IMPALA-11859: Add bytes-read-encrypted metric Adds a metric bytes-read-encrypted to track encrypted reads. Testing: - ran test_io_metrics.py with Ozone (encrypts by default) - ran test_io_metrics.py with HDFS (no encryption) Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34 Reviewed-on: http://gerrit.cloudera.org:8080/19461 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-02-08 16:18:37 +00:00
Michael Smith	b858f2acde	IMPALA-11883: Calculate erasure-coded bytes read directly Calculate the metric erasure-coded-bytes-read directly from HDFS reads rather than through hdfsFileGetReadStatistics. This allows us to use it for other filesystem implementations (Ozone). Also renumbers is_erasure_coded in THdfsFileSplit to 8, where it was originally before it was removed by IMPALA-9485 (and never replaced). Testing: - ran updated test_io_metrics.py with Ozone, with and without EC - ran updated test_io_metrics.py with HDFS, with and without EC Change-Id: Ide0fc806590b2328df8068a9a54645d1d1fb137c Reviewed-on: http://gerrit.cloudera.org:8080/19460 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-02-08 16:18:37 +00:00
Michael Smith	bbb0b4939d	IMPALA-11476: Support Ozone erasure coding Adds support for identifying erasure coding policy with Ozone. Enables testing Ozone with erasure coding. Omits support for identifying erasure coding policy with the o3fs protocol as that protocol is effectively deprecated and its classes don't provide access to the ObjectStore. Refactors volumeBucketPair to use StringTokenizer. Test updates: - test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is larger than all tables in our test environment. Use tpch_parquet which we rely on having 3 files (by loading from snapshot in this case). - test_new_file_shorter: receives an EOFException when seeking with EC - test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697 - test_erasure_coding: Ozone doesn't report files as erasure-coded (HDDS-7603) Testing: - Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone and ERASURE_CODING=true. Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1 Reviewed-on: http://gerrit.cloudera.org:8080/19324 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-01-25 18:18:28 +00:00
Michael Smith	4e87a80fae	IMPALA-9488: Add metrics for EC reads Adds metric tracking erasure-coded bytes read. Adds ScanRange::TestInfo to pass file info through calls of AllocateScanRange so it's easier to add erasure coding as another piece of file info. Adds a test to verify that the expected number of bytes are read for existing read metrics and the new `erasure-coded-bytes-read` metric when doing a select. Change-Id: Ieb06bac9dea4b632621653d2935e9a7b2dc81341 Reviewed-on: http://gerrit.cloudera.org:8080/19178 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-11-09 15:38:33 +00:00

5 Commits
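
Note on IMPALA-11973: the commit message names two Python 2/3 differences, absolute imports and true division. The sketch below shows what the `__future__` imports change in practice. It is illustrative only and is not taken from the Impala source tree; the function names `mean_row_size` and `midpoint_index` and the module name `helpers` are hypothetical.

```python
# Illustrative sketch only; names here are hypothetical, not Impala code.
# With these imports, Python 2 follows the Python 3 rules for imports,
# the '/' operator, and print().
from __future__ import absolute_import, division, print_function

# absolute_import: inside a package, a plain "import helpers" now refers to a
# top-level module named helpers, not a sibling file. A sibling module has to
# be imported explicitly, e.g. "from . import helpers" (hypothetical name).


def mean_row_size(total_bytes, num_rows):
    """True division: '/' yields a float even when both operands are ints."""
    return total_bytes / num_rows


def midpoint_index(items):
    """Floor division: an index must be an integer, so use '//'."""
    return len(items) // 2


if __name__ == "__main__":
    print(mean_row_size(7, 2))                        # 3.5
    print(midpoint_index(["a", "b", "c", "d", "e"]))  # 2
```

Without the `division` import, Python 2 would evaluate `7 / 2` as `3`. This is why the commit reports reviewing each old-division site and switching to `//` where an integer result is needed (indices, record counts).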