Commit Graph

3 Commits

Author SHA1 Message Date
Attila Jeges
3338bae608 IMPALA-8043: Fix BE test failures related to SystemV timezones.
This is a fix for the following issue:

1. Some BE tests (e.g. ExprTest.TimestampFunctions) use the system's
   local timezone but run against a test timezone db (instead of the
   system's timezone db).
2. On some Linux installations /usr/share/zoneinfo contains symlinks
   to files in the /usr/share/zoneinfo/SystemV directory
   (e.g. /usr/share/zoneinfo/America/Los_Angeles is a symlink to
   ../SystemV/PST8PDT).
3. The 'SystemV' directory is not part of the test timezone db, since
   it is obsolete and excluded by default.

Consequently, if the system's local timezone is set to
America/Los_Angeles, BE tests won't find the corresponding timezone
file in the test timezone db. BE tests will default to UTC, which will
break some of them.

This change sets the local timezone explicitly for the failing BE
tests, so they don't depend on the system's local timezone.
It also adds the 'SystemV' directory to the test timezone db to avoid
similar issues in the future.
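
The failure mode described above can be modelled with a small,
self-contained sketch. This is a simplified illustration and not
Impala's lookup code: resolve_zone, build_db and demo are hypothetical
names, the fallback to UTC is modelled as a plain file-existence check,
and the zone file content is a placeholder.

```python
import os
import tempfile


def resolve_zone(db_root, name):
    """Return `name` if its file is reachable under db_root, else 'UTC'.

    os.path.exists follows symlinks, so a link whose target directory
    is absent from the db counts as a missing zone.
    """
    path = os.path.join(db_root, name)
    return name if os.path.exists(path) else "UTC"


def build_db(root, include_systemv):
    """Create a toy zone db where America/Los_Angeles links into SystemV."""
    os.makedirs(os.path.join(root, "America"))
    if include_systemv:
        os.makedirs(os.path.join(root, "SystemV"))
        with open(os.path.join(root, "SystemV", "PST8PDT"), "wb") as f:
            f.write(b"TZif")  # placeholder bytes, not a real zone file
    # The link target is relative to the America/ directory.
    os.symlink("../SystemV/PST8PDT",
               os.path.join(root, "America", "Los_Angeles"))


def demo():
    """Resolve the same zone against a db without and with SystemV."""
    with tempfile.TemporaryDirectory() as tmp:
        without = os.path.join(tmp, "without_systemv")
        with_sv = os.path.join(tmp, "with_systemv")
        build_db(without, include_systemv=False)
        build_db(with_sv, include_systemv=True)
        return (resolve_zone(without, "America/Los_Angeles"),
                resolve_zone(with_sv, "America/Los_Angeles"))
```

In this model the db without 'SystemV' falls back to UTC, while the db
that ships 'SystemV' resolves the zone by name.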

Change-Id: I9288cd24c8af0c059e55d47c86bd92eaf0075681
Reviewed-on: http://gerrit.cloudera.org:8080/12199
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-01-15 17:04:55 +00:00
Attila Jeges
17749dbcfc IMPALA-3307: Add support for IANA time-zone db
Impala currently uses two different libraries for timestamp
manipulations: boost and glibc.

Issues with boost:
- Time-zone database is currently hard coded in timezone_db.cc.
  Impala admins cannot update it without upgrading Impala.
- Time-zone database is flat, therefore can’t track year-to-year
  changes.
- Time-zone database is not updated on a regular basis.

Issues with glibc:
- Uses /usr/share/zoneinfo/ database which could be out of sync on
  some of the nodes in the Impala cluster.
- Uses the host system’s local time-zone. Different nodes in the
  Impala cluster might use a different local time-zone.
- Conversion functions take a global lock, which causes severe
  performance degradation.

In addition to the issues above, the fact that /usr/share/zoneinfo/
and the hard-coded boost time-zone database are both in use is a
source of inconsistency in itself.

This patch makes the following changes:
- Instead of boost and glibc, impalad uses Google's CCTZ to implement
  time-zone conversions.

- Introduces a new startup flag (--hdfs_zone_info_zip) to impalad to
  specify an HDFS/S3/ADLS path to a zip archive that contains the
  shared compiled IANA time-zone database. If the startup flag is set,
  impalad will use the specified time-zone database. Otherwise,
  impalad will use the default /usr/share/zoneinfo time-zone database.

- Introduces a new startup flag (--hdfs_zone_alias_conf) to impalad to
  specify an HDFS/S3/ADLS path to a shared config file that contains
  definitions for non-standard time-zone aliases.

- impalad reads the entire time-zone database into an in-memory
  map on startup for fast lookups.

- The name of the coordinator node’s local time-zone is saved to the
  query context when preparing query execution. This time-zone is used
  whenever the current time-zone is referred to afterwards in an
  execution node.

- Adds a new ZipUtil class to extract files from a zip archive. The
  implementation is not vulnerable to Zip Slip.
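
For readers unfamiliar with Zip Slip, the usual guard is to resolve each
entry's destination path and reject any entry that would land outside
the extraction directory. The sketch below illustrates that check in
Python. It is not the ZipUtil implementation from this patch;
safe_extract and make_zip are hypothetical helpers written for this
example.

```python
import io
import os
import tempfile
import zipfile


def make_zip(entries):
    """Build an in-memory zip archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buf.getvalue()


def safe_extract(zip_bytes, dest_dir):
    """Extract an archive, refusing entries that escape dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Resolve '..' components and absolute names before comparing.
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("entry escapes destination: " + info.filename)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                dst.write(src.read())
            extracted.append(info.filename)
    return extracted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        archive = make_zip({"America/Los_Angeles": b"TZif"})
        print(safe_extract(archive, out_dir))
```

An entry named "../evil.txt" makes safe_extract raise ValueError before
anything is written for that entry.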

Cherry-picks: not for 2.x.

Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Reviewed-on: http://gerrit.cloudera.org:8080/9986
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Attila Jeges <attilaj@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-06-22 13:18:58 +00:00
Juan Yu
934b28fe5e IMPALA-1381: Expand set of supported timezones.
The hardcoded timezone information is from Java version 1.7.0_76.

Change-Id: I32c40d0036473079e5bfd4d0252a648cbb0e7c23
Reviewed-on: http://gerrit.cloudera.org:8080/393
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
2015-05-22 01:32:54 +00:00