add the new query option UTF8_MODE topic
update impala_string topic as requested in the first review
create a new topic for UTF-8 mode under SQL ref
discuss the new query option
Change-Id: Ifac5812a3f5e105a73ac87c1ae5fce69a776fb92
Reviewed-on: http://gerrit.cloudera.org:8080/18424
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
The number of snapshots can grow large, because every INSERT can create a
new one. This patch adds inclusive predicates to narrow down the
result set of the DESCRIBE HISTORY statement:
- DESCRIBE HISTORY <table> FROM <ts>
- DESCRIBE HISTORY <table> BETWEEN <ts> AND <ts>
The timestamps can be date-time values or interval expressions, such as:
- '2022-02-04 13:31:09.819'
- 'now() - interval 2 days'
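The inclusive semantics of the two predicates can be sketched in Python as
follows. This is only an illustration, not Impala's implementation: the
snapshot list, the snapshot ids, and the helper names (history_from,
history_between, resolve_interval) are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical snapshot log: (creation time, snapshot id) pairs.
SNAPSHOTS = [
    (datetime(2022, 2, 1, 9, 0, 0), 101),
    (datetime(2022, 2, 3, 12, 0, 0), 102),
    (datetime(2022, 2, 4, 13, 31, 9, 819000), 103),
    (datetime(2022, 2, 6, 8, 15, 0), 104),
]


def history_from(snapshots, start):
    """Mimic FROM <ts>: keep snapshots created at or after start."""
    return [sid for ts, sid in snapshots if ts >= start]


def history_between(snapshots, start, end):
    """Mimic BETWEEN <ts> AND <ts>: both bounds are inclusive."""
    return [sid for ts, sid in snapshots if start <= ts <= end]


def resolve_interval(now, days):
    """Mimic an expression like now() - interval <days> days."""
    return now - timedelta(days=days)
```

A snapshot created exactly at a bound is kept, which is what "inclusive"
means here: history_from(SNAPSHOTS, datetime(2022, 2, 4, 13, 31, 9, 819000))
includes snapshot 103.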
Testing:
- Added e2e tests that verify the result.
- Added unit tests that check the analysis.
Change-Id: Ifead0d33f22069005bfd623460f4af1ff197cc0e
Reviewed-on: http://gerrit.cloudera.org:8080/18284
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Impala has supported reading ORC files by default for quite some time.
Removed the enable_orc_scanner flag and the related code and tests;
disabling ORC support is no longer possible.
Removed notes on how to disable ORC support from docs.
Change-Id: I7ff640afb98cbe3aa46bf03f9bff782574c998a5
Reviewed-on: http://gerrit.cloudera.org:8080/18188
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-8795 turns on event polling by default but the
documentation still says that it is a preview feature.
This change updates the documentation to say that the
feature is GA and enabled by default since Impala 4.1.
Change-Id: Ife34b92cc1fdf4839071a888e389db69c0b4924f
Reviewed-on: http://gerrit.cloudera.org:8080/18173
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Shajini Thayasingh <sthayasingh@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Currently, Impala doesn't support ShellBasedUnixGroupsMapping and
ShellBasedUnixGroupsNetgroupMapping for fetching Hadoop groups, because
they spawn a new process and run a shell command to fetch group info.
In Impala, this would happen for every session that is created
when user delegation is enabled via impala.doas.user and
authorized_proxy_group_config. This comes with several gotchas, such as
spawning many processes at once in a highly concurrent setting and
leaving zombie processes behind if impalad crashes abruptly.
However, not everyone in the ecosystem has moved away from shell-based
group mapping. For instance, many components in the Cloudera distribution
still rely on it. So we need a way to let users opt in to shell-based
mapping instead of disallowing it altogether.
This patch provides a flag that enables this support for users
who are aware of the gotchas it comes with.
Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470
Reviewed-on: http://gerrit.cloudera.org:8080/18019
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Made minor changes.
Incorporated feedback received by providing more examples.
Explained how to configure priorities for the scratch directories.
Provided an example displaying priority based configuration.
Change-Id: Iec170fdefcde09d4ee99d06b0876a17eb0bde2f6
Reviewed-on: http://gerrit.cloudera.org:8080/17700
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
statestore_subscriber_timeout_secs is also an important flag for
statestore scalability. In particular, when the heartbeat frequency and
timeout are bumped up, this flag should be bumped up as well.
To introduce this flag, we should also introduce the heartbeat TCP
timeout flag and the max missed heartbeat flag.
Tests:
- Built locally and verified that the HTML and PDF artifacts are as expected.
Change-Id: Ia4b331693c5c0945f4cec8fd81ed9ec688563333
Reviewed-on: http://gerrit.cloudera.org:8080/17675
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Shajini Thayasingh <sthayasingh@cloudera.com>
Reviewed-by: Tamas Mate <tmate@cloudera.com>
Introduced a new query option to skip deleting column statistics on a truncate operation.
Updated text to incorporate the comments received.
Change-Id: Ie753f84b233b06bf4554cab71263671aff36f570
Reviewed-on: http://gerrit.cloudera.org:8080/17533
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Added why cluster membership changes typically occur.
Explained how the coordinator will retry a failed query.
Talked about the new query option spool_all_results_for_retries.
Incorporated corrections from Patch set 2.
Change-Id: I3a65357a6e3d0bffa840b8636171a38bd9b22d17
Reviewed-on: http://gerrit.cloudera.org:8080/16819
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
TLS versions < 1.2 are now considered insecure. This patch improves
Impala's default security.
This is made possible now in part because Impala 4.0 dropped support
for Python versions < 2.7.9 (or 2.7.5 on certain distributions where
it has been patched), as lower Python versions do not support TLS 1.2.
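For illustration, a Python client can refuse anything older than TLS 1.2
with the standard ssl module. This is a minimal sketch, not code from this
patch: it assumes Python 3.7 or later (where ssl.TLSVersion exists), and the
helper name make_tls12_client_context is hypothetical.

```python
import ssl


def make_tls12_client_context():
    """Build a client-side context that rejects TLS versions below 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse to negotiate TLS 1.0 or TLS 1.1 with the server.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The returned context can be passed to any client library that accepts an
ssl.SSLContext.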
Testing:
- Existing SSL tests are updated to reflect the new default.
Change-Id: Ifed66646b041a061f9db92744710aef7453f39e4
Reviewed-on: http://gerrit.cloudera.org:8080/16988
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
incorporated comments, removed the para as per the feedback
listed all the overloads that are introduced
stated that Impala does not yet support new Hive UDFs
called out how mask functions were introduced through overloads
Change-Id: I37f0bcf4cf586cc5cfd03e4df68443967b6bb88f
Reviewed-on: http://gerrit.cloudera.org:8080/16861
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Adds a few basic docs for fault tolerance in Impala. Covers the
following topics:
* Transparent query retries
* Node blacklisting
* Statestore heartbeats
This commit only adds a high-level explanation of the aforementioned
fault tolerance concepts. The docs should be expanded on in a future
commit.
Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62
Reviewed-on: http://gerrit.cloudera.org:8080/16610
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
incorporated corrections based on the feedback
added three flags related to use_local_catalog
explained that it should not be changed
talked about the default values
Change-Id: I832f94e56ec51cee5304187991388c1994a1f55c
Reviewed-on: http://gerrit.cloudera.org:8080/16037
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
There were no bug fixes to the data cache between Impala 3.3
and 3.4 that I could find, so I just removed the warning -
it should be fine to use in Impala 3.3 and up.
Change-Id: I233c9bd0ad2bbc3dda1da03183d75f59ff31a737
Reviewed-on: http://gerrit.cloudera.org:8080/16016
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Documented startup option descriptions per review comments:
--To cover the spill-to-disk compression support
--To use the disk_spill_punch_holes as required
Included examples that need to be reviewed and minor edits.
Change-Id: I3694fe97d74697777a8d50288b406b8eca0aa9fb
Reviewed-on: http://gerrit.cloudera.org:8080/15692
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
- Replaced ellipses in example columns with sample output
- Fixed table formatting problems
- Exhumed varname styles
- Reverted table formatting at line 292 to published version formatting
- Fixed table formatting at 831
Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308
Reviewed-on: http://gerrit.cloudera.org:8080/15476
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
This change modifies the output of the SHOW TABLE STATS and SHOW
PARTITIONS statements for Kudu tables.
- PARTITIONS: the #Row column has been removed
- TABLE STATS: instead of showing partition information, it returns a
result set similar to HDFS table stats: #Rows, #Partitions, Size, Format
and Location
Example outputs can be seen in the doc changes.
Testing:
* kudu_stats.test is modified to verify the new result set
* kudu_partition_ddl.test is modified to verify the new partitions style
* Updated unit test with the new error message
Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041
Reviewed-on: http://gerrit.cloudera.org:8080/15199
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The following documents were impacted by the change:
- impala_live_progress.xml, revised to explain new behavior
- impala_shell_options.xml, added --disable_live_progress option
Change-Id: I94e624b7bb916ecb5aeb4f007c0610807f7b18cf
Reviewed-on: http://gerrit.cloudera.org:8080/15442
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Alice Fan <fan309@gmail.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Mentioned deflate support on the following lines of impala_txtfile.xml:
- modified text to include deflate info
- removed redundant paragraph
Mentioned deflate support in impala_file_formats.xml.
Change-Id: I9e1205e4e408f2c20fd8642cccd6c74e7ba9eb40
Reviewed-on: http://gerrit.cloudera.org:8080/15310
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Summary of changes:
- Changed title of "Kudu tables:" to "Internal Kudu tables:".
- Added syntax "External Kudu tables" to show alternative create table syntax.
- Described alternative syntax and differences between resulting tables.
- In Kudu considerations, added example of creating synchronized external Kudu table.
- Covered external tables vs internal tables and HMS translation.
Change-Id: Ic07380fd53898dd21fbb5dacb4d9f7a84f160d4e
Reviewed-on: http://gerrit.cloudera.org:8080/15149
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>