This commit also removes the now unused `DISTRIBUTE`, `SPLIT`, and
`BUCKETS` keywords that were going to be newly released in Impala 2.6,
but are now unused. Additionally, a few remaining uses of the
`DISTRIBUTE BY` syntax has been switched to `PARTITION BY`.
Change-Id: I32fdd5ef26c532f7a30220db52bdfbf228165922
Reviewed-on: http://gerrit.cloudera.org:8080/5382
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
This commit adds support for Kudu-specific column options in CREATE
TABLE statements. The syntax is:
CREATE TABLE tbl_name ([col_name type [PRIMARY KEY] [option [...]]] [, ....])
where option is:
| NULL
| NOT NULL
| ENCODING encoding_val
| COMPRESSION compression_algorithm
| DEFAULT expr
| BLOCK_SIZE num
The output of the SHOW CREATE TABLE statement was altered to include all the specified
column options for Kudu tables.
Change-Id: I727b9ae1b7b2387db752b58081398dd3f3449c02
Reviewed-on: http://gerrit.cloudera.org:8080/5026
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
Adds support in the shell to report the number of modified
rows for all DML operations, as well as the number of rows
with errors.
Testing: Added shell tests.
Change-Id: I3d3d7aa8d176e03ea58fb00f2a81fb3e34965aa1
Reviewed-on: http://gerrit.cloudera.org:8080/5103
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The plan-root fragment instance that runs on the coordinator should be
handled like all others: started via RPC and run asynchronously. Without
this, the fragment requires special-case code throughout the
coordinator, and does not show up in system metrics etc.
This patch adds a new sink type, PlanRootSink, to the root fragment
instance so that the coordinator can pull row batches that are pushed by
the root instance. The coordinator signals completion to the fragment
instance via closing the consumer side of the sink, whereupon the
instance is free to complete.
Since the root instance now runs asynchronously wrt to the coordinator,
we add several coordination methods to allow the coordinator to wait for
a point in the instance's execution to be hit - e.g. to wait until the
instance has been opened.
Done in this patch:
* Add PlanRootSink
* Add coordination to PFE to allow coordinator to observe lifecycle
* Make FragmentMgr a singleton
* Removed dead code from Coordinator::Wait() and elsewhere.
* Moved result output exprs out of QES and into PlanRootSink.
* Remove special-case limit-based teardown of coordinator fragment, and
supporting functions in PlanFragmentExecutor.
* Simplified lifecycle of PlanFragmentExecutor by separating Open() into
Open() and Exec(), the latter of which drives the sink by reading
rows from the plan tree.
* Add child profile to PlanFragmentExecutor to measure time spent in
each lifecycle phase.
* Removed dependency between InitExecProfiles() and starting root
fragment.
* Removed mostly dead-code handling of LIMIT 0 queries.
* Ensured that SET returns a result set in all cases.
* Fix test_get_log() HS2 test. Errors are only guaranteed to be visible
after fetch calls return EOS, but test was assuming this would happen
after first fetch.
Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Reviewed-on: http://gerrit.cloudera.org:8080/4402
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This change removes some of the occurrences of the strings 'CDH'/'cdh'
from the Impala repository. References to Cloudera-internal Jiras have
been replaced with upstream Jira issues on issues.cloudera.org.
For several categories of occurrences (e.g. pom.xml files,
DOWNLOAD_CDH_COMPONENTS) I also created a list of follow-up Jiras to
remove the occurrences left after this change.
Change-Id: Icb37e2ef0cd9fa0e581d359c5dd3db7812b7b2c8
Reviewed-on: http://gerrit.cloudera.org:8080/4187
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to dispay to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.
This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.
This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.
Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Reviewed-on: http://gerrit.cloudera.org:8080/3994
Tested-by: Internal Jenkins
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Sailesh Mukil <sailesh@cloudera.com>
Fix the error handling code and add a test.
Change-Id: Iebcf1dc8a1a08b400a2c769a9cff38ea02c8e525
Reviewed-on: http://gerrit.cloudera.org:8080/4022
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
on the website.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
The impala-shell could not accept wildcard or SAN certificates
previously as the thrift library it depended on did not support them.
This patch subclasses TSSLSocket and adds the logic to take care of
the above mentioned cases by introducing the new
TSSLSocketWithWildcardSAN class.
The certificate matching logic is based on the python-ssl source code.
Added custom cluster tests to test both wildcard matching and SAN
matching.
Added be/src/testutil/certificates-info.txt which contains all the
information about the certificates which are added for the tests.
This has been tested with Python2.4 and Python2.6.
Change-Id: I75e37012eeeb0bcf87a5edf875f0ff915daf8b89
Reviewed-on: http://gerrit.cloudera.org:8080/3765
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.
Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
To help supportability and debugging, it's helpful to have the impala
shell print out the coordinator time and the link to the coordinator
web UI once the query is submitted.
This is done by calling the PingImpalaService() routine everytime a
query is submitted, which returns the coordinator's hostname,
webserver port and the coordinator epoch time at that moment which the
shell then formats and prints out.
Added tests to verify these new messages.
Change-Id: I704eb64546e27c367830120241311fea6091266b
Reviewed-on: http://gerrit.cloudera.org:8080/3507
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
To cancel a query, the shell will create a separate connection inside
it's SIGINT handler, and send the cancellation RPC. However this
connection did not start a secure connection if it needed to, meaning
that the cancellation attempt would just hang.
A workaround is to kill the shell process, which I expect is what users
have been doing with this bug which has been around since 2014.
Testing:
I added a custom cluster test that starts Impala with SSL
enabled, and wrote two tests - one just to check SSL connectivity, and
the other to mimic the existing test_cancellation which sends SIGINT to
the shell process. In doing so I refactored the shell testing code a bit
so that all tests use a single ImpalaShell object, rather than rolling
their own Popen() based approaches when they needed to do something
unusual, like cancel a query.
In the cancellation test on my machine, SIGINT can take a few tries to
be effective. I'm not sure if this is a timing thing - perhaps the
Python interpreter doesn't correctly pass signals through to a handler
if it's in a blocking call, for example. The test reliably passes within
~5 tries on my machine, so the test tries 30 times, once per second.
Change-Id: If99085e75708d92a08dbecf0131a2234fedad33a
Reviewed-on: http://gerrit.cloudera.org:8080/3302
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
Before this change, a single test database was created for the entire suite,
and each test was marked to run serially. With the addition of a test fixture
in tests/conftest.py to create a unique database per each individual method,
it's possible now to run the tests in parallel. (The tables required by
individual tests are created via local test fixtures.)
As such, any methods which had been responsible for setting up the test
database were removed. Pytest markers for running tests serially were also
removed, except in cases where interactions from running concurrency would
affect other tests.
Additional minor changes were made to improve PEP-8 compliance.
The non-serial tests were run in a loop ten times to confirm that there weren't
any unexpected failures.
Review: https://gerrit.cloudera.org/#/c/3301/
Change-Id: Icdcb04a99c0907fc1ba56baa2497fafb33b0e34e
Reviewed-on: http://gerrit.cloudera.org:8080/3301
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Internal Jenkins
This patch trims trailing comments while parsing queries in
non-interactive mode. Users usually have comments in the end
of the script which should be ignored. Without this patch,
the script fails with an exception since it expects a valid
SQL. The behavior however remains the same with interactive
mode.
Change-Id: I723763ef7eedd03cf22058fadf06e9673a0d94d2
Reviewed-on: http://gerrit.cloudera.org:8080/3169
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
This patch allows you to write SOURCE <file> or SRC <file>, and have the
shell read the file and execute all the queries in it.
Change-Id: Ib05df3e755cd12e9e9562de6b353857940eace03
Reviewed-on: http://gerrit.cloudera.org:8080/2663
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
Previously Impala disallowed LOAD DATA and INSERT on S3. This patch
functionally enables LOAD DATA and INSERT on S3 without making major
changes for the sake of improving performance over S3. This patch also
enables both INSERT and LOAD DATA between file systems.
S3 does not support the rename operation, so the staged files in S3
are copied instead of renamed, which contributes to the slow
performance on S3.
The FinalizeSuccessfulInsert() function now does not make any
underlying assumptions of the filesystem it is on and works across
all supported filesystems. This is done by adding a full URI field to
the base directory for a partition in the TInsertPartitionStatus.
Also, the HdfsOp class now does not assume a single filesystem and
gets connections to the filesystems based on the URI of the file it
is operating on.
Added a python S3 client called 'boto3' to access S3 from the python
tests. A new class called S3Client is introduced which creates
wrappers around the boto3 functions and have the same function
signatures as PyWebHdfsClient by deriving from a base abstract class
BaseFileSystem so that they can be interchangeably through a
'generic_client'. test_load.py is refactored to use this generic
client. The ImpalaTestSuite setup creates a client according to the
TARGET_FILESYSTEM environment variable and assigns it to the
'generic_client'.
P.S: Currently, the test_load.py runs 4x slower on S3 than on
HDFS. Performance needs to be improved in future patches. INSERT
performance is slower than on HDFS too. This is mainly because of an
extra copy that happens between staging and the final location of a
file. However, larger INSERTs come closer to HDFS permformance than
smaller inserts.
ACLs are not taken care of for S3 in this patch. It is something
that still needs to be discussed before implementing.
Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Reviewed-on: http://gerrit.cloudera.org:8080/2574
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
This change whitelists the supported filesystems which can be set
as Default FS for Impala to run on.
This patch configures Impala to use S3 as the default filesystem, rather
than a secondary filesystem as before.
Change-Id: I2f45bef6c94ece634045acb906d12591587ccfed
Reviewed-on: http://gerrit.cloudera.org:8080/1121
Reviewed-by: anujphadke <aphadke@cloudera.com>
Tested-by: Internal Jenkins
The SET command has been extended with the following syntax, to allow
setting of variables in the Impala Shell:
SET VAR:<variable_name>=<value>
The UNSET command has also been modified to allow:
UNSET VAR:<variable_name>
This patch builds on the changes in IMPALA-2179. The main change for
this patch was to ensure that all SET commands are processed by the
shell, rather than being send to the front end as a query. For this
I had to modify the command sanitization function to remove comments
that happen in front of a SET command.
Comments can be a can of worms to parse, so I tried to be as strict
as possible to avoid collateral effects. Comments are only removed
if they happen right at the beginning of the line AND before a SET
command. NO other comments are touched, including comments before,
after or within queries.
Change-Id: I87e07385122187ab8d324346499896a3dfbbafe6
Reviewed-on: http://gerrit.cloudera.org:8080/679
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
This patch adds the command line option `--var` to allow the user to set
variable to be used in commands within the shell. It does *not* implement the
setting of variables through the SET command, as Hive does. This extension will
be implemented separately on IMPALA-2180.
The syntax for specifying a parameter in the command line is --var=KEY=VAL, as
for example: --var=start_date=20150101
Variables are textually replaced by their value in the Impala shell commands.
The substitution work similarly for interactive sessions as well as for command
line queries and/or scripts (-q and -f options, respectively).
Variables can be referenced as ${VAR:VAR_NAME} (case-insensitive). The form
${HIVEVAR:VAR_NAME} can also be used for compatibility with Hive scripts.
To prevent any of the reference expressions above from being replaced you can
escape them with a backslash (e.g. \${VAR:VAR_NAME} and \${HIVEVAR:VAR_NAME}).
The Impala shell's SET command now also reports the set variables and their
values.
Change-Id: Ia491fae91256334bb60c9066d119fe9a1e9779dd
Reviewed-on: http://gerrit.cloudera.org:8080/611
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
The bug was that the test sometimes polled the wrong impalad for checking
metrics because ImpaladCluster() does not guarantee any particular order of
impalads, but the test was incorrectly relying on that.
The fix is to execute the shell commands on a fixed impalad, and use the
same impalad for polling metrics.
Change-Id: Iecb528be916d2900d7ec8a894bdef630250547da
Reviewed-on: http://gerrit.cloudera.org:8080/1974
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Adds a new option --ldap_password_cmd that takes a string which is
executed as a shell command. The stdout results are used as the LDAP
password for this shell session.
Tests are added for the negative case (where the command fails for some
reason), but without tests for successful LDAP connections we can't test
the case where the password is correct.
Change-Id: Ib0362be5e167ff752e764ad2152c4c4b679f83c2
Reviewed-on: http://gerrit.cloudera.org:8080/1542
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
Remove a hack in impala-shell that suppressed any error messages that
included the string "Cancelled". This originally was necessary since
user-initiated cancellation messages were propagated back to the client,
but is no longer necessary. Certain errors that occur asynchronously
were suppressed, because the error is propagated by cancelling the
query.
Confirmed that no error messages are printed for manually cancelled
queries, and that error messages that were previously suppressed (from
my IMPALA-2298 patch) now show in the shell.
Change-Id: Iac53b1307768cbb07640ddc88b152ae71c71beab
Reviewed-on: http://gerrit.cloudera.org:8080/1529
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Allow Impala to start only with a running HMS (and no additional services like HDFS,
HBase, Hive, YARN) and use the local file system.
Skip all tests that need these services, use HDFS caching or assume that multiple impalads
are running.
To run Impala with the local filesystem, set TARGET_FILESYSTEM to 'local' and
WAREHOUSE_LOCATION_PREFIX to a location on the local filesystem where the current user has
permissions since this is the location where the test data will be extracted.
Test coverage (with core strategy) in comparison with HDFS and S3:
HDFS 1348 tests passed
S3 1157 tests passed
Local Filesystem 1161 tests passed
Change-Id: Ic9718c7e0307273382b1cc6baf203ff2fb2acd03
Reviewed-on: http://gerrit.cloudera.org:8080/1352
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Readability: Alex Behm <alex.behm@cloudera.com>
Impala shell cannot get child query handle so it cannot
query live progress for COMPUTE STATS query. Disable live
progress callback for compute stats query.
Change-Id: I2d2f342a805905a4fa868686e7c9e9362c2c2223
Reviewed-on: http://gerrit.cloudera.org:8080/1109
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
Python tests and infra scripts will now use "python" from the virtualenv
via $IMPALA_HOME/bin/impala-python. Some scripts could be simplified now
that python 2.6 and a dependable set of third-party libraries are
available but that is not done as part of this commit.
Change-Id: If1cf96898d6350e78ea107b9026b12ba63a4162f
Reviewed-on: http://gerrit.cloudera.org:8080/603
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
This patch changes the behaviour of the Impala shell to refuse to
attempt an LDAP-authenticated connection to Impala unless SSL/TLS is
configured.
A new flag --auth_creds_in_clear_ok is added to suppress this
behaviour. This is similar to Impala's --ldap_passwords_in_clear_ok
flag. The shell will also now print a warning if an insecure
configuration is used.
Change-Id: Ide25d8dd881a61b9f08900112466c430da64a038
Reviewed-on: http://gerrit.cloudera.org:8080/546
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
This patch adds a 'tip-of-the-day' message to the shell's intro header,
and a TIP command that prints out a random tip. This might be a good way
to make advanced or little-known functionality of both Impala and the
shell better known to users.
Change-Id: I987b386f8c96f8a75ccd5cd9197a8f2981c8bf43
Reviewed-on: http://gerrit.cloudera.org:8080/586
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
The tests were doing unnecessary things. One such thing that stopped
working with the virtualenv patch was searching for the shell process to
get the pid. The search was never needed since the process was spawned
with Popen which provides the pid directly.
Change-Id: I2455e58de4fdba8fd2770f0489fac8cddf6b90a0
Reviewed-on: http://gerrit.cloudera.org:8080/555
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
This patch adds a way to allow for dynamic progress reporting in the
shell. There are two new command line flags for the shell
--live_progress - will print the completed vs total # of scan ranges
--live_summary - prints an updated exec summary
In addition to the command line flags, these options can be set from
within the shell using:
set LIVE_SUMMARY=True
set LIVE_PROGRESS=True
The new options will be listed under shell options. Both reports will be
updated at most every second, for longer running queries it will be
adjusted to the time between two RPC calls to get the query status. To
provide this information in the ExecSummary, the Thrift structure for
the ExecSummary was extended to contain a progress indicator. The output
is printed to stderr and only available in interactive mode.
An example video is available here:
https://asciinema.org/a/5wi7ypckx4ol4ha1hlg3e3q1k
Change-Id: I70b2ab5fa74dc2ba5bc3b338ef13ddc6ccf367d2
Reviewed-on: http://gerrit.cloudera.org:8080/508
Tested-by: Internal Jenkins
Reviewed-by: Martin Grund <mgrund@cloudera.com>
The '-f' option in Impala shell is used to read from a file containing SQL queries.
Now, additional support is added to read from STDIN by using:
"-f -"
An additional bug was fixed in the test script test_shell_commandline.py where certain
tests(test_default_db and test_unsecure_message) would hang indefinitely due to the
subprocess(impala-shell) waiting for user input. Fixed by piping STDIN to the subprocess
which sends an implicit EOF that closes the impala-shell once the test is completed.
Change-Id: I9a2682e086a3345e089f3e9db7cc049ce3d2c19a
Reviewed-on: http://gerrit.cloudera.org:8080/479
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
Upgrading sqlparse ended up trading one bug for another. The new bug is
not fixed upstream, I sent a patch. The problem is '\\' is not
considered a terminated string and we use this in the phrase "fields
escaped by '\\'" when creating tables.
Change-Id: Id57081f5a96e997afd3aa9b26dca23f627488fc3
Reviewed-on: http://gerrit.cloudera.org:8080/117
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
The existing error message ("Error: could not match input") is not
helpful and too confusing. This accepts a trailing semicolon and
provides a better error message for other unacceptable chars.
Change-Id: I48ee90d109ce27c8a34e68f79d2e57e8426f1074
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5617
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: jenkins
This patch uses a regex rather than a raw string to search for unique query profiles in
the shell's output.
Change-Id: I5c0c2dc30ad6d9a14e5c35177f5a494445b7cc7c
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5381
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
This patch adds a line to the signal handler that closes
queries that have been cancelled. This patch closes the
cancelled query in the signal handler if it is not already
closed. This patch also improves the cancellation test so
it catches this problem in the future.
Change-Id: I1bb2a4a8fc3c3d40b8e4ba41f4b2bcf6d32bc297
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5303
Reviewed-by: Alex Leblang <alex.leblang@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5320
This patch enables the shell to pass query strings as-is to Impala's parser. In order to
preserve previous behaviour, we transform multi-line queries before writing them to
history. We replace EOL with an obscure ascii character (DLE), and re-apply the
transformation when reading it back from history.
Change-Id: I021b9c3d50b03df73bea1afd6ce3ec6b413484e0
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4664
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Conflicts:
shell/impala_shell.py
Earlier, if the user connected via the command line, the shell would only attempt to
create a new Impala Client instance if a connection did not exist. This resulted in the
shell not connecting to the user specified Impalad.
Change-Id: I74c291256d0c063f6324b01aa7336282e6969a4e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4392
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Follow up fix to IMPALA-1153. Ensure that the correct
CmdStatus is returned by the summary command. (ERROR for
invalid queries and SUCCESS even if summary is not available.)
Change-Id: Icf67164dc82f202ec15071541f6ed3b26e3ad7fb
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4089
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Restructured how the cmd control flow executes commands in postcmd,
removing the hack to make non-interactive mode work. Now there are
values to represent the different command execution statuses.
Change-Id: I149b65d8a64d63a978fed284f0ad0da95833149c
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3850
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
(cherry picked from commit b33dc4d10bcc3982dad43015343c91f1d277bb3f)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4096
Reviewed-by: Henry Robinson <henry@cloudera.com>
This is the first iteration of a kerberized development environment.
All the daemons start and use kerberos, with the sole exception of the
hive metastore. This is sufficient to test impala authentication.
When buildall.sh is run using '-kerberize', it will stop before
loading data or attempting to run tests.
Loading data into the cluster is known to not work at this time, the
root causes being that Beeline -> HiveServer2 -> MapReduce throws
errors, and Beeline -> HiveServer2 -> HBase has problems. These are
left for later work.
However, the impala daemons will happily authenticate using kerberos
both from clients (like the impala shell) and amongst each other.
This means that if you can get data into the mini-cluster, you could
query it.
Usage:
* Supply a '-kerberize' option to buildall.sh, or
* Supply a '-kerberize' option to create-test-configuration.sh, then
'run-all.sh -format', re-source impala-config.sh, and then start
impala daemons as usual. You must reformat the cluster because
kerberizing it will change all the ownership of all files in HDFS.
Notable changes:
* Added clean start/stop script for the llama-minikdc
* Creation of Kerberized HDFS - namenode and datanodes
* Kerberized HBase (and Zookeeper)
* Kerberized Hive (minus the MetaStore)
* Kerberized Impala
* Loading of data very nearly working
Still to go:
* Kerberize the MetaStore
* Get data loading working
* Run all tests
* The unknown unknowns
* Extensive testing
Change-Id: Iee3f56f6cc28303821fc6a3bf3ca7f5933632160
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4019
Reviewed-by: Michael Yoder <myoder@cloudera.com>
Tested-by: jenkins
This is a reorganization of the existing impala-shell.
The basic idea was to split up the shell into two components: one part
soley responsible for the CLI functionality, and another to represent
the impala client/connection that would interact with the Beeswax api and
execute queries, fetch results, etc.
One major change was to redo how the existing shell handled cancellation,
which was to create a thread for each rpc, so that Ctrl+C would not interrupt
the system calls and break the socket connection. In the new approach,
a new client instance is created to close the query and if the socket connection is
broken, the client reconnects. Cancellation currently works.
Change-Id: I0f371f68552c065b2317f967c6cf7483b44be3df
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3316
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4008
Commands with escaped single quotes would cause
the shell to enter an infinite loop while trying
to parse the command due to shlex not escaping single
quotes correctly. Once that change was implemented,
shlex would now ignore escaped single and double quotes
outside of closed quotes, so there needed to be a check for
that as well.
ALSO, implemented testing of commands in interactive mode.
Needed this to test these inputs, as command line input
cannot span multiple lines.
Change-Id: Id67368944eeb9a73061bc3e90bd6cda73c9d9f64
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3408
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3893
Before, when the show_profiles -p option was enabled, the runtime profile
would be printed before the query was closed, preventing the query summary
from being printed. Now the profile is printed after the query is closed.
Change-Id: Icf7b10f7612d8016736aac70aa7b77265d391a98
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3770
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3821
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Optionally loads options from a file in the user's home directory, called
'.impalarc', though the path to another file can be passed in as a command-line
option. The file must have a case-sensitive [impala] header. Specifying
the option in the command line overwrites the config file's value
for the option for that instance of the shell. If an option is not
specified in the config file, its default value is used.
Change-Id: I218da2c1e10308c5b8729883fa625f0c284397a7
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2956
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3629
There was an issue with the previous fix to IMPALA-1059
if the user tried to reconnect within the shell after
having passed in a database via the -d option. The
passed database would be doubly backticked. This makes
the backticking of the argument idempotent.
Change-Id: I6eaed997c2be73d8659a2a12046ce393b97ec82c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3467
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3502