48 Commits

Author SHA1 Message Date
Henry Robinson
dd4c1c32dc Add optional RM reservation limit to memtrackers
If RM and per-query memory limits were enabled at the same time, the
per-query limit would be ignored if RM wanted to expand the memory
allocation. This change adds an optional reservation limit to a
memtracker. The original limit goes back to being a hard limit -
i.e. any attempt to consume more than that amount results in
failure. The RM reservation limit is the RM-allocated memory limit. If
that is exceeded it triggers the ExpandRmReservation() method, which tries
to retrieve more memory as long as the hard limit is observed.

The net effect is that per-query memory limits have the intended,
hard-limit effect, while the RM limits coexist nicely and can expand
with more memory as required.
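
A minimal sketch of the two-limit behaviour described above, written in Python for brevity rather than in the backend's C++. The class name, field names, and the fixed `expand_step` are hypothetical and only illustrate the idea; they are not the actual memtracker API, and a real RM could also refuse an expansion request.

```python
class MemTracker:
    """Toy tracker with a hard limit and an optional, expandable RM reservation."""

    def __init__(self, hard_limit, rm_reserved=None, expand_step=0):
        self.hard_limit = hard_limit    # per-query limit; never exceeded
        self.rm_reserved = rm_reserved  # RM-allocated limit, or None if RM is off
        self.expand_step = expand_step  # hypothetical: bytes requested per expansion
        self.consumed = 0

    def expand_rm_reservation(self, needed):
        """Grow the reservation until it covers `needed`, capped at the hard limit."""
        if self.expand_step <= 0:
            return False
        while self.rm_reserved < needed:
            grown = min(self.rm_reserved + self.expand_step, self.hard_limit)
            if grown == self.rm_reserved:
                return False  # already at the hard limit; cannot grow further
            self.rm_reserved = grown
        return True

    def try_consume(self, num_bytes):
        needed = self.consumed + num_bytes
        if needed > self.hard_limit:
            return False  # exceeding the hard limit is always a failure
        if self.rm_reserved is not None and needed > self.rm_reserved:
            if not self.expand_rm_reservation(needed):
                return False
        self.consumed = needed
        return True
```

In this sketch a request that fits under the hard limit but not under the current reservation triggers an expansion, while a request above the hard limit fails without touching the reservation.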

At the same time, we change the precedence of various ways of suggesting
an initial reservation size so that the user can change the reservation
size via a query option (MEM_RESERVATION_SIZE).

Change-Id: I41bfa4eb1336810a8a5946f6be3472111a052144
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3134
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-07-01 18:08:47 -07:00
Nong Li
5d903efca3 ExecSummary
The runtime profile as we present it is not very useful and I think the structure of
it makes it hard to consume. This patch adds a new client facing schemed set of
counters that are collected from the runtime profiles. For example, with this structure
it would be easy to have the shell get the stats of a running query and print a useful
progress report or to check the most relevant metrics for diagnosing issues.

Here's an example of the output for one of the tpch queries:
Operator              #Hosts   Avg Time   Max Time    #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE        1   79.738us   79.738us        5           5         0        -1.00 B  UNPARTITIONED
05:TOP-N                   3   84.693us   88.810us        5           5  12.00 KB       120.00 B
04:AGGREGATE               3    5.263ms    6.432ms        5           5  44.00 KB       10.00 MB  MERGE FINALIZE
08:AGGREGATE               3   16.659ms   27.444ms   52.52K     600.12K   3.20 MB       15.11 MB  MERGE
07:EXCHANGE                3    2.644ms      5.1ms   52.52K     600.12K         0              0  HASH(o_orderpriority)
03:AGGREGATE               3  342.913ms  966.291ms   52.52K     600.12K  10.80 MB       15.11 MB
02:HASH JOIN               3    2s165ms    2s171ms  144.87K     600.12K  13.63 MB      941.01 KB  INNER JOIN, BROADCAST
|--06:EXCHANGE             3    8.296ms    8.692ms   57.22K      15.00K         0              0  BROADCAST
|  01:SCAN HDFS            2    1s412ms    1s978ms   57.22K      15.00K  24.21 MB      176.00 MB  tpch.orders o
00:SCAN HDFS               3    8s032ms    8s558ms    3.79M     600.12K  32.29 MB      264.00 MB  tpch.lineitem l

Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-06-11 03:10:11 -07:00
Srinath Shankar
5755b0bdee Order by without limit for Impala
Enable order-by without limit
Added BufferedBlockMgr to allocate buffers and spill to disk.
Added Sorter for the external sort implementation
Added new SortNode execution node that completely sorts its input
Changes to enable writing in IoMgr went in a separate patch.

Reviewed-on: http://gerrit.ent.cloudera.com:8080/1539
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins

Conflicts:

	testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test

Change-Id: I3ece32affe5b006f53bbdfcc03ded01471e818ac
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2900
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
2014-06-09 16:58:08 -07:00
Henry Robinson
99c37aac37 IMPALA-827: Add an option for directories created by INSERT to inherit
their parent's permissions

This patch adds --insert_inherit_permissions. If true, all
new partition directories created by INSERT will inherit their
permissions from their parent. When false, the directories are created
with the default permissions.
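
The effect of the flag can be illustrated on a local POSIX filesystem. This is only a sketch under that assumption: the function name is hypothetical, and the real change operates on HDFS directories rather than through Python's `os` module.

```python
import os
import stat


def make_partition_dir(path, inherit_permissions):
    """Create `path`; optionally copy the permission bits of its parent."""
    os.mkdir(path)  # created with the process default (mode 0o777 minus umask)
    if inherit_permissions:
        parent = os.path.dirname(os.path.abspath(path))
        parent_mode = stat.S_IMODE(os.stat(parent).st_mode)
        os.chmod(path, parent_mode)  # explicit chmod is not affected by umask
```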

Change-Id: Ib2b4c251e51ea5048387169678e8dde34ecfe5f6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1917
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-04-04 10:25:20 -07:00
Matthew Jacobs
989830186f Remove RM pool configuration and yarn_pool query option/profile property
Admission control adds support for configuring pools via a fair scheduler
allocation configuration, so the pool configuration mechanism is no longer
needed. This also renames the "yarn_pool" query option to the more general
"request_pool" as it can also be used to configure the admission controller
when RM/Yarn is not used. Similarly, the query profile shows the pool as
"Request Pool" rather than "Yarn Pool".

Change-Id: Id2cefb77ccec000e8df954532399d27eb18a2309
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1668
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 8d59416fb519ec357f23b5267949fd9682c9d62f)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1759
2014-03-06 14:46:09 -08:00
Nong Li
309ab4df0d Update backend to support hdfs caching.
Change-Id: I22761c8893c8fd222564d4e2a97bfba1284cd741
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1724
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-03-02 00:36:33 -08:00
Alex Behm
dc7b398bd3 Impala reserves resources from YARN via Llama.
Impala reserves resources from YARN via Llama and handles resource
preemptions by cancelling affected queries. Adds the Impala Resource
Broker for interacting with Llama. Refactors scheduler and coordinator
to move fragment-to-host assignment logic into scheduler. Local test
setup uses MiniLLama.

Change-Id: Ic7b0fe43de52d30f4207b4e65cce7e6a294e54e1
2014-01-15 15:12:04 -08:00
Lenni Kuff
9717b7af28 Rename SYNCED_DDL query option to SYNC_DDL
Change-Id: I0b5e08694a271c40ac55d8e695cf3a74a012ce06
Reviewed-on: http://gerrit.ent.cloudera.com:8080/972
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:11 -08:00
Lenni Kuff
39f77b8b8f Add support for cluster-synchronized catalog operations
This change adds support for cluster-synchronized catalog operations. This provides the
guarantee that after a catalog op completes, all other subscribers to the catalog topic have
also processed that update. This is useful when load balancing, because a common workflow
is to target a different impalad for each statement executed.
For example if each of the following were executed sequentially, but targeting
a different node:
1) CREATE TABLE Foo
2) INSERT INTO Foo
3) SELECT * FROM Foo
4) INSERT INTO Foo ....

Since both the INSERT and the CREATE update the catalog, it would not work as expected
without this patch. The user might either get a "table not found" error or would be
missing partition information from the INSERT.

The downside is that this approach to DDL takes a bit longer because we need to wait
until all subscribers have processed an update. If all nodes are healthy, this overhead
should not be significantly longer than the current DDL time. However, a single bad node
might slow down or completely block the completion of all DDL operations. By default
this feature is disabled, but it can be enabled using a new query option: SYNCED_DDL=1
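
The waiting behaviour can be sketched as a toy model in which each subscriber reports the last catalog version it has processed. All names here (`CatalogTopic`, `run_ddl`, the `deliver` callback) are hypothetical and do not correspond to the statestore or catalog APIs.

```python
class CatalogTopic:
    """Toy model: tracks the last catalog version each subscriber processed."""

    def __init__(self, subscribers):
        self.version = 0
        self.processed = {name: 0 for name in subscribers}

    def publish_update(self):
        self.version += 1
        return self.version

    def ack(self, subscriber, version):
        self.processed[subscriber] = max(self.processed[subscriber], version)

    def all_caught_up(self, version):
        return all(v >= version for v in self.processed.values())


def run_ddl(topic, sync_ddl, deliver):
    """Apply a catalog op; with sync_ddl, return only after every subscriber acks."""
    version = topic.publish_update()
    if sync_ddl:
        while not topic.all_caught_up(version):
            # Hypothetical stand-in for waiting on the next topic update round.
            deliver(topic, version)
    return version
```

The loop also shows the downside mentioned above: a subscriber that never acknowledges the update keeps `run_ddl` from returning.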

To test this, the base test suite was updated to support selecting a random impalad
to execute each query section in a query test file. This is currently only enabled
for the insert and DDL tests, but could be leveraged by more tests in the future.

TODO: Add additional failure tests around this functionality.
TODO: Add an explicit "sync" statement so users do not need to run all their DDL
in this mode (since it is slower).

Change-Id: I45e757a931bf2a4740cc0cdd1e76ce49a1e22b83
Reviewed-on: http://gerrit.ent.cloudera.com:8080/899
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:58 -08:00
Henry Robinson
89a0beb56a IMPALA-449: Better cleanup after an INSERT fails
This patch goes some way to improving recovery after an INSERT
fails. Inserts now write intermediate results to
<table_dir>/.impala_insert_staging. After execution completes, either
successfully or not, the query-specific directory under that directory
is deleted.
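
A rough local-filesystem sketch of the staging pattern follows. The function name, the `write_results` callback, and the use of `shutil` are assumptions for illustration; the actual change works against HDFS from the coordinator.

```python
import os
import shutil

STAGING_DIR = ".impala_insert_staging"


def run_insert(table_dir, query_id, write_results):
    """Write into a per-query staging dir, move files on success, always clean up."""
    query_dir = os.path.join(table_dir, STAGING_DIR, query_id)
    os.makedirs(query_dir)
    try:
        write_results(query_dir)  # may raise; nothing has reached table_dir yet
        for name in os.listdir(query_dir):
            shutil.move(os.path.join(query_dir, name),
                        os.path.join(table_dir, name))
    finally:
        # Runs on success and on failure: remove the query-specific directory.
        shutil.rmtree(query_dir, ignore_errors=True)
```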

This doesn't complete the job for better cleanup (although this goes as
far as IMPALA-449 suggests). Two things to do in the future:

* Have each backend delete its own staging files on error. The
  difficulty getting there now is that backends don't know if they are
  cancelled in error or because a LIMIT was reached.
* If the operation to move files to their final destinations should
  fail during FinalizeQuery(), the coordinator should perform
  compensation actions and delete the files that made it.

Note: We also considered a query-wide and impalad-wide option to change
the staging dir. There are advantages to this (all intermediate results
go to a known location which is easy to clean up on failure), but also
security and other operational concerns. Worth revisiting in the future.

Change-Id: Ia54cf36db6a382e359877f87d7d40aad7fdb77be
Reviewed-on: http://gerrit.ent.cloudera.com:8080/670
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:37 -08:00
Alex Behm
4bb8b38cde Added stats and cost estimates to explain output.
Change-Id: I1273745a439fd25cefa4e08ecc075c98cc8bfc45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/602
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-01-08 10:53:22 -08:00
Nong Li
2cf314262f IMPALA-582, IMPALA-494: Fix parquet writer to allow multiple files per partition.
Added parquet_file_size query option.

Change-Id: I860dc8ab858622976402233229c365112bf081bc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/477
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:49 -08:00
Lenni Kuff
d84a33efa7 Make ABORT_ON_ERROR=true the default query option value 2014-01-08 10:51:46 -08:00
Alan Choi
ecee109e68 IMPALA-387 Add refresh/invalidate SQL 2014-01-08 10:51:25 -08:00
Alan Choi
b71357fc28 IMPALA-387 Reuse Hdfs and Hive metastore metadata to perform a fast incremental refresh 2014-01-08 10:51:17 -08:00
Alan Choi
0d662a3c8c IMPALA-381 add HBase query options for setCaching and setCacheBlocks 2014-01-08 10:51:07 -08:00
Nong Li
7c6598066c Add testing for different compression codecs with parquet. 2014-01-08 10:51:04 -08:00
Lenni Kuff
ff2ae70b27 IMPALA-232: Clarify Impala shell's "version" cmd returns the shell version and also get server version 2014-01-08 10:50:06 -08:00
Lenni Kuff
9559f2a3d9 IMP-861: Enable refreshing a specific table name 2014-01-08 10:50:01 -08:00
Alan Choi
991db9001b IMPALA-113 Raise error when default order by limit is exceeded 2014-01-08 10:49:03 -08:00
Lenni Kuff
0bcb54fcf8 Add GetRuntimeProfile RPC and enable printing runtime profile from impala-shell 2014-01-08 10:48:44 -08:00
Marcel Kornacker
77f4fc8cf9 Adding memory limits
- new class MemLimit
- new query flag MEM_LIMIT
- implementation of impalad flag mem_limit

Still missing:
- parsing a mem limit spec that contains "M/G", as in: 1.25G
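
For illustration only, parsing such a spec might look like the sketch below. The function name is hypothetical, and treating "M" and "G" as powers of 1024 is an assumption; this commit does not state the intended semantics.

```python
def parse_mem_spec(spec):
    """Parse strings such as '1.25G', '512M', or '1024' into a byte count."""
    units = {"M": 1024 ** 2, "G": 1024 ** 3}  # assumption: binary multiples
    text = spec.strip().upper()
    if not text:
        raise ValueError("empty memory spec")
    multiplier = 1
    if text[-1] in units:
        multiplier = units[text[-1]]
        text = text[:-1]
    value = float(text)  # raises ValueError on malformed numbers
    if value < 0:
        raise ValueError("memory spec must be non-negative")
    return int(value * multiplier)
```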
2014-01-08 10:48:33 -08:00
Marcel Kornacker
63e3cd0279 Adding query option DEBUG_ACTION 2014-01-08 10:47:37 -08:00
Alan Choi
be98df19c8 HiveServer2
This patch implements the HiveServer2 API.

We have tested it with Lenni's patch against the tpch workload. It has also
been tested manually against Hive's beeline with queries and metadata operations.

All of the HiveServer2 code is implemented in impala-hs2-server.cc. Beeswax
code is refactored to impala-beeswax-server.cc.

HiveServer2 has a few more metadata operations. These operations go through
impala-hs2-server to ddl-executor and then to FE. The logic is implemented in
fe/src/main/java/com/cloudera/impala/service/MetadataOp.java.

Because of the Thrift union issue, I have to modify the generated C++ file.
Therefore, all the HiveServer2 Thrift-generated C++ code is checked into
be/src/service/hiveserver2/. Once the Thrift issue is resolved, I'll remove
these files.

Change-Id: I9a8fe5a09bf250ddc43584249bdc87b6da5a5881
2014-01-08 10:47:24 -08:00
Henry Robinson
7ba437a52e Code changes to build against thrift 0.9.0 in thirdparty/ 2014-01-08 10:47:22 -08:00
Alan Choi
ff704ce586 IMP-690: impala-shell calls PingImpalaService thrift API to verify
the connected server is an impalad.
2014-01-08 10:47:13 -08:00
Marcel Kornacker
bf56c21c1b IMP-618
Adding DEFAULT_ORDER_BY_LIMIT query option.
Also removing deprecated PARTITION_AGG query option.
2014-01-08 10:47:04 -08:00
Henry Robinson
2f339f2ed8 Add ASL license to all public files 2014-01-08 10:46:32 -08:00
Alan Choi
0ce8a044e3 Disable RC/Trevni (with option to allow it); remove file_buffer_size
IMP-336: remove file_buffer_size query option
Add "allow_unsupported_formats" query option to allow RC/Trevni in our tests; disabled by
default
2014-01-08 10:46:02 -08:00
Marcel Kornacker
5984c0be52 First cut of partitioned plan generation:
- created new class PlanFragment, which encapsulates everything having to do with a single
  plan fragment, including its partition, output exprs, destination node, etc.
- created new class DataPartition
- explicit classes for fragment and plan node ids, to avoid getting them mixed up, which is easy to do with ints
- Adding IdGenerator class.
- moved PlanNode.ExplainPlanLevel to Types.thrift, so it can also be used for
  PlanFragment.getExplainString()
- Changed planner interface to return scan ranges with a complete list of server locations,
  instead of making a server assignment.

Also included: cleaned up AggregateInfo:
- the 2nd phase of a DISTINCT aggregation is now captured separately from a merge aggregation.
- moved analysis functionality into AggregateInfo

Removing broken test cases from workload functional-planner (they're being handled correctly in functional-newplanner).
2014-01-08 10:45:56 -08:00
Henry Robinson
e7348a209b IMP-232: Parallel INSERT OVERWRITE 2014-01-08 10:45:04 -08:00
Marcel Kornacker
c18d0970d7 Changed RuntimeProfile::PrettyPrint() and Coordinator::BackendExecState::GetNodeThroughput()
not to hold locks while they make function calls.

Changed Frontend.assignIds() to use UUID.randomUUID() to generate the query id.
2014-01-08 10:44:46 -08:00
Nong Li
81bba16dac Parallel scanners. 2014-01-08 10:44:38 -08:00
Henry Robinson
e5893064b0 Fix build failure 2014-01-08 10:44:37 -08:00
Henry Robinson
fb681fba4e Simple Python shell for Impala 2014-01-08 10:44:37 -08:00
Alan Choi
f15ef994fb "mvn test" now uses impalad and beeswax api to submit query and fetch, including
insert query.

review issue: 260
2014-01-08 10:44:30 -08:00
Alan Choi
cbadb4eac4 When a scan range begins exactly at the starting point of a tuple, we miss that tuple. This patch fixes
this problem.
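
The boundary condition can be sketched for delimited text as follows. This is a simplified illustration, not the scanner code: the function name is hypothetical, and it assumes a tuple belongs to the range containing its first byte. A scanner that always skips ahead to the next delimiter when `start > 0` would drop a tuple that begins exactly at `start`.

```python
def tuples_in_range(data, start, length, delim=b"\n"):
    """Return the tuples whose first byte lies in [start, start + length)."""
    end = start + length
    pos = start
    # Skip a partial leading tuple only if the range starts mid-tuple, i.e. the
    # byte just before `start` is not a delimiter.
    if start > 0 and data[start - 1:start] != delim:
        nxt = data.find(delim, start)
        if nxt == -1:
            return []
        pos = nxt + 1
    out = []
    while pos < end and pos < len(data):
        nxt = data.find(delim, pos)
        if nxt == -1:
            nxt = len(data)
        out.append(data[pos:nxt])  # a tuple may extend past the end of the range
        pos = nxt + 1
    return out
```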

review: 162
2014-01-08 10:44:24 -08:00
Alan Choi
41200fc307 Impalad now accepts Query.Configuration as execution option
issue: 210
2014-01-08 10:44:22 -08:00
Marcel Kornacker
10bf3e91e3 Cancellation support:
- added DataStreamMgr::Cancel(), which is used to propagate cancellation from the
  coordinator to all (possibly blocked) ExchangeNodes
- all exec nodes now check for cancellation before they do anything that might block for a while
- fixed up logic related to async cancellation

Added support for async query execution via beeswax interface:
- implemented ImpalaServer::query()
- QueryExecState now tracks beeswax's idea of the query state
- ImpalaServer::get_state() now returns the actual state

Fixed handling of ExecNode::Close():
- needs to be called for entire plan tree, regardless of what fails (can't use
  RETURN_IF_ERROR() inside of it)
- needs to be called for every Open() call by coordinator/ImpalaServer
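
The Close() rule above can be sketched as follows, in Python rather than the backend's C++. The class and its fields are hypothetical; the point is only that an early return on the first error (which is what RETURN_IF_ERROR() would do) would leave the remaining nodes unclosed, so errors are collected instead.

```python
class ExecNode:
    """Toy plan node whose close() always visits the whole subtree."""

    def __init__(self, name, children=(), fail_on_close=False):
        self.name = name
        self.children = list(children)
        self.fail_on_close = fail_on_close  # hypothetical: simulate a failing Close()
        self.closed = False

    def close(self):
        """Close this subtree; return the first error message seen, or None."""
        first_error = None
        for child in self.children:
            err = child.close()  # keep going even if a child reports an error
            if first_error is None:
                first_error = err
        self.closed = True
        if self.fail_on_close and first_error is None:
            first_error = "close failed: " + self.name
        return first_error
```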
2014-01-08 10:44:18 -08:00
Marcel Kornacker
fb32d40b03 Switching to an asynchronous plan fragment exec interface; this entails:
- making the coordinator asynchronous
- renamed ImpalaBackendService to ImpalaInternalService;
- new class ImpalaServer implements ImpalaService and ImpalaInternalService
- renaming ImpalaInternalService fields to conform to c++ style
- merged impala-service.{cc,h} and backend-service.{cc,h} into impala-server.{cc,h}
- added TStatusCode field to Status.ErrorDetail
- removed ImpalaInternalService.CloseChannel

Also removed JdbcDriverTest.java
2014-01-08 10:44:15 -08:00
Alan Choi
f52286f72c This completes the Beeswax implementation for ODBC. All the ODBC tests
(CDH/hive-odbc-test) pass (except those with "create table" and "show table").

We should have nightly regression of the odbc test to run against impalad.

There're still a few issues:
1. running with num_node > 0 crashes the coordinator;
2. work around for a few ODBC jiras
3. no test for bool/timestamp because ODBC doesn't support them.

review: issue 110
2012-06-18 14:46:46 -07:00
Henry Robinson
eb2a09ed4a impalad can use external planservice, plus catalog refresh utility 2012-06-12 12:22:31 -07:00
Alan Choi
7af87c7dea Beeswax Service for Impala (partial implementation)
review id: 82
2012-06-06 10:08:06 -07:00
Henry Robinson
3ff3559805 Add support for per-partition file formats to front end and backend.
At the same time, this patch removes the partitionKeyRegex in favour
of explicitly sending a list of literal expressions for each file path
from the front end.
2012-06-05 12:00:09 -07:00
Marcel Kornacker
0227ea8868 Several changes related to impalad:
- breaks out ImpalaService implementation into impala-service.{cc,h} and
  completes the implementation (minus cancellation)
- reorg of testutil/QueryExecutor: now we have a QueryExecutorIf with two implementations,
  InProcessQueryExecutor (the existing one) and ImpaladQueryExecutor (which
  executes against a running impalad process)
2012-05-21 12:00:21 -07:00
Henry Robinson
2af14392a6 Serial INSERT support 2012-05-03 13:44:32 -07:00
Marcel Kornacker
6a57a1d879 Enabling multi-node distributed execution:
- adding flag --backends="host:port,host:port,..." , which TestEnv uses to create clients for ImpalaBackendServices
  running on those nodes; this is just a hack in order to be able to use runquery for multi-node execution
- impalad-main.cc: main() of impala daemon, which will export both ImpalaService and
  ImpalaBackendService (but at the moment only does the latter; everything related to ImpalaService is commented out)
- com.cloudera.impala.service.Frontend: API to the frontend functionality; invoked by impalad via jni; ignore for now
2012-02-10 10:53:40 -08:00
Nong Li
c84fec38d3 - Move thrift out of FE src and into impala/common
- Thrift files now build using cmake instead of mvn
- Added cmake build to impala/ which drives the build process
2011-12-30 19:35:20 -08:00