impala

mirror of https://github.com/apache/impala.git synced 2026-01-06 06:01:03 -05:00

Author	SHA1	Message	Date
Alex Behm	70d7ff07af	CDH-19856: Disable Hive's stats autogathering. Change-Id: I04e91f91d29b7863848a750e362c9d94469df7f2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3156 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3169	2014-06-19 16:48:34 -07:00
Alex Behm	ef6705d7e0	Rename MergeNode to UnionNode. Change-Id: I9e3675a103757db1345b04bd1d102d2719efddd0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3128 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3154 Tested-by: Alex Behm <alex.behm@cloudera.com>	2014-06-19 12:44:21 -07:00
Nong Li	e7f7eab1b5	Missing reanalyze() in select stmt after substitution. Change-Id: I71203ebb02cf64e5bf259d2f6c5faf951f87f0d2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3144 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-19 02:52:10 -07:00
ishaan	99602fb8c2	Force load data if the current HEAD has a schema change. This patch checks the test-warehouse's stored githash (if it exists) to determine if the current patch has changed the schema of a table. If a change is detected, we force load all the data. Change-Id: I314f9f3364d3e6b2d66de38a9e6d9f57c4e279a7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3049 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-06-19 02:25:50 -07:00
Alex Behm	677062be3d	Rework planning of unions s.t. a UnionStmt produces a single MergeNode. This patch changes the planning of a UnionStmt s.t. it always produces a single fragment with a MergeNode connecting all child fragments as its root. The data partition of the returned fragment and how the child fragments are merged depends on the data partitions of the child fragments: - All child fragments are unpartitioned or partitioned: The returned fragment has an UNPARTITIONED or RANDOM data partition, respectively. The MergeNode absorbs the plan trees of all child fragments. - Mixed partitioned/unpartitioned child fragments: The returned fragment is RANDOM partitioned. The plan trees of all partitioned child fragments are absorbed into the MergeNode. All unpartitioned child fragments are connected to the MergeNode via a RANDOM exchange, and remain unchanged otherwise. Also adds support for random partitioned data exchanges. Change-Id: I82b2d12c104d98c4e7133234653ee1b67658ef7a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2876 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3143	2014-06-19 00:56:58 -07:00
Alex Behm	4be9611474	Temporarily disable insert planner tests (CDH-19856). Change-Id: Ibcf914b87fb0ae958c5039a7cd2e8be72aa4295e Reviewed-on: http://gerrit.ent.cloudera.com:8080/3110 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com>	2014-06-17 23:34:07 -07:00
Alex Behm	eed829f778	Fix misleading test to unblock full data loading. Change-Id: I98c218188a0cf459cacb96363e7a65ebb4525f07 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3100 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com>	2014-06-17 17:45:04 -07:00
Srinath Shankar	895bdeddd8	Ignore order-by without limit in INSERT and CTAS Order-by without limit in the query statement corresponding an INSERT or CTAS must be ignored because i) There is no guarantee on row ordering when the target table is scanned again i.e. 'select * from table' may return rows in any order, regardless of how the rows were inserted, and ii) Ignoring (and not flagging an error) is consistent with the treatment of order-by w/o limit in nested queries, union operands etc. Currently, an order-by w/o limit in a QueryStmt is only evaluated if the analyzer is the root analyzer (has no ancestors). However, a new child analyzer is not created for the QueryStmt in an InsertStmt, so this technique fails for inserts. The correct thing to do is to use a child analyzer for that QueryStmt, but this has spill-over scoping effects for analysis of with clauses. This patch adds a flag, similar to the isExplain flag to the analyzer to identify insert statements. Change-Id: I9ded587cfea75eca0b7a43ee9b0df0a6c8ecb602 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3044 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3060	2014-06-14 18:36:43 -07:00
Alex Behm	c503e1aa20	Wait for the NN to exit safe mode before starting services that depend on it. Our testdata/run-all.sh can be brittle depending on the state of your Hdfs. In particular, Yarn depends on the NN not being in safe mode, but it may take some time for the NN to exit safe mode immediately after starting Hdfs. This patch makes the NN startup script complete only after the NN has exited safe mode. Change-Id: I8b30cd07128dc48d79d91726eafed4174fb91a6d Reviewed-on: http://gerrit.ent.cloudera.com:8080/3005 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3021	2014-06-13 01:36:34 -07:00
anusha	ffc334a735	IMPALA-834: Fix for Create Table like Views Change-Id: Ied1f706c48a1106e1d6fc2aa73e57746f52ea333 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2939 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3014 Reviewed-by: Anusha Dasarakothapalli <anusha.dasarakothapalli@cloudera.com>	2014-06-12 22:13:30 -07:00
Skye Wanderman-Milne	c1c097b1b8	IMPALA-1030: HdfsTableSink was evaluating exprs in Prepare() Exprs need to be prepared and opened before calling GetValue(). Change-Id: I51d111b79c3453c9ab7acad14b93566f03decbcc Reviewed-on: http://gerrit.ent.cloudera.com:8080/2959 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit fa602080d7fb1aad90ea5f9446d82ff953169974) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2994	2014-06-12 02:23:20 -07:00
Skye Wanderman-Milne	1cc628d32d	IMPALA-950: Skip computing stats for decimal columns. This patch also adds a mechanism to return analysis warnings to client, which is used to log skipped decimal columns. Change-Id: I30c246044a68ec8861cd5bed072bd54e65a079e6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2822 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit fc77422acef7e6f93fdeb5448309414b905f0725) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2984	2014-06-11 19:16:34 -07:00
Nong Li	5d903efca3	ExecSummary The runtime profile as we present it is not very useful and I think the structure of it makes it hard to consume. This patch adds a new client facing schemed set of counters that are collected from the runtime profiles. For example, with this structure it would be easy to have the shell get the stats of a running query and print a useful progress report or to check the most relevant metrics for diagnosing issues. Here's an example of the output for one of the tpch queries: Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail ------------------------------------------------------------------------------------------------------------------------ 09:MERGING-EXCHANGE 1 79.738us 79.738us 5 5 0 -1.00 B UNPARTITIONED 05:TOP-N 3 84.693us 88.810us 5 5 12.00 KB 120.00 B 04:AGGREGATE 3 5.263ms 6.432ms 5 5 44.00 KB 10.00 MB MERGE FINALIZE 08:AGGREGATE 3 16.659ms 27.444ms 52.52K 600.12K 3.20 MB 15.11 MB MERGE 07:EXCHANGE 3 2.644ms 5.1ms 52.52K 600.12K 0 0 HASH(o_orderpriority) 03:AGGREGATE 3 342.913ms 966.291ms 52.52K 600.12K 10.80 MB 15.11 MB 02:HASH JOIN 3 2s165ms 2s171ms 144.87K 600.12K 13.63 MB 941.01 KB INNER JOIN, BROADCAST \|--06:EXCHANGE 3 8.296ms 8.692ms 57.22K 15.00K 0 0 BROADCAST \| 01:SCAN HDFS 2 1s412ms 1s978ms 57.22K 15.00K 24.21 MB 176.00 MB tpch.orders o 00:SCAN HDFS 3 8s032ms 8s558ms 3.79M 600.12K 32.29 MB 264.00 MB tpch.lineitem l Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-11 03:10:11 -07:00
Skye Wanderman-Milne	6ac9a8104b	IMPALA-1009: UDF/UDA leaks should not fail queries With this change, leaky UDFs built with the SDK will still fail when using the test harness, but leaky UDFs running in Impala will only trigger a warning. This change also updates the test infrastructure to always check for non-fatal errors/warnings. Change-Id: I5615349b9d691e4eddea3e03e152ef12e73835e7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2844 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 60ce5190d96add6104aba642d2354d87a26000fa) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2938	2014-06-10 21:46:47 -07:00
Nong Li	5e49150a22	Speed up views compat test. - Use a smaller table so hive runs faster - Don't invalidate the catalog, just the view created in hive - This lets us run it in parallel Change-Id: I8085d8967dc96cbbb20e2d719072b29fe591cd98 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2958 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-10 20:53:23 -07:00
Matthew Jacobs	f5da019555	IMPALA-1025: Use converse of data source predicate operators if expr has val before slot Change-Id: I31790c037e2fa9af7b80c01014f7507ba5053e63 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2925 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins	2014-06-09 23:54:09 -07:00
Lenni Kuff	b212634f95	Fix expected types for exhaustive compute stats results Change-Id: Idaffa50b5d023bb912eb7e3717133fe0a9bbc825 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2901 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins (cherry picked from commit c4228b36c883ac759e4e3b0a8fd2e76c17e70929) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2930 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-06-09 22:46:11 -07:00
Victor Bittorf	09aff77a6c	IMPALA-943: removed database udf_test from front-end tests Added CATCH section to test files. Change-Id: I28ba3a6e5ae4c53df5b86505573793d7b150863b Reviewed-on: http://gerrit.ent.cloudera.com:8080/2782 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins (cherry picked from commit 5b616715958f3ebfdc45b8dc0e4baa82bd55f1d2) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2912	2014-06-09 19:06:15 -07:00
Matthew Jacobs	89ec6b3d7a	IMPALA-1033: Remove SOURCE keyword; very common identifier The SOURCE keyword was introduced for DATA SOURCE ddl commands, but it is also a very common identifier. This removes the SOURCE and SOURCES keywords and instead uses DATASOURCE and DATASOURCES. Change-Id: Ic6c2897d1e23efa169aa8787752fe4aa2bb125d5 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2895 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 267c13f9b46d249bfd1b8711fd3fadf6853dc1ef)	2014-06-09 17:17:14 -07:00
Srinath Shankar	5755b0bdee	Order by without limit for Impala. Enable order-by without limit. Added BufferedBlockMgr to allocate buffers and spill to disk. Added Sorter for the external sort implementation. Added new SortNode execution node that completely sorts its input. Changes to enable writing in IoMgr went in a separate patch. Reviewed-on: http://gerrit.ent.cloudera.com:8080/1539 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins Conflicts: testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test Change-Id: I3ece32affe5b006f53bbdfcc03ded01471e818ac Reviewed-on: http://gerrit.ent.cloudera.com:8080/2900 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins	2014-06-09 16:58:08 -07:00
ishaan	db97981ab9	[CDH5] Switch the tpcds schemas to use decimal instead of float/double. This patch converts the tpcds schemas to use decimal instead of float/double. Currently, Impala can only r/w decimal in text, therefore, the tables are constrained to text. The schemas were obtained from the official tpc spec: http://www.tpc.org/tpcds/spec/tpcds_1.1.0.pdf Change-Id: I1ef0113dcb48bad52af75ee93b47b08adf9e1a69 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2403 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-06-08 11:47:23 -07:00
Nong Li	895d69c09f	IMPALA-1026: Fix decimal partition cols. Change-Id: I956b69a86528f1969febf356181dc3182f309909 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2841 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-06-06 09:26:56 -07:00
Matthew Jacobs	2f9b2ae785	Fix SHOW DATA SOURCE test; must execute setup/cleanup serially The SHOW DATA SOURCE tests were run as part of the other SHOW * tests in test_show(), but the setup/cleanup for data sources can't be run in parallel. This change moves the SHOW DATA SOURCE tests into a separate test method and the setup/cleanup code is only run for this test (i.e. not using setup_method() and teardown_method()). The test is then only executed serially. Change-Id: I221145f49cfe7290e132c6a87a5295b747c1fcc7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2864 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 5bcd769eae3a694d7f6f42d093f9197e8a4e8b77) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2870	2014-06-05 20:07:57 -07:00
Nong Li	b5c5c05bcb	Fix bad test. Needs to be overwrite to allow loading from snapshot. Change-Id: I7abe2a105d72662c874debfb2b9ae98647b03a1e Reviewed-on: http://gerrit.ent.cloudera.com:8080/2853 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-06-05 08:36:46 -07:00
Dimitris Tsirogiannis	0348a36b49	IMPALA-887: Improve partition pruning time (final) This commit contains the final set of changes for improving the performance of partition pruning. For each HdfsTable, we materialize a set of partition value metadata that allows the efficient evaluation of simple predicates on partition attributes without invoking the BE. These changes result in three orders of magnitude performance improvement during partition pruning. Change-Id: I5b405f0f45a470f2ba7b2191e0d46632c354d5ae Reviewed-on: http://gerrit.ent.cloudera.com:8080/2700 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2823	2014-06-03 23:17:44 -07:00
Nong Li	e6b7565eff	Fix decimal literal casting and cast expr reanalyze(). BigDecimal doesn't think about scale the way we need it to. Change-Id: I09612c31e30e80ce4806080f1d24c6615090785e Reviewed-on: http://gerrit.ent.cloudera.com:8080/2794 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-02 23:34:20 -07:00
Ippokratis Pandis	e34ede292c	IMPALA-1016: Return correct number of NULL values when projecting newly added column This patch handles the case where when a query was projecting a newly added column, the parquet scanner was returning infinite values. Change-Id: Ie5f4d4a88d5868e8d9e5c39fa9440821776dde3c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2725 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2761 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>	2014-06-01 01:28:25 -07:00
Nong Li	8f4dc0f2f0	IMPALA-974: Switch from FloatLiteral to DecimalLiteral. Float/Doubles are lossy so using those as the default literal type is problematic. Change-Id: I5a619dd931d576e2e6cd7774139e9bafb9452db9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2758 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-31 22:19:06 -07:00
Nong Li	5d80942d42	[CDH5] IMPALA-1019: Fix cancellation path in io mgr for cached reads. Change-Id: I11efd65d1efa900f79afe88b781262a44ac5006a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2703 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-30 19:14:39 -07:00
Nong Li	6e691f9500	IMPALA-1010: Remove Close() of build side in blocking join node. This optimization is generally not safe since the probe side is still streaming. The join node could acquire all of the data from the child into its own pool but then there's no real point in doing this (doesn't lead to lower memory footprint and just makes the mem accounting harder to reason about). This is exposed in busy plans. Change-Id: I37b0f6507dc67c79e5ebe8b9242ec86f28ddad41 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2747 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-30 11:50:50 -07:00
Skye Wanderman-Milne	c8b2017093	Add decimal UDF/UDA support. Change-Id: Ie48c1cb8e978c7282593b7f602dd68added6d3fd Reviewed-on: http://gerrit.ent.cloudera.com:8080/2625 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 5048f04b332c13b1bff32fb257272b0fea4b8584) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2739	2014-05-29 20:49:53 -07:00
Matthew Jacobs	12b72c4330	IMPALA-1011: Handle SHOW DATA SOURCES when no sources configured Change-Id: I367b90c7603aea973d442f9186a6b32598a66a28 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2716 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins (cherry picked from commit 4df5c6d741237e9c91e84e39fd6ea760ccb40cf5) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2723 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-05-28 20:38:41 -07:00
Lenni Kuff	745c091fcc	[CDH5] Update SHOW TABLE STATS to include per-partition HDFS caching stats Change-Id: I71b01f84bbd308108d775e78c644e867b48e05be Reviewed-on: http://gerrit.ent.cloudera.com:8080/2621 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-05-28 08:54:54 -07:00
Lenni Kuff	c45e9a70d9	[CDH5] Add DDL support for HDFS caching This change adds DDL support for HDFS caching. The DDL allows the user to indicate a table or partition should be cached and which pool to cache the data into: * Create a cached table: CREATE TABLE ... CACHED IN 'poolName' * Cache a table/partition: ALTER TABLE ... [partitionSpec] SET CACHED IN 'poolName' * Uncache a table/partition: ALTER TABLE ... [partitionSpec] SET UNCACHED When a table/partition is marked as cached, a new HDFS caching request is submitted to cache the location (HDFS path) of the table/partition and the ID of that request is stored within the table metadata (in the table properties). This is stored as: 'cache_directive_id'='<requestId>'. The cache requests and IDs are managed by HDFS and persisted across HDFS restarts. When a cached table or partition is dropped it is important to uncache the cached data (drop the associated cache request). For partitioned tables, this means dropping all cache requests from all cached partitions in the table. Likewise, if a partitioned table is created as cached, new partitions should be marked as cached by default. It is desirable to know which cache pools exist early on (in analysis) so the query will fail without hitting HDFS/CatalogServer if a non-existent pool is specified. To support this, a new cache pool catalog object type was introduced. The catalog server caches the known pools (periodically refreshing the cache) and sends the known pools out in catalog updates. This allows impalads to perform analysis checks on cache pool existence without going to HDFS. It would be easy to use this to add basic cache pool management in the future (ADD/DROP/SHOW CACHE POOL). Waiting for the table/partition to become cached may take a long time. Instead of blocking the user from accessing the table during this period we will wait for the cache requests to complete in the background and once they have finished the table metadata will be automatically refreshed. Change-Id: I1de9c6e25b2a3bdc09edebda5510206eda3dd89b Reviewed-on: http://gerrit.ent.cloudera.com:8080/2310 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-05-27 16:47:15 -07:00
ishaan	10952da6e0	Change the slf4j version to harmonize with the rest of CDH. All other CDH components use slf4j version 1.7.5; Impala's use of an earlier version causes a lot of benign warnings. This patch changes Impala's version to be the same as the rest of the stack. Change-Id: I297903d146c6b7642de5b6fa4eefa28a6a08fafe Reviewed-on: http://gerrit.ent.cloudera.com:8080/2541 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-05-27 13:46:17 -07:00
Dimitris Tsirogiannis	ca86e470de	IMPALA-887: Improve partition pruning time This commit is the first step in improving the performance of partition pruning. Currently, Impala can prune approximately 10K partitions per sec, thereby introducing significant overhead for huge table with a large number of partitions. With this commit we reduce that overhead by 3X by batching the partition pruning calls to the backend. Change-Id: I3303bfc7fb6fe014790f58a5263adeea94d0fe7d Reviewed-on: http://gerrit.ent.cloudera.com:8080/2608 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2687	2014-05-26 13:10:12 -07:00
Victor Bittorf	c13a1d080e	IMPALA-938: Fix implicit casting in timestamp arithmetic exprs. Change-Id: I7e875ec2251e9782c98b60195ecbc92258b63b5c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2657 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins (cherry picked from commit 8822401dbb65d9b4d996d5bb78ac3aca1aa2dbac) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2671	2014-05-23 14:11:35 -07:00
Skye Wanderman-Milne	1dff1686aa	Add option to build UDF test libs in copy-udfs-udas.sh The option is off by default, but useful for running this script without building the world. Change-Id: I82d8251cf9bb2763ce69094da1995a4d6ceff167 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2647 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit a7f77643820dcbfbab231a9260c94450564bd2df) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2659 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-05-22 18:01:55 -07:00
Nong Li	5729024fe9	IMPALA-984: Fix missing reanalyze in InlineViewRef and NULL handling. Change-Id: Ia80035c5456630aeef7a24288a998fe08546a282 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2652 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-21 18:18:29 -07:00
Henry Robinson	e87c0eb22a	[CDH5] Detect pseudo-distributed Llama cluster Since we're no longer using the MiniLlama, we need to explicitly set whether or not the cluster is pseudo-distributed. Impala needs this information to correctly translate datanode addresses to a format that Llama understands. This change (adapted from one made by Casey) adds a method to the frontend (callable via JNI) to get a configuration value from the Hadoop configuration. We'll set that configuration value for local RM testing. Change-Id: Ifd51db98a993ac0270dac2b832babbc394483c1a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2549 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-05-20 21:24:33 -07:00
Alex Behm	1b9a8020bf	IMPALA-996: Exclude non-materialized slots from a tuple's avgSerializedSize. Change-Id: Ic7936c6b5c5e6d4c162d91105128cda2b1b7284c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2617 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2626	2014-05-20 16:21:59 -07:00
Alex Behm	b252921363	IMPALA-994: Handle incorrect column metadata in views created by Hive. Change-Id: I3fba08d191c479f37371ce50fd07b8476a73eba2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2613 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2618 Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-05-19 20:17:23 -07:00
Matthew Jacobs	f9c9a7ca13	Add SHOW DATA SOURCES Change-Id: Ieeb0df107f45a58b8a99f717e96453da93ee7270 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2529 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit b2392c5bfe9fc928ad19af6ff6737e6dc6324e63) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2614	2014-05-19 17:52:27 -07:00
Matthew Jacobs	6ccd56bc1f	Enforce slot equivalences at data source scan nodes Change-Id: I2ed606ba398990ab05afa3301b6356c6a636e2bb Reviewed-on: http://gerrit.ent.cloudera.com:8080/2521 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 55061f6953956f45d433fe227ded539a648e3f9c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2536	2014-05-19 14:37:44 -07:00
Dimitris Tsirogiannis	a7a9cde86f	CDH-18969: Incorrect query result in Impala This commit fixes issue CDH-18969 where Impala returns wrong results when querying an HBase table. This issue is triggered when a column family sorts lexicographically before ":key", which is the column family of the row key, thereby causing the wrong column to be used as a row key by the backend. The following changes are included: 1. Modified the load function in HBaseTable.java to make sure the catalog object of an HBase table always stores the row key column first. Change-Id: Icd7ebc973d81672c04d5c7c8bbabd813338d5eac Reviewed-on: http://gerrit.ent.cloudera.com:8080/2513 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2602	2014-05-18 16:29:11 -07:00
Skye Wanderman-Milne	edbbe6035e	Decimal: read from Avro Allows reading decimal columns with or without codegen. Includes tests based on a data file posted on HIVE-5823. Change-Id: Ie541c6b98bd24543691850cb45a434af60b5a5a6 (cherry picked from commit 6983dcefdf70cce14724e17d03bc061ffb8f671c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2596 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-05-16 22:26:11 -07:00
Alex Behm	fcf4e43a3c	IMPALA-962: Fully qualify table and view names in toSql(). Change-Id: I6bf757c4ffbaf82c136af7b59d2d415234545a86 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2373 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2589	2014-05-16 01:26:38 -07:00
Lenni Kuff	61cbdd4f49	[CDH5] Add Sentry Service to local test environment Adds the ability to start/stop the Sentry Service to our local test environment and load the sentry-site.xml configs. Since the existing Sentry startup scripts don't work I wrote a simple wrapper to handle service startup. Change-Id: I1b77a2e50e51e6e6eae58cfed4d5d7c403dbc0b4 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2540 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-05-14 12:02:02 -07:00
Dimitris Tsirogiannis	2d7a8b7c70	IMPALA-964: Full outer join on values() followed by group by hits a preconditions check This commit fixes IMPALA-964 where full outer join between two inline views followed by a group by (e.g. select 1 FROM (VALUES(1 x, 1 y)) a FULL OUTER JOIN (VALUES(1 x, 1 y)) b ON (a.x = b.y) GROUP BY a.x;) hits a preconditions check. This check evaluates if the numNodes (number of nodes for the purpose of resource estimation) variable is greater or equal to zero and is triggered when we try to compute the resource estimates (number of distinct values) of a plan fragment. The following changes are included in this commit: 1. Modified the getNumDistinctValues function in PlanFragment class to consider the special case where the numNodes of a plan fragment is -1. 2. Added a test case in QueryTest/joins.test. Change-Id: I2962ed5079e174d0e76ad990ab84e1fb1a4607ef Reviewed-on: http://gerrit.ent.cloudera.com:8080/2466 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2514 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>	2014-05-11 19:30:38 -07:00
Victor Bittorf	0bb66ef327	Adding aliases ADD_MONTHS and SUB_MONTHS This is a request for consistency with oracle. Change-Id: I463a66694a068cd773532d8f6f853a4b089b918a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2400 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins (cherry picked from commit 1f0b643789596f96c54580b8c5262fada4dfc958) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2502	2014-05-09 17:35:29 -07:00
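
The union planning rules described in commit 677062be3d can be sketched roughly as follows. This is only an illustration of the decision stated in that commit message, not the planner's actual code: `ChildFragment` and `plan_union` are hypothetical names, and fragments are reduced to a name plus a partitioned/unpartitioned flag.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ChildFragment:
    """Hypothetical stand-in for a child plan fragment of a union."""
    name: str
    partitioned: bool


def plan_union(children: List[ChildFragment]) -> Tuple[str, List[str], List[str]]:
    """Return (result data partition, absorbed children, children behind a RANDOM exchange)."""
    if not children:
        raise ValueError("a union needs at least one child fragment")

    partitioned = [c.name for c in children if c.partitioned]
    unpartitioned = [c.name for c in children if not c.partitioned]

    if not partitioned:
        # All children unpartitioned: result is UNPARTITIONED, everything is absorbed.
        return "UNPARTITIONED", unpartitioned, []
    if not unpartitioned:
        # All children partitioned: result is RANDOM, everything is absorbed.
        return "RANDOM", partitioned, []
    # Mixed case: partitioned children are absorbed, unpartitioned children
    # are connected through a RANDOM exchange.
    return "RANDOM", partitioned, unpartitioned
```

For example, `plan_union([ChildFragment("a", True), ChildFragment("b", False)])` falls into the mixed case: `a` is absorbed and `b` sits behind a RANDOM exchange.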


580 Commits
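
Commit f5da019555 (IMPALA-1025) uses the converse of a comparison operator when the value appears before the slot, so that `5 < x` is treated as `x > 5`. A minimal sketch of that idea follows, assuming a hypothetical tuple representation of a predicate; the operator set and the `slot_first` helper are illustrative and are not taken from the Impala code base.

```python
# Hypothetical operator table: each comparison mapped to its converse.
CONVERSE = {
    "<": ">",
    "<=": ">=",
    ">": "<",
    ">=": "<=",
    "=": "=",
    "!=": "!=",
}


def slot_first(left, op, right, left_is_slot):
    """Return the predicate as (slot, op, value), flipping the operator if needed."""
    if left_is_slot:
        # Already in slot-op-value order; nothing to change.
        return (left, op, right)
    # The value came first: swap the operands and use the converse operator.
    return (right, CONVERSE[op], left)
```

Equality and inequality are their own converses, so only the ordering comparisons actually change when the operands are swapped.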