mirror of
https://github.com/apache/impala.git
synced 2025-12-19 09:58:28 -05:00
IMPALA-13257: [DOCS] Documentation for unnest() and querying arrays
Currently, the two topics, Querying Arrays and Zipping Unnest on Arrays from Views, were missing. The documentation has been added, and the parent topic has been updated with references to the child topics. Change-Id: I3ad29153bf6ed3939fb1d87d6220bd22f8f7fa1b Reviewed-on: http://gerrit.cloudera.org:8080/21651 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
This commit is contained in:
committed by
Quanlong Huang
parent
d8a8412c2b
commit
10a380bcbb
@@ -112,7 +112,10 @@ under the License.
|
||||
</topicref>
|
||||
<topicref href="topics/impala_tinyint.xml"/>
|
||||
<topicref href="topics/impala_varchar.xml"/>
|
||||
<topicref href="topics/impala_complex_types.xml"/>
|
||||
<topicref href="topics/impala_complex_types.xml">
|
||||
<topicref href="topics/impala_queryingarrays.xml"/>
|
||||
<topicref href="topics/impala_unnest_views.xml"/>
|
||||
</topicref>
|
||||
</topicref>
|
||||
<topicref href="topics/impala_literals.xml"/>
|
||||
<topicref href="topics/impala_operators.xml"/>
|
||||
|
||||
@@ -45,8 +45,6 @@ under the License.
|
||||
and higher. The Hive <codeph>UNION</codeph> type is not currently supported.
|
||||
</p>
|
||||
|
||||
<p outputclass="toc inpage"/>
|
||||
|
||||
<p>
|
||||
Once you understand the basics of complex types, refer to the individual type topics when you need to refresh your memory about syntax
|
||||
and examples:
|
||||
@@ -65,6 +63,9 @@ under the License.
|
||||
<xref href="impala_map.xml#map"/>
|
||||
</li>
|
||||
</ul>
|
||||
<p outputclass="toc inpage"/>
|
||||
<p>For information on querying arrays and improvements to zipping unnest functionality, refer to
|
||||
the following:</p>
|
||||
|
||||
</conbody>
|
||||
|
||||
|
||||
132
docs/topics/impala_queryingarrays.xml
Normal file
132
docs/topics/impala_queryingarrays.xml
Normal file
@@ -0,0 +1,132 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one
|
||||
or more contributor license agreements. See the NOTICE file
|
||||
distributed with this work for additional information
|
||||
regarding copyright ownership. The ASF licenses this file
|
||||
to you under the Apache License, Version 2.0 (the
|
||||
"License"); you may not use this file except in compliance
|
||||
with the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing,
|
||||
software distributed under the License is distributed on an
|
||||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
||||
<concept id="impala_queryingarrays" rev="4.1.0">
|
||||
<title>Querying arrays (<keyword keyref="impala41"/> or higher only)</title>
|
||||
<titlealts audience="PDF">
|
||||
<navtitle>Querying arrays</navtitle>
|
||||
</titlealts>
|
||||
<prolog>
|
||||
<metadata>
|
||||
<data name="Category" value="Impala"/>
|
||||
<data name="Category" value="Impala Data Types"/>
|
||||
<data name="Category" value="SQL"/>
|
||||
<data name="Category" value="Data Analysts"/>
|
||||
<data name="Category" value="Developers"/>
|
||||
<data name="Category" value="Schemas"/>
|
||||
</metadata>
|
||||
</prolog>
|
||||
<conbody>
|
||||
<p rev="4.1.0">
|
||||
<indexterm audience="hidden">Querying arrays</indexterm> Describes how to use UNNEST function
|
||||
to query arrays. ARRAY data types represent collections with arbitrary numbers of elements,
|
||||
where each element is the same type.</p>
|
||||
<section id="section_yl4_2qb_3cc">
|
||||
<title>Querying arrays using JOIN and UNNEST</title>
|
||||
<p>You can query arrays by making a join between the table and the array inside the table.
|
||||
This approach is improved with the introduction of the <codeph>UNNEST</codeph> function in
|
||||
the <codeph>SELECT</codeph> list or in the <codeph>FROM</codeph> clause in the
|
||||
<codeph>SELECT</codeph> statement. When you use <codeph>UNNEST</codeph>, you can provide
|
||||
more than one array in the <codeph>SELECT</codeph> statement. If you use JOINs for querying
|
||||
arrays it will yield a <term>joining unnest</term> however the latter will provide a
|
||||
<term>zipping unnest</term>.</p>
|
||||
</section>
|
||||
<section id="section_hmf_hqb_3cc">
|
||||
<title>Example of querying arrays using JOIN</title>
|
||||
<p>Use <codeph>JOIN</codeph> in cases where you must join unnest of multiple arrays. However
|
||||
if you must zip unnest then use the newly implemented <codeph>UNNEST</codeph> function.</p>
|
||||
<p>Here is an example of a <codeph>SELECT</codeph> statement that uses JOINs to query an
|
||||
array.</p>
|
||||
<codeblock id="codeblock_uqy_rqb_3cc">SELECT id, arr1.item, arr2.item FROM tbl_name tbl, tbl.arr1, tbl.arr2;
|
||||
|
||||
ID, ARR1.ITEM, ARR2.ITEM
|
||||
[1, 1, 10]
|
||||
[1, 1, 11]
|
||||
[1, 2, 10]
|
||||
[1, 2, 11]
|
||||
[1, 3, 10]
|
||||
[1, 3, 11]
|
||||
</codeblock>
|
||||
<note id="note_tpb_wln_htb">
|
||||
<p>The test data used in this example is ID: 1, arr1: {1, 2, 3}, arr2: {10, 11}</p>
|
||||
</note>
|
||||
</section>
|
||||
<section id="section_ipq_tqb_3cc">
|
||||
<title>Examples of querying arrays using UNNEST</title>
|
||||
<p>You can use one of the two different syntaxes shown here to unnest multiple arrays in one
|
||||
query. This results in the items of the arrays being zipped together instead of joining.</p>
|
||||
<ul id="ul_jpq_tqb_3cc">
|
||||
<li>ISO:SQL 2016 compliant syntax:
|
||||
<codeblock id="codeblock_kpq_tqb_3cc">SELECT a1.item, a2.item
|
||||
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);
|
||||
</codeblock></li>
|
||||
<li>Postgres compatible
|
||||
syntax:<codeblock id="codeblock_yl5_3mn_htb">SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;</codeblock></li>
|
||||
</ul>
|
||||
<p><b>Unnest operator in SELECT list</b></p>
|
||||
<codeblock id="codeblock_lpq_tqb_3cc">SELECT id, unnest(arr1), unnest(arr2) FROM tbl_name;</codeblock>
|
||||
<p><b>Unnest operator in FROM clause</b></p>
|
||||
<codeblock id="codeblock_mpq_tqb_3cc">SELECT id, arr1.item, arr2.item FROM tbl_name tbl_alias, UNNEST(tbl_alias.arr1, tbl_alias.arr2);</codeblock>
|
||||
<p>This new functionality would zip the arrays next to each other as shown here. </p>
|
||||
<codeblock id="codeblock_npq_tqb_3cc">ID, ARR1.ITEM, ARR2.ITEM
|
||||
[1, 1, 10]
|
||||
[1, 2, 11]
|
||||
[1, 3, NULL]
|
||||
</codeblock>
|
||||
<p>Note, that arr2 is shorter than arr1 so the "missing" items in its column will be filled
|
||||
with NULLs.</p>
|
||||
<note id="note_opq_tqb_3cc">The test data used in this example is ID: 1, arr1: {1, 2, 3},
|
||||
arr2: {10, 11}</note>
|
||||
</section>
|
||||
<section id="section_i1g_wqb_3cc">
|
||||
<title>Limitations in Using UNNEST</title>
|
||||
<p>
|
||||
<ul id="ul_j1g_wqb_3cc">
|
||||
<li>Only arrays from the same table can be zipping unnested</li>
|
||||
<li>The old (joining) and the new (zipping) unnests cannot be used together</li>
|
||||
<li>You can add a <codeph>WHERE</codeph> filter on an unnested item only if you add a
|
||||
wrapper <codeph>SELECT</codeph> and do the filtering
|
||||
<p>Example:</p><codeblock id="codeblock_k1g_wqb_3cc">SELECT id, arr1_unnest FROM (SELECT id, unnest(arr1) as arr1_unnest FROM tbl_name) WHERE arr1_unnest < 10;</codeblock></li>
|
||||
</ul>
|
||||
</p>
|
||||
</section>
|
||||
<section id="section_ewb_yqb_3cc">
|
||||
<title>Using ARRAY columns in the SELECT list</title>
|
||||
<!--Removing this since zipping unnest for arrays (IMPALA-10920) and allow array type in SELECT list (IMPALA-9498) are all added in the same upstream release (Impala 4.1)
|
||||
<p>Prior to this release to look into the content of an array you had to unnest the array
|
||||
either by the joining syntax or by using the zipping <codeph>UNNEST</codeph> operator as
|
||||
shown in the following example:</p>
|
||||
<codeblock id="codeblock_fwb_yqb_3cc">SELECT unnest(IDs), unnest(NAMES) FROM table_name;</codeblock>
|
||||
-->
|
||||
<p rev="4.1">Impala 4.1 adds support to return <codeph>ARRAYs</codeph> as
|
||||
<codeph>STRINGs</codeph> (<term>JSON arrays</term>) in the <codeph>SELECT</codeph> list,
|
||||
for example: </p>
|
||||
<codeblock id="codeblock_gwb_yqb_3cc">select id, int_array from functional_parquet.complextypestbl where id = 1;
|
||||
returns: 1, “[1,2,3]”
|
||||
</codeblock>
|
||||
<p>Returning <codeph>ARRAYs</codeph> from inline or Hive Metastore views is also supported.
|
||||
These arrays can be used both in the select list or as relative table references.</p>
|
||||
<codeblock id="codeblock_hwb_yqb_3cc">select id, int_array from (select id, int_array from complextypestbl) s;</codeblock>
|
||||
<p>Though <codeph>STRUCTs</codeph> are already supported, <codeph>ARRAYs</codeph> and
|
||||
<codeph>STRUCTs</codeph> nested within each other are not supported yet. Using them as
|
||||
non-relative table references is also not supported yet.</p>
|
||||
</section>
|
||||
</conbody>
|
||||
</concept>
|
||||
80
docs/topics/impala_unnest_views.xml
Normal file
80
docs/topics/impala_unnest_views.xml
Normal file
@@ -0,0 +1,80 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one
|
||||
or more contributor license agreements. See the NOTICE file
|
||||
distributed with this work for additional information
|
||||
regarding copyright ownership. The ASF licenses this file
|
||||
to you under the Apache License, Version 2.0 (the
|
||||
"License"); you may not use this file except in compliance
|
||||
with the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing,
|
||||
software distributed under the License is distributed on an
|
||||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
||||
<concept id="impala_unnest_views" rev="4.1.0">
|
||||
<title>Zipping unnest on arrays from views (<keyword keyref="impala41"/> or higher only)</title>
|
||||
<titlealts audience="PDF">
|
||||
<navtitle>Zipping unnest on arrays from views</navtitle>
|
||||
</titlealts>
|
||||
<prolog>
|
||||
<metadata>
|
||||
<data name="Category" value="Impala"/>
|
||||
<data name="Category" value="Impala Data Types"/>
|
||||
<data name="Category" value="SQL"/>
|
||||
<data name="Category" value="Data Analysts"/>
|
||||
<data name="Category" value="Developers"/>
|
||||
<data name="Category" value="Schemas"/>
|
||||
</metadata>
|
||||
</prolog>
|
||||
<conbody>
|
||||
<p rev="4.1.0">
|
||||
<indexterm audience="hidden">Zipping unnest on arrays from views</indexterm>
|
||||
<keyword keyref="impala41"/>, the zipping unnest functionality works for arrays in both tables
|
||||
and
|
||||
views.<codeblock id="codeblock_nwz_lyc_3cc">SELECT UNNSET(arr1) FROM view_name;</codeblock>
|
||||
</p>
|
||||
<section id="section_irf_rrb_3cc">
|
||||
<title>UNNEST() on array columns</title>
|
||||
<p>You can use <codeph>UNNEST()</codeph> on array columns in two ways. Using one of these two
|
||||
ways results in the items of the arrays being zipped together instead of joining.</p>
|
||||
<ul id="ul_jrf_rrb_3cc">
|
||||
<li>ISO:SQL 2016 compliant syntax</li>
|
||||
</ul>
|
||||
<codeblock id="codeblock_krf_rrb_3cc">SELECT a1.item, a2.item
|
||||
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);
|
||||
</codeblock>
|
||||
<ul id="ul_lrf_rrb_3cc">
|
||||
<li>Postgres compatible syntax</li>
|
||||
</ul>
|
||||
<codeblock id="codeblock_mrf_rrb_3cc">SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;</codeblock>
|
||||
<p>When unnesting multiple arrays with zipping unnest, the i'th item of one array will be
|
||||
placed next to the i'th item of the other arrays in the results. If the size of the arrays
|
||||
is not equal then the shorter arrays will be filled with NULL values up to the size of the
|
||||
longest array as shown in the following example:</p>
|
||||
<p>The test data used in this example is arr1: {1, 2, 3}, arr2: {11, 12}</p>
|
||||
<p>After running any of the queries listed in the examples, the result will be as shown
|
||||
here:</p>
|
||||
<p>
|
||||
<codeblock id="codeblock_n5z_cjq_mtb">arr1 arr2
|
||||
[1, 11]
|
||||
[2, 12]
|
||||
[3, NULL]
|
||||
</codeblock>
|
||||
</p>
|
||||
</section>
|
||||
<section id="section_ifc_trb_3cc">
|
||||
<title>Example of an UNNEST() in an inline view using SELECT/FILTER of the inline view</title>
|
||||
<p>In the following example the filter is not in the <codeph>SELECT</codeph> query that
|
||||
creates the inline view but a level above that.</p>
|
||||
<codeblock id="codeblock_jfc_trb_3cc">SELECT id, ac1, ac2 FROM (SELECT id, UNNEST(array_col1) AS ac1, UNNEST(array_col2) AS ac2 FROM some_view) WHERE id <10;
|
||||
</codeblock>
|
||||
</section>
|
||||
</conbody>
|
||||
</concept>
|
||||
Reference in New Issue
Block a user