mirror of
https://github.com/apache/impala.git
synced 2025-12-23 21:08:39 -05:00
Updates to DML statements for Impala + Kudu
Fill in syntax, usage notes, examples for UPDATE, DELETE, UPSERT. Take out IGNORE from INSERT. Add 2nd syntax form and examples for DELETE. Add join syntax to UPDATE. Change-Id: I60512b7957fb53d86d3123a4f1d46fbb355f4665 Reviewed-on: http://gerrit.cloudera.org:8080/5646 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Impala Public Jenkins
This commit is contained in:
committed by
Impala Public Jenkins
parent
f590bc0da6
commit
fc721fbd06
@@ -2720,6 +2720,10 @@ select max(height), avg(height) from census_data where age > 20;
|
||||
<xref href="../topics/impala_sync_ddl.xml#sync_ddl">SYNC_DDL</xref> query option)
|
||||
</p>
|
||||
|
||||
<p id="dml_blurb_kudu" rev="kudu">
|
||||
<b>Statement type:</b> DML
|
||||
</p>
|
||||
|
||||
<p rev="1.2" id="sync_ddl_blurb">
|
||||
If you connect to different Impala nodes within an <cmdname>impala-shell</cmdname> session for
|
||||
load-balancing purposes, you can enable the <codeph>SYNC_DDL</codeph> query option to make each DDL
|
||||
@@ -3691,14 +3695,38 @@ sudo pip-python install ssl</codeblock>
|
||||
around sharing content between the Impala documentation and the Kudu documentation.
|
||||
</p>
|
||||
|
||||
<p id="kudu_blurb">
|
||||
<p id="kudu_blurb" rev="kudu 2.8.0">
|
||||
<b>Kudu considerations:</b>
|
||||
</p>
|
||||
|
||||
<p id="kudu_no_load_data">
|
||||
<p id="kudu_no_load_data" rev="kudu">
|
||||
The <codeph>LOAD DATA</codeph> statement cannot be used with Kudu tables.
|
||||
</p>
|
||||
|
||||
<p id="kudu_no_truncate_table" rev="kudu">
|
||||
Currently, the <codeph>TRUNCATE TABLE</codeph> statement cannot be used with Kudu tables.
|
||||
</p>
|
||||
|
||||
<p id="kudu_no_insert_overwrite" rev="kudu">
|
||||
Currently, the <codeph>INSERT OVERWRITE</codeph> syntax cannot be used with Kudu tables.
|
||||
</p>
|
||||
|
||||
<p id="kudu_unsupported_data_type" rev="kudu">
|
||||
Currently, the data types <codeph>DECIMAL</codeph>, <codeph>TIMESTAMP</codeph>, <codeph>CHAR</codeph>, <codeph>VARCHAR</codeph>,
|
||||
<codeph>ARRAY</codeph>, <codeph>MAP</codeph>, and <codeph>STRUCT</codeph> cannot be used with Kudu tables.
|
||||
</p>
|
||||
|
||||
<p id="kudu_non_pk_data_type" rev="kudu">
|
||||
Currently, the data types <codeph>BOOLEAN</codeph>, <codeph>FLOAT</codeph>,
|
||||
and <codeph>DOUBLE</codeph> cannot be used for primary key columns in Kudu tables.
|
||||
</p>
|
||||
|
||||
<p id="pk_implies_not_null" rev="kudu">
|
||||
Because all of the primary key columns must have non-null values, specifying a column
|
||||
in the <codeph>PRIMARY KEY</codeph> clause implicitly adds the <codeph>NOT
|
||||
NULL</codeph> attribute to that column.
|
||||
</p>
|
||||
|
||||
</section>
|
||||
|
||||
</conbody>
|
||||
|
||||
@@ -18,7 +18,7 @@ specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
||||
<concept id="delete">
|
||||
<concept id="delete" rev="kudu">
|
||||
|
||||
<title>DELETE Statement (<keyword keyref="impala28"/> or higher only)</title>
|
||||
<titlealts audience="PDF"><navtitle>DELETE</navtitle></titlealts>
|
||||
@@ -39,43 +39,149 @@ under the License.
|
||||
|
||||
<p>
|
||||
<indexterm audience="hidden">DELETE statement</indexterm>
|
||||
Deletes one or more rows from a Kudu table.
|
||||
Although deleting a single row or a range of rows would be inefficient for tables using HDFS
|
||||
data files, Kudu is able to perform this operation efficiently. Therefore, this statement
|
||||
only works for Impala tables that use the Kudu storage engine.
|
||||
Deletes an arbitrary number of rows from a Kudu table.
|
||||
This statement only works for Impala tables that use the Kudu storage engine.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/syntax_blurb"/>
|
||||
|
||||
<codeblock>
|
||||
DELETE [FROM] [<varname>database_name</varname>.]<varname>table_name</varname> [ WHERE <varname>where_conditions</varname> ]
|
||||
|
||||
DELETE <varname>table_ref</varname> FROM [<varname>joined_table_refs</varname>] [ WHERE <varname>where_conditions</varname> ]
|
||||
</codeblock>
|
||||
|
||||
<p rev="kudu">
|
||||
Normally, a <codeph>DELETE</codeph> operation for a Kudu table fails if
|
||||
some partition key columns are not found, due to their being deleted or changed
|
||||
by a concurrent <codeph>UPDATE</codeph> or <codeph>DELETE</codeph> operation.
|
||||
Specify <codeph>DELETE IGNORE <varname>rest_of_statement</varname></codeph> to
|
||||
make the <codeph>DELETE</codeph> continue in this case. The rows with the nonexistent
|
||||
duplicate partition key column values are not removed.
|
||||
<p>
|
||||
The first form evaluates rows from one table against an optional
|
||||
<codeph>WHERE</codeph> clause, and deletes all the rows that
|
||||
match the <codeph>WHERE</codeph> conditions, or all rows if
|
||||
<codeph>WHERE</codeph> is omitted.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/dml_blurb"/>
|
||||
<p>
|
||||
The second form evaluates one or more join clauses, and deletes
|
||||
all matching rows from one of the tables. The join clauses can
|
||||
include non-Kudu tables, but the table from which the rows
|
||||
are deleted must be a Kudu table. The <codeph>FROM</codeph>
|
||||
keyword is required in this case, to separate the name of
|
||||
the table whose rows are being deleted from the table names
|
||||
of the join clauses.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/>
|
||||
<p>
|
||||
The conditions in the <codeph>WHERE</codeph> clause are the same ones allowed
|
||||
for the <codeph>SELECT</codeph> statement. See <xref href="impala_select.xml#select"/>
|
||||
for details.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The conditions in the <codeph>WHERE</codeph> clause can refer to
|
||||
any combination of primary key columns or other columns. Referring to
|
||||
primary key columns in the <codeph>WHERE</codeph> clause is more efficient
|
||||
than referring to non-primary key columns.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
If the <codeph>WHERE</codeph> clause is omitted, all rows are removed from the table.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Because Kudu currently does not enforce strong consistency during concurrent DML operations,
|
||||
be aware that the results after this statement finishes might be different than you
|
||||
intuitively expect:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<p>
|
||||
If some rows cannot be deleted because their
|
||||
some primary key columns are not found, due to their being deleted
|
||||
by a concurrent <codeph>DELETE</codeph> operation,
|
||||
the statement succeeds but returns a warning.
|
||||
</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>
|
||||
A <codeph>DELETE</codeph> statement might also overlap with
|
||||
<codeph>INSERT</codeph>, <codeph>UPDATE</codeph>,
|
||||
or <codeph>UPSERT</codeph> statements running concurrently on the same table.
|
||||
After the statement finishes, there might be more or fewer rows than expected in the table
|
||||
because it is undefined whether the <codeph>DELETE</codeph> applies to rows that are
|
||||
inserted or updated while the <codeph>DELETE</codeph> is in progress.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
The number of affected rows is reported in an <cmdname>impala-shell</cmdname> message
|
||||
and in the query profile.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/dml_blurb_kudu"/>
|
||||
|
||||
<note conref="../shared/impala_common.xml#common/compute_stats_next"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/example_blurb"/>
|
||||
<codeblock>
|
||||
|
||||
<p>
|
||||
The following examples show how to delete rows from a specified
|
||||
table, either all rows or rows that match a <codeph>WHERE</codeph>
|
||||
clause:
|
||||
</p>
|
||||
|
||||
<codeblock>
|
||||
-- Deletes all rows. The FROM keyword is optional.
|
||||
DELETE FROM kudu_table;
|
||||
DELETE kudu_table;
|
||||
|
||||
-- Deletes 0, 1, or more rows.
|
||||
-- (If c1 is a single-column primary key, the statement could only
|
||||
-- delete 0 or 1 rows.)
|
||||
DELETE FROM kudu_table WHERE c1 = 100;
|
||||
|
||||
-- Deletes all rows that match all the WHERE conditions.
|
||||
DELETE FROM kudu_table WHERE
|
||||
(c1 > c2 OR c3 IN ('hello','world')) AND c4 IS NOT NULL;
|
||||
DELETE FROM t1 WHERE
|
||||
(c1 IN (1,2,3) AND c2 > c3) OR c4 IS NOT NULL;
|
||||
DELETE FROM time_series WHERE
|
||||
year = 2016 AND month IN (11,12) AND day > 15;
|
||||
|
||||
-- WHERE condition with a subquery.
|
||||
DELETE FROM t1 WHERE
|
||||
c5 IN (SELECT DISTINCT other_col FROM other_table);
|
||||
|
||||
-- Does not delete any rows, because the WHERE condition is always false.
|
||||
DELETE FROM kudu_table WHERE 1 = 0;
|
||||
</codeblock>
|
||||
|
||||
<p>
|
||||
The following examples show how to delete rows that are part
|
||||
of the result set from a join:
|
||||
</p>
|
||||
|
||||
<codeblock>
|
||||
-- Remove _all_ rows from t1 that have a matching X value in t2.
|
||||
DELETE t1 FROM t1 JOIN t2 ON t1.x = t2.x;
|
||||
|
||||
-- Remove _some_ rows from t1 that have a matching X value in t2.
|
||||
DELETE t1 FROM t1 JOIN t2 ON t1.x = t2.x
|
||||
WHERE t1.y = FALSE and t2.z > 100;
|
||||
|
||||
-- Delete from a Kudu table based on a join with a non-Kudu table.
|
||||
DELETE t1 FROM kudu_table t1 JOIN non_kudu_table t2 ON t1.x = t2.x;
|
||||
|
||||
-- The tables can be joined in any order as long as the Kudu table
|
||||
-- is specified as the deletion target.
|
||||
DELETE t2 FROM non_kudu_table t1 JOIN kudu_table t2 ON t1.x = t2.x;
|
||||
</codeblock>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/related_info"/>
|
||||
|
||||
<p>
|
||||
<xref href="impala_kudu.xml#impala_kudu"/>
|
||||
<xref href="impala_kudu.xml#impala_kudu"/>, <xref href="impala_insert.xml#insert"/>,
|
||||
<xref href="impala_update.xml#update"/>, <xref href="impala_upsert.xml#upsert"/>
|
||||
</p>
|
||||
|
||||
</conbody>
|
||||
|
||||
@@ -33,7 +33,7 @@ under the License.
|
||||
<data name="Category" value="Developers"/>
|
||||
<data name="Category" value="Tables"/>
|
||||
<data name="Category" value="S3"/>
|
||||
<!-- <data name="Category" value="Kudu"/> -->
|
||||
<data name="Category" value="Kudu"/>
|
||||
<!-- This is such an important statement, think if there are more applicable categories. -->
|
||||
</metadata>
|
||||
</prolog>
|
||||
@@ -90,16 +90,39 @@ hint_clause ::= [SHUFFLE] | [NOSHUFFLE] (Note: the square brackets are part o
|
||||
See <xref href="impala_complex_types.xml#complex_types"/> for details about working with complex types.
|
||||
</p>
|
||||
|
||||
<p rev="kudu">
|
||||
<b>Ignoring duplicate partition keys for Kudu tables (IGNORE clause)</b>
|
||||
</p>
|
||||
<p conref="../shared/impala_common.xml#common/kudu_blurb"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/kudu_no_insert_overwrite"/>
|
||||
|
||||
<p rev="kudu">
|
||||
Normally, an <codeph>INSERT</codeph> operation into a Kudu table fails if
|
||||
it would result in duplicate partition key columns for any rows.
|
||||
Specify <codeph>INSERT IGNORE <varname>rest_of_statement</varname></codeph> to
|
||||
make the <codeph>INSERT</codeph> continue in this case. The rows that would
|
||||
have duplicate partition key columns are not inserted.
|
||||
Kudu tables require a unique primary key for each row. If an <codeph>INSERT</codeph>
|
||||
statement attempts to insert a row with the same values for the primary key columns
|
||||
as an existing row, that row is discarded and the insert operation continues.
|
||||
When rows are discarded due to duplicate primary keys, the statement finishes
|
||||
with a warning, not an error. (This is a change from early releases of Kudu
|
||||
where the default was to return in error in such cases, and the syntax
|
||||
<codeph>INSERT IGNORE</codeph> was required to make the statement succeed.
|
||||
The <codeph>IGNORE</codeph> clause is no longer part of the <codeph>INSERT</codeph>
|
||||
syntax.)
|
||||
</p>
|
||||
|
||||
<p>
|
||||
For situations where you prefer to replace rows with duplicate primary key values,
|
||||
rather than discarding the new data, you can use the <codeph>UPSERT</codeph>
|
||||
statement instead of <codeph>INSERT</codeph>. <codeph>UPSERT</codeph> inserts
|
||||
rows that are entirely new, and for rows that match an existing primary key in the
|
||||
table, the non-primary-key columns are updated to reflect the values in the
|
||||
<q>upserted</q> data.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
If you really want to store new rows, not replace existing ones, but cannot do so
|
||||
because of the primary key uniqueness constraint, consider recreating the table
|
||||
with additional columns included in the primary key.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
See <xref href="impala_kudu.xml#impala_kudu"/> for more details about using Impala with Kudu.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
|
||||
|
||||
@@ -18,7 +18,7 @@ specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
||||
<concept id="update">
|
||||
<concept id="update" rev="kudu">
|
||||
|
||||
<title>UPDATE Statement (<keyword keyref="impala28"/> or higher only)</title>
|
||||
<titlealts audience="PDF"><navtitle>UPDATE</navtitle></titlealts>
|
||||
@@ -39,43 +39,141 @@ under the License.
|
||||
|
||||
<p>
|
||||
<indexterm audience="hidden">UPDATE statement</indexterm>
|
||||
Updates one or more rows from a Kudu table.
|
||||
Although updating a single row or a range of rows would be inefficient for tables using HDFS
|
||||
data files, Kudu is able to perform this operation efficiently. Therefore, this statement
|
||||
only works for Impala tables that use the Kudu storage engine.
|
||||
Updates an arbitrary number of rows in a Kudu table.
|
||||
This statement only works for Impala tables that use the Kudu storage engine.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/syntax_blurb"/>
|
||||
|
||||
<codeblock>
|
||||
UPDATE [<varname>database_name</varname>.]<varname>table_name</varname> SET <varname>col</varname> = <varname>val</varname> [, <varname>col</varname> = <varname>val</varname> ... ]
|
||||
[ FROM <varname>joined_table_refs</varname> ]
|
||||
[ WHERE <varname>where_conditions</varname> ]
|
||||
</codeblock>
|
||||
|
||||
<p rev="kudu">
|
||||
Normally, an <codeph>UPDATE</codeph> operation for a Kudu table fails if
|
||||
some partition key columns are not found, due to their being deleted or changed
|
||||
by a concurrent <codeph>UPDATE</codeph> or <codeph>DELETE</codeph> operation.
|
||||
Specify <codeph>UPDATE IGNORE <varname>rest_of_statement</varname></codeph> to
|
||||
make the <codeph>UPDATE</codeph> continue in this case. The rows with the nonexistent
|
||||
duplicate partition key column values are not changed.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/dml_blurb"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/>
|
||||
<p>
|
||||
None of the columns that make up the primary key can be updated by the
|
||||
<codeph>SET</codeph> clause.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The conditions in the <codeph>WHERE</codeph> clause are the same ones allowed
|
||||
for the <codeph>SELECT</codeph> statement. See <xref href="impala_select.xml#select"/>
|
||||
for details.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
If the <codeph>WHERE</codeph> clause is omitted, all rows in the table are updated.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The conditions in the <codeph>WHERE</codeph> clause can refer to
|
||||
any combination of primary key columns or other columns. Referring to
|
||||
primary key columns in the <codeph>WHERE</codeph> clause is more efficient
|
||||
than referring to non-primary key columns.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Because Kudu currently does not enforce strong consistency during concurrent DML operations,
|
||||
be aware that the results after this statement finishes might be different than you
|
||||
intuitively expect:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<p>
|
||||
If some rows cannot be updated because their
|
||||
some primary key columns are not found, due to their being deleted
|
||||
by a concurrent <codeph>DELETE</codeph> operation,
|
||||
the statement succeeds but returns a warning.
|
||||
</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>
|
||||
An <codeph>UPDATE</codeph> statement might also overlap with
|
||||
<codeph>INSERT</codeph>, <codeph>UPDATE</codeph>,
|
||||
or <codeph>UPSERT</codeph> statements running concurrently on the same table.
|
||||
After the statement finishes, there might be more or fewer matching rows than expected
|
||||
in the table because it is undefined whether the <codeph>UPDATE</codeph> applies to rows
|
||||
that are inserted or updated while the <codeph>UPDATE</codeph> is in progress.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
The number of affected rows is reported in an <cmdname>impala-shell</cmdname> message
|
||||
and in the query profile.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The optional <codeph>FROM</codeph> clause lets you restrict the
|
||||
updates to only the rows in the specified table that are part
|
||||
of the result set for a join query. The join clauses can include
|
||||
non-Kudu tables, but the table from which the rows are deleted
|
||||
must be a Kudu table.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/dml_blurb_kudu"/>
|
||||
|
||||
<note conref="../shared/impala_common.xml#common/compute_stats_next"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/example_blurb"/>
|
||||
<codeblock>
|
||||
|
||||
<p>
|
||||
The following examples show how to perform a simple update
|
||||
on a table, with or without a <codeph>WHERE</codeph> clause:
|
||||
</p>
|
||||
|
||||
<codeblock>
|
||||
-- Set all rows to the same value for column c3.
|
||||
-- In this case, c1 and c2 are primary key columns
|
||||
-- and so cannot be updated.
|
||||
UPDATE kudu_table SET c3 = 'not applicable';
|
||||
|
||||
-- Update only the rows that match the condition.
|
||||
UPDATE kudu_table SET c3 = NULL WHERE c1 > 100 AND c3 IS NULL;
|
||||
|
||||
-- Does not update any rows, because the WHERE condition is always false.
|
||||
UPDATE kudu_table SET c3 = 'impossible' WHERE 1 = 0;
|
||||
|
||||
-- Change the values of multiple columns in a single UPDATE statement.
|
||||
UPDATE kudu_table SET c3 = upper(c3), c4 = FALSE, c5 = 0 WHERE c6 = TRUE;
|
||||
</codeblock>
|
||||
|
||||
<p>
|
||||
The following examples show how to perform an update using the
|
||||
<codeph>FROM</codeph> keyword with a join clause:
|
||||
</p>
|
||||
|
||||
<codeblock>
|
||||
-- Uppercase a column value, only for rows that have
|
||||
-- an ID that matches the value from another table.
|
||||
UPDATE kudu_table SET c3 = upper(c3)
|
||||
FROM kudu_table JOIN non_kudu_table
|
||||
ON kudu_table.id = non_kudu_table.id;
|
||||
|
||||
-- Same effect as previous statement.
|
||||
-- Assign table aliases in FROM clause, then refer to
|
||||
-- short names elsewhere in the statement.
|
||||
UPDATE t1 SET c3 = upper(c3)
|
||||
FROM kudu_table t1 JOIN non_kudu_table t2
|
||||
ON t1.id = t2.id;
|
||||
|
||||
-- Same effect as previous statements, but more efficient.
|
||||
-- Use WHERE clause to skip updating values that are
|
||||
-- already uppercase.
|
||||
UPDATE t1 SET c3 = upper(c3)
|
||||
FROM kudu_table t1 JOIN non_kudu_table t2
|
||||
ON t1.id = t2.id
|
||||
WHERE c3 != upper(c3);
|
||||
</codeblock>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/related_info"/>
|
||||
|
||||
<p>
|
||||
<xref href="impala_kudu.xml#impala_kudu"/>
|
||||
<xref href="impala_kudu.xml#impala_kudu"/>, <xref href="impala_insert.xml#insert"/>,
|
||||
<xref href="impala_delete.xml#delete"/>, <xref href="impala_upsert.xml#upsert"/>
|
||||
</p>
|
||||
|
||||
</conbody>
|
||||
|
||||
@@ -18,7 +18,7 @@ specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
||||
<concept id="upsert">
|
||||
<concept id="upsert" rev="kudu IMPALA-3725">
|
||||
|
||||
<title>UPSERT Statement (<keyword keyref="impala28"/> or higher only)</title>
|
||||
<titlealts audience="PDF"><navtitle>UPSERT</navtitle></titlealts>
|
||||
@@ -62,28 +62,54 @@ under the License.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Although inserting or updating a single row or a small set of rows would be inefficient for tables using HDFS
|
||||
data files, Kudu is able to perform this operation efficiently.
|
||||
Therefore, this statement only works for Impala tables that use the
|
||||
Kudu storage engine.
|
||||
This statement only works for Impala tables that use the Kudu storage engine.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/syntax_blurb"/>
|
||||
|
||||
<codeblock>
|
||||
UPSERT INTO [TABLE] [<varname>db_name</varname>.]<varname>table_name</varname>
|
||||
[(<varname>column_list</varname>)]
|
||||
{
|
||||
[<varname>hint_clause</varname>] <varname>select_statement</varname>
|
||||
| VALUES (<varname>value</varname> [, <varname>value</varname> ...]) [, (<varname>value</varname> [, <varname>value</varname> ...]) ...]
|
||||
}
|
||||
|
||||
hint_clause ::= [SHUFFLE] | [NOSHUFFLE]
|
||||
(Note: the square brackets are part of the syntax.)
|
||||
</codeblock>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/dml_blurb"/>
|
||||
<p>
|
||||
The <varname>select_statement</varname> clause can use the full syntax, such as
|
||||
<codeph>WHERE</codeph> and join clauses, as <xref href="impala_select.xml#select"/>.
|
||||
</p>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/dml_blurb_kudu"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/>
|
||||
<p>
|
||||
If you specify a column list, any omitted columns in the inserted or updated rows are
|
||||
set to their default value (if the column has one) or <codeph>NULL</codeph> (if the
|
||||
column does not have a default value). Therefore, if a column is not nullable and
|
||||
has no default value, it must be included in the column list for any <codeph>UPSERT</codeph>
|
||||
statement. Because all primary key columns meet these conditions, all the primary key
|
||||
columns must be specified in every <codeph>UPSERT</codeph> statement.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Because Kudu tables can efficiently handle small incremental changes, the <codeph>VALUES</codeph>
|
||||
clause is more practical to use with Kudu tables than with HDFS-based tables.
|
||||
</p>
|
||||
|
||||
<note conref="../shared/impala_common.xml#common/compute_stats_next"/>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/example_blurb"/>
|
||||
<codeblock>
|
||||
|
||||
<codeblock>
|
||||
UPSERT INTO kudu_table (pk, c1, c2, c3) VALUES (0, 'hello', 50, true), (1, 'world', -1, false);
|
||||
UPSERT INTO production_table SELECT * FROM staging_table;
|
||||
UPSERT INTO production_table SELECT * FROM staging_table WHERE c1 IS NOT NULL AND c2 > 0;
|
||||
</codeblock>
|
||||
|
||||
<p conref="../shared/impala_common.xml#common/related_info"/>
|
||||
|
||||
Reference in New Issue
Block a user