mirror of
https://github.com/apache/impala.git
synced 2025-12-19 09:58:28 -05:00
This now gives a clean RAT check with bin/check-rat-report.py, which is one way for the Impala community to check compliance with ASF rules on intellectual property. Change-Id: I2ad06435f84a65ba126759e42a18fdaf52cd7036 Reviewed-on: http://gerrit.cloudera.org:8080/5232 Reviewed-by: Jim Apple <jbapple-impala@apache.org> Tested-by: Impala Public Jenkins Reviewed-by: John Russell <jrussell@cloudera.com>
110 lines
4.4 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="dml">

  <title>DML Statements</title>
  <prolog>
    <metadata>
      <data name="Category" value="Impala"/>
      <data name="Category" value="SQL"/>
      <data name="Category" value="DML"/>
      <data name="Category" value="Data Analysts"/>
      <data name="Category" value="Developers"/>
      <data name="Category" value="Tables"/>
      <data name="Category" value="ETL"/>
      <data name="Category" value="Ingest"/>
    </metadata>
  </prolog>

  <conbody>

    <p>
      DML refers to <q>Data Manipulation Language</q>, a subset of SQL statements that modify the data stored in
      tables. Because Impala focuses on query performance and leverages the append-only nature of HDFS storage,
      Impala currently supports only a small set of DML statements:
    </p>

    <ul>
      <li>
        <xref keyref="delete"/>. Works for Kudu tables only.
      </li>

      <li>
        <xref keyref="insert"/>.
      </li>

      <li>
        <xref keyref="load_data"/>. Does not apply to HBase or Kudu tables.
      </li>

      <li>
        <xref keyref="update"/>. Works for Kudu tables only.
      </li>

      <li>
        <xref keyref="upsert"/>. Works for Kudu tables only.
      </li>
    </ul>

    <p>
      <codeph>INSERT</codeph> in Impala is primarily optimized for inserting large volumes of data in a single
      statement, to make effective use of the multi-megabyte HDFS blocks. This is the way in Impala to create new
      data files. If you intend to insert one or a few rows at a time, for example with the <codeph>INSERT ...
      VALUES</codeph> syntax, that technique is much more efficient for Impala tables stored in HBase. See
      <xref href="impala_hbase.xml#impala_hbase"/> for details.
    </p>
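
    <p>
      As a rough sketch of the distinction described above, the following statements contrast a bulk
      <codeph>INSERT ... SELECT</codeph> with a row-at-a-time <codeph>INSERT ... VALUES</codeph>. The table
      names <codeph>sales_parquet</codeph>, <codeph>sales_staging</codeph>, and <codeph>hbase_lookup</codeph>
      are hypothetical and are assumed to exist already with compatible column definitions.
    </p>

<codeblock>-- Hypothetical tables, for illustration only.
-- Bulk insert: copies many rows in one statement, so Impala can write large data files.
INSERT INTO sales_parquet SELECT * FROM sales_staging;

-- Row-at-a-time insert: better suited to a table stored in HBase
-- (assumes hbase_lookup has two STRING columns).
INSERT INTO hbase_lookup VALUES ('k1', 'v1');
</codeblock>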

    <p>
      <codeph>LOAD DATA</codeph> moves existing data files into the directory for an Impala table, making them
      immediately available for Impala queries. This is one way in Impala to work with data files produced by other
      Hadoop components. (<codeph>CREATE EXTERNAL TABLE</codeph> is the alternative; with external tables,
      you can query existing data files while the files remain in their original location.)
    </p>
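
    <p>
      The two approaches might look like the following sketch. The table names, column definitions, and HDFS
      paths are assumptions made up for illustration; substitute the ones used in your own environment.
    </p>

<codeblock>-- Hypothetical table and path names, for illustration only.
-- Move files that another component wrote into the table's own directory.
LOAD DATA INPATH '/user/etl/incoming/sales' INTO TABLE sales_staging;

-- Alternatively, leave the files where they are and point an external table at them.
CREATE EXTERNAL TABLE sales_external (id BIGINT, region STRING, amount DOUBLE)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  LOCATION '/user/etl/shared/sales';
</codeblock>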

    <p>
      In <keyword keyref="impala28_full"/> and higher, Impala supports the <codeph>UPDATE</codeph>, <codeph>DELETE</codeph>,
      and <codeph>UPSERT</codeph> statements for Kudu tables.
      For HDFS or S3 tables, to simulate the effects of an <codeph>UPDATE</codeph> or <codeph>DELETE</codeph> statement
      in other database systems, you typically use <codeph>INSERT</codeph> or <codeph>CREATE TABLE AS SELECT</codeph> to copy data
      from one table to another, filtering out or changing the appropriate rows during the copy operation.
    </p>
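
    <p>
      A minimal sketch of both cases follows. The Kudu table <codeph>kudu_events</codeph> (assumed to have an
      <codeph>id</codeph> primary key column and a <codeph>status</codeph> column) and the HDFS tables
      <codeph>sales_parquet</codeph> and <codeph>sales_clean</codeph> are hypothetical names chosen for
      illustration.
    </p>

<codeblock>-- Hypothetical tables, for illustration only.
-- Kudu table: rows can be changed in place.
UPDATE kudu_events SET status = 'closed' WHERE id = 42;
DELETE FROM kudu_events WHERE id = 43;
UPSERT INTO kudu_events VALUES (44, 'open');

-- HDFS or S3 table: simulate DELETE and UPDATE by copying the data.
-- The WHERE clause drops unwanted rows; the select list rewrites a column.
CREATE TABLE sales_clean STORED AS PARQUET AS
  SELECT id, upper(region) AS region, amount
  FROM sales_parquet
  WHERE amount IS NOT NULL;
</codeblock>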

    <p>
      You can also achieve a result similar to <codeph>UPDATE</codeph> by using Impala tables stored in HBase.
      When you insert a row into an HBase table, and the table
      already contains a row with the same value for the key column, the older row is hidden, which is effectively the same
      as a single-row <codeph>UPDATE</codeph>.
    </p>
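
    <p>
      For instance, assuming a hypothetical HBase-backed table <codeph>hbase_customers</codeph> whose first
      column maps to the HBase row key and whose second column holds a name, the second statement below
      replaces the value written by the first:
    </p>

<codeblock>-- Hypothetical HBase-backed table, for illustration only.
INSERT INTO hbase_customers VALUES ('c001', 'Alice');

-- Same key value: the earlier row is hidden, as with a single-row UPDATE.
INSERT INTO hbase_customers VALUES ('c001', 'Alicia');
</codeblock>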

    <p rev="2.6.0">
      In <keyword keyref="impala26_full"/> and higher, Impala can perform DML operations on tables or partitions
      stored in the Amazon S3 filesystem. See <xref href="impala_s3.xml#s3"/> for details.
    </p>

    <p conref="../shared/impala_common.xml#common/related_info"/>

    <p>
      The other major classifications of SQL statements are data definition language (see
      <xref href="impala_ddl.xml#ddl"/>) and queries (see <xref href="impala_select.xml#select"/>).
    </p>
  </conbody>
</concept>