mirror of
https://github.com/apache/impala.git
synced 2025-12-19 09:58:28 -05:00
Adds a topic documenting Apache Ozone support, and recommends using the ofs protocol. Change-Id: I724a40c086fe0466646e7e108645fd8dbaee5f1d Reviewed-on: http://gerrit.cloudera.org:8080/19448 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
104 lines
4.2 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept rev="4.2.0" id="impala_ozone">

  <title>Using Impala with Apache Ozone Storage</title>

  <titlealts audience="PDF">
    <navtitle>Ozone Storage</navtitle>
  </titlealts>

  <prolog>
    <metadata>
      <data name="Category" value="Impala"/>
      <data name="Category" value="Ozone"/>
      <data name="Category" value="Disk Storage"/>
      <data name="Category" value="Administrators"/>
      <data name="Category" value="Developers"/>
      <data name="Category" value="Data Analysts"/>
    </metadata>
  </prolog>

  <conbody>

    <p>
      <indexterm audience="hidden">Ozone</indexterm>
      You can use Impala to query data files that reside on Apache Ozone distributed storage,
      rather than in HDFS. The combination of the Impala query engine and Apache Ozone storage
      is certified on <keyword keyref="impala42"/> or higher.
    </p>

    <p>
      For more information on Ozone, see <xref keyref="upstream_ozone_site"/>.
    </p>

    <p>
      The typical use case for Impala and Ozone together is to use Ozone as the default
      filesystem, replacing HDFS entirely. In this configuration, when you create a database,
      table, or partition, the data always resides on Ozone storage and you do not need to
      specify a <codeph>LOCATION</codeph> attribute. If you do specify a
      <codeph>LOCATION</codeph> attribute, its value refers to a path within the Ozone
      filesystem. For example:
    </p>

<codeblock>-- If the default filesystem is Ozone, all Impala data resides there
-- and all Impala databases and tables are located there.
CREATE TABLE t1 (x INT, s STRING);

-- You can specify LOCATION for database, table, or partition,
-- using values from the Ozone filesystem.
CREATE DATABASE d1 LOCATION '/some/path/on/ozone/server/d1.db';
CREATE TABLE d1.t2 (a TINYINT, b BOOLEAN);
</codeblock>

    <p>
      Impala can write to, delete, and rename data files and database, table, and partition
      directories on Ozone storage. Therefore, Impala statements such as <codeph>CREATE
      TABLE</codeph>, <codeph>DROP TABLE</codeph>, <codeph>CREATE DATABASE</codeph>,
      <codeph>DROP DATABASE</codeph>, <codeph>ALTER TABLE</codeph>, and <codeph>INSERT</codeph>
      work the same with Ozone storage as with HDFS.
    </p>
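
    <p>
      As a brief illustration of the statements listed above, the following sketch continues
      with the <codeph>t1</codeph> table from the earlier example. The inserted values and the
      name <codeph>t1_renamed</codeph> are hypothetical, chosen only for illustration:
    </p>

<codeblock>-- Write a row; Impala creates a data file under the table directory on Ozone.
INSERT INTO t1 VALUES (1, 'one');

-- Rename the table. For a managed table in its default location,
-- this can also rename the table directory on Ozone.
ALTER TABLE t1 RENAME TO t1_renamed;

-- Drop the table. For a managed table, this also deletes its data files.
DROP TABLE t1_renamed;
</codeblock>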

    <p>
      Ozone supports multiple protocols: <codeph>ofs</codeph>, <codeph>o3fs</codeph>, and
      <codeph>s3a</codeph>. Impala supports reading <codeph>ofs</codeph> and <codeph>o3fs</codeph>.
      Impala can also read <codeph>s3a</codeph> (see <xref href="impala_s3.xml#s3"/>). However,
      <codeph>ofs</codeph> is Ozone's newer protocol, and the only one Impala supports as a default
      filesystem. We recommend using it for <xref href="impala_ddl.xml#ddl"/> to avoid access
      limitations, and for <xref href="impala_dml.xml#dml"/> and
      <xref href="impala_select.xml#select"/> for performance.
    </p>
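
    <p>
      When Ozone is not the default filesystem, a table can still point at Ozone data through a
      full <codeph>ofs</codeph> URI, which names the Ozone Manager host or service ID followed
      by the volume, bucket, and key path. The following is a minimal sketch; the service ID
      <codeph>ozone1</codeph>, the volume <codeph>vol1</codeph>, the bucket
      <codeph>bucket1</codeph>, and the table definition are all hypothetical:
    </p>

<codeblock>-- ozone1, vol1, and bucket1 are hypothetical names; substitute your own
-- Ozone Manager service ID (or host:port), volume, and bucket.
CREATE EXTERNAL TABLE logs (ts TIMESTAMP, msg STRING)
  STORED AS PARQUET
  LOCATION 'ofs://ozone1/vol1/bucket1/warehouse/logs';

-- Queries against the table then read the data through the ofs protocol.
SELECT COUNT(*) FROM logs;
</codeblock>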

    <p conref="../shared/impala_common.xml#common/ozone_block_size_caveat"/>

    <p>
      Impala's spill-to-disk feature can be configured to use Ozone storage by specifying a full
      URI (for example, <codeph>ofs://host:port/volume/bucket/key</codeph>) for the spill
      location. See <xref href="impala_disk_space.xml#disk_space"/> for details on configuring
      remote spill-to-disk.
    </p>
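
    <p>
      As a rough sketch only, a remote spill location might be passed through the
      <codeph>impalad</codeph> startup flag <codeph>--scratch_dirs</codeph> as shown here. The
      service ID, volume, bucket, and local path are hypothetical, and the example assumes that
      a local directory is also listed to serve as the buffer for remote spilling. Confirm the
      exact syntax against the linked topic before relying on it:
    </p>

<codeblock># Hypothetical names throughout; the local directory is assumed to act as
# the buffer for files spilled to the remote Ozone location.
--scratch_dirs=ofs://ozone1/vol1/bucket1/impala-scratch,/data/impala/scratch_buffer
</codeblock>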

    <!-- <p outputclass="toc inpage"/> -->

  </conbody>

</concept>