Files
impala/docs/topics/impala_conversion_functions.xml
Alex Rodoni e8ee827a6d [DOCS] Built-in Functions doc format Changes
- The function titles were changed to upper case.
- The function titles no longer use <codeph>. <codeph> font appears
smaller than the <p> font.
- Return type were changed to upper case data types.
- Minor typos were fixed, such as extra commas and periods in titles.
- The indexterm dita elememts were removed. Indexterm was incomplete
and WIP. No plan to go ahead and implement it, so removed.

Change-Id: I797532463da8d29fe5bc7543cfdfb5b2b82db197
Reviewed-on: http://gerrit.cloudera.org:8080/11619
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Brown <mikeb@cloudera.com>
2018-10-10 18:18:09 +00:00

205 lines
7.8 KiB
XML

<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="conversion_functions">
<title>Impala Type Conversion Functions</title>
<titlealts audience="PDF">
<navtitle>Type Conversion Functions</navtitle>
</titlealts>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Impala Functions"/>
<data name="Category" value="Impala Data Types"/>
<data name="Category" value="SQL"/>
<data name="Category" value="Data Analysts"/>
<data name="Category" value="Developers"/>
<data name="Category" value="Querying"/>
</metadata>
</prolog>
<conbody>
<p>
Conversion functions are usually used in combination with other functions, to explicitly
pass the expected data types. Impala has strict rules regarding data types for function
parameters. For example, Impala does not automatically convert a <codeph>DOUBLE</codeph>
value to <codeph>FLOAT</codeph>, a <codeph>BIGINT</codeph> value to <codeph>INT</codeph>,
or other conversion where precision could be lost or overflow could occur. Also, for
reporting or dealing with loosely defined schemas in big data contexts, you might
frequently need to convert values to or from the <codeph>STRING</codeph> type.
</p>
<note>
Although in <keyword keyref="impala23_full"/>, the <codeph>SHOW FUNCTIONS</codeph> output
for database <codeph>_IMPALA_BUILTINS</codeph> contains some function signatures matching
the pattern <codeph>castto*</codeph>, these functions are not intended for public use and
are expected to be hidden in future.
</note>
<p>
<b>Function reference:</b>
</p>
<p>
Impala supports the following type conversion functions:
</p>
<ul>
<li>
<xref href="#conversion_functions/cast">CAST</xref>
</li>
<li>
<xref href="#conversion_functions/typeof">TYPEOF</xref>
</li>
</ul>
<dl>
<dlentry id="cast">
<dt>
CAST(expr AS type)
</dt>
<dd>
<b>Purpose:</b> Converts the value of an expression to any other type. If the
expression value is of a type that cannot be converted to the target type, the result
is <codeph>NULL</codeph>.
<p>
<b>Usage notes:</b> Use <codeph>CAST</codeph> when passing a column value or literal
to a function that expects a parameter with a different type. Frequently used in SQL
operations such as <codeph>CREATE TABLE AS SELECT</codeph> and <codeph>INSERT ...
VALUES</codeph> to ensure that values from various sources are of the appropriate
type for the destination columns. Where practical, do a one-time
<codeph>CAST()</codeph> operation during the ingestion process to make each column
into the appropriate type, rather than using many <codeph>CAST()</codeph> operations
in each query; doing type conversions for each row during each query can be
expensive for tables with millions or billions of rows.
</p>
<p
conref="../shared/impala_common.xml#common/timezone_conversion_caveat"/>
<p conref="../shared/impala_common.xml#common/example_blurb"/>
<codeblock>SELECT CONCAT('Here are the first ',10,' results.'); -- Fails
SELECT CONCAT('Here are the first ',CAST(10 AS STRING),' results.'); -- Succeeds
</codeblock>
<p>
The following example starts with a text table where every column has a type of
<codeph>STRING</codeph>, which might be how you ingest data of unknown schema until
you can verify the cleanliness of the underlying values. Then it uses
<codeph>CAST()</codeph> to create a new Parquet table with the same data, but using
specific numeric data types for the columns with numeric data. Using numeric types
of appropriate sizes can result in substantial space savings on disk and in memory,
and performance improvements in queries, over using strings or larger-than-necessary
numeric types.
</p>
<codeblock>CREATE TABLE t1 (name STRING, x STRING, y STRING, z STRING);
CREATE TABLE t2 STORED AS PARQUET
AS SELECT
name,
CAST(x AS BIGINT) x,
CAST(y AS TIMESTAMP) y,
CAST(z AS SMALLINT) z
FROM t1;</codeblock>
<p conref="../shared/impala_common.xml#common/related_info"/>
<p>
For details of casts from each kind of data type, see the description of the
appropriate type: <xref
href="impala_tinyint.xml#tinyint"/>,
<xref
href="impala_smallint.xml#smallint"/>,
<xref
href="impala_int.xml#int"/>,
<xref href="impala_bigint.xml#bigint"
/>,
<xref href="impala_float.xml#float"/>,
<xref
href="impala_double.xml#double"/>,
<xref
href="impala_decimal.xml#decimal"/>,
<xref
href="impala_string.xml#string"/>,
<xref
href="impala_char.xml#char"/>,
<xref
href="impala_varchar.xml#varchar"/>,
<xref
href="impala_timestamp.xml#timestamp"/>,
<xref
href="impala_boolean.xml#boolean"/>
</p>
</dd>
</dlentry>
<dlentry rev="2.3.0" id="typeof">
<dt>
TYPEOF(type value)
</dt>
<dd>
<b>Purpose:</b> Returns the name of the data type corresponding to an expression. For
types with extra attributes, such as length for <codeph>CHAR</codeph> and
<codeph>VARCHAR</codeph>, or precision and scale for <codeph>DECIMAL</codeph>,
includes the full specification of the type.
<!-- To do: How about for columns of complex types? Or fields within complex types? -->
<p>
<b>Return type:</b> <codeph>STRING</codeph>
</p>
<p>
<b>Usage notes:</b> Typically used in interactive exploration of a schema, or in
application code that programmatically generates schema definitions such as
<codeph>CREATE TABLE</codeph> statements, for example, to get the type of an
expression such as <codeph>col1 / col2</codeph> or <codeph>CONCAT(col1, col2,
col3)</codeph>. This function is especially useful for arithmetic expressions
involving <codeph>DECIMAL</codeph> types because the precision and scale of the
result is can be different than that of the operands.
</p>
<p conref="../shared/impala_common.xml#common/added_in_230"/>
<p conref="../shared/impala_common.xml#common/example_blurb"/>
<codeblock>SELECT TYPEOF(2), TYPEOF(2+2);
+-----------+---------------+
| typeof(2) | typeof(2 + 2) |
+-----------+---------------+
| TINYINT | SMALLINT |
+-----------+---------------+
</codeblock>
</dd>
</dlentry>
</dl>
</conbody>
</concept>