impala/docs/topics/impala_conversion_functions.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="conversion_functions">

  <title>Impala Type Conversion Functions</title>

  <titlealts audience="PDF">

    <navtitle>Type Conversion Functions</navtitle>

  </titlealts>

  <prolog>
    <metadata>
      <data name="Category" value="Impala"/>
      <data name="Category" value="Impala Functions"/>
      <data name="Category" value="Impala Data Types"/>
      <data name="Category" value="SQL"/>
      <data name="Category" value="Data Analysts"/>
      <data name="Category" value="Developers"/>
      <data name="Category" value="Querying"/>
    </metadata>
  </prolog>

  <conbody>

    <p>
      Conversion functions are typically used in combination with other functions to explicitly
      pass values of the expected data types. Impala has strict rules regarding data types for
      function parameters. For example, Impala does not automatically convert a
      <codeph>DOUBLE</codeph> value to <codeph>FLOAT</codeph>, a <codeph>BIGINT</codeph> value
      to <codeph>INT</codeph>, or perform any other conversion where precision could be lost or
      overflow could occur. Also, for reporting or dealing with loosely defined schemas in big
      data contexts, you might frequently need to convert values to or from the
      <codeph>STRING</codeph> type.
    </p>
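
    <p>
      As a minimal sketch of these rules, the following statements write out the narrowing
      conversions explicitly. The tables <codeph>raw_measurements</codeph> and
      <codeph>measurements</codeph> and their columns are hypothetical names used only for
      illustration.
    </p>
<codeblock>-- Hypothetical source and destination tables, for illustration only.
CREATE TABLE raw_measurements (reading DOUBLE, sample_count BIGINT);
CREATE TABLE measurements (reading FLOAT, sample_count INT);

-- DOUBLE to FLOAT and BIGINT to INT are narrowing conversions,
-- so each one is requested explicitly with CAST().
INSERT INTO measurements
  SELECT CAST(reading AS FLOAT), CAST(sample_count AS INT)
  FROM raw_measurements;

-- Convert a numeric column to STRING for a report-style result.
SELECT CONCAT('Samples: ', CAST(sample_count AS STRING)) FROM measurements;
</codeblock>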

    <note>
      In <keyword keyref="impala23_full"/>, the <codeph>SHOW FUNCTIONS</codeph> output for the
      database <codeph>_IMPALA_BUILTINS</codeph> contains some function signatures matching the
      pattern <codeph>castto*</codeph>. These functions are not intended for public use and
      are expected to be hidden in the future.
    </note>

    <p>
      <b>Function reference:</b>
    </p>

    <p>
      Impala supports the following type conversion functions:
    </p>

    <ul>
      <li>
        <xref href="#conversion_functions/cast">CAST</xref>
      </li>

      <li>
        <xref href="#conversion_functions/typeof">TYPEOF</xref>
      </li>
    </ul>

    <dl>
      <dlentry id="cast">

        <dt>
          CAST(expr AS type)
        </dt>

        <dd>
          <b>Purpose:</b> Converts the value of an expression to any other type. If the
          expression value is of a type that cannot be converted to the target type, the result
          is <codeph>NULL</codeph>.
          <p>
            <b>Usage notes:</b> Use <codeph>CAST</codeph> when passing a column value or literal
            to a function that expects a parameter with a different type. <codeph>CAST</codeph>
            is also frequently used in SQL operations such as <codeph>CREATE TABLE AS
            SELECT</codeph> and <codeph>INSERT ... VALUES</codeph> to ensure that values from
            various sources are of the appropriate type for the destination columns. Where
            practical, do a one-time <codeph>CAST()</codeph> operation during the ingestion
            process to convert each column to the appropriate type, rather than using many
            <codeph>CAST()</codeph> operations in each query. Doing type conversions for each
            row during each query can be expensive for tables with millions or billions of rows.
          </p>

          <p
            conref="../shared/impala_common.xml#common/timezone_conversion_caveat"/>

          <p conref="../shared/impala_common.xml#common/example_blurb"/>
<codeblock>SELECT CONCAT('Here are the first ',10,' results.'); -- Fails
SELECT CONCAT('Here are the first ',CAST(10 AS STRING),' results.'); -- Succeeds
</codeblock>
          <p>
            The following example starts with a text table where every column has a type of
            <codeph>STRING</codeph>, which might be how you ingest data of unknown schema until
            you can verify the cleanliness of the underlying values. Then it uses
            <codeph>CAST()</codeph> to create a new Parquet table with the same data, but using
            specific numeric data types for the columns with numeric data. Using numeric types
            of appropriate sizes can result in substantial space savings on disk and in memory,
            and performance improvements in queries, over using strings or larger-than-necessary
            numeric types.
          </p>
<codeblock>CREATE TABLE t1 (name STRING, x STRING, y STRING, z STRING);

CREATE TABLE t2 STORED AS PARQUET
AS SELECT
  name,
  CAST(x AS BIGINT) x,
  CAST(y AS TIMESTAMP) y,
  CAST(z AS SMALLINT) z
FROM t1;</codeblock>
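          <p>
            Before running a conversion like the one above, it can help to check for values
            that do not convert cleanly. The following sketch assumes that a string value that
            cannot be interpreted as a number produces <codeph>NULL</codeph> when cast. The
            column aliases <codeph>bad_x</codeph> and <codeph>bad_z</codeph> are illustrative.
          </p>
<codeblock>-- Count rows in t1 where a value is present but does not convert
-- to the intended numeric type (the cast result is NULL).
SELECT
  SUM(CASE WHEN x IS NOT NULL AND CAST(x AS BIGINT) IS NULL THEN 1 ELSE 0 END) AS bad_x,
  SUM(CASE WHEN z IS NOT NULL AND CAST(z AS SMALLINT) IS NULL THEN 1 ELSE 0 END) AS bad_z
FROM t1;
</codeblock>
          <p>
            If either count is nonzero, inspect the affected rows before relying on the
            converted table.
          </p>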
          <p conref="../shared/impala_common.xml#common/related_info"/>

          <p>
            For details of casts from each kind of data type, see the description of the
            appropriate type: <xref
              href="impala_tinyint.xml#tinyint"/>,
            <xref
              href="impala_smallint.xml#smallint"/>,
            <xref
              href="impala_int.xml#int"/>,
            <xref href="impala_bigint.xml#bigint"
            />,
            <xref href="impala_float.xml#float"/>,
            <xref
              href="impala_double.xml#double"/>,
            <xref
              href="impala_decimal.xml#decimal"/>,
            <xref
              href="impala_string.xml#string"/>,
            <xref
              href="impala_char.xml#char"/>,
            <xref
              href="impala_varchar.xml#varchar"/>,
            <xref
              href="impala_timestamp.xml#timestamp"/>,
            <xref
              href="impala_boolean.xml#boolean"/>.
          </p>
        </dd>

      </dlentry>

      <dlentry rev="2.3.0" id="typeof">

        <dt>
          TYPEOF(type value)
        </dt>

        <dd>
          <b>Purpose:</b> Returns the name of the data type corresponding to an expression. For
          types with extra attributes, such as length for <codeph>CHAR</codeph> and
          <codeph>VARCHAR</codeph>, or precision and scale for <codeph>DECIMAL</codeph>,
          includes the full specification of the type.
<!-- To do: How about for columns of complex types? Or fields within complex types? -->
          <p>
            <b>Return type:</b> <codeph>STRING</codeph>
          </p>

          <p>
            <b>Usage notes:</b> Typically used in interactive exploration of a schema, or in
            application code that programmatically generates schema definitions such as
            <codeph>CREATE TABLE</codeph> statements. For example, you can use it to get the
            type of an expression such as <codeph>col1 / col2</codeph> or <codeph>CONCAT(col1,
            col2, col3)</codeph>. This function is especially useful for arithmetic expressions
            involving <codeph>DECIMAL</codeph> types because the precision and scale of the
            result can be different from those of the operands.
          </p>

          <p conref="../shared/impala_common.xml#common/added_in_230"/>

          <p conref="../shared/impala_common.xml#common/example_blurb"/>
<codeblock>SELECT TYPEOF(2), TYPEOF(2+2);
+-----------+---------------+
| typeof(2) | typeof(2 + 2) |
+-----------+---------------+
| TINYINT   | SMALLINT      |
+-----------+---------------+
</codeblock>
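          <p>
            As a further sketch, the following queries apply <codeph>TYPEOF()</codeph> to
            <codeph>DECIMAL</codeph> arithmetic and to a type with a length attribute. The
            literal values are arbitrary, and the output is omitted because the reported
            precision and scale can vary with the Impala version and its decimal settings.
          </p>
<codeblock>-- Result type of multiplying two DECIMAL(5,2) operands.
SELECT TYPEOF(CAST(1.5 AS DECIMAL(5,2)) * CAST(2.25 AS DECIMAL(5,2)));

-- Result type of dividing two DECIMAL(9,2) operands.
SELECT TYPEOF(CAST(1 AS DECIMAL(9,2)) / CAST(3 AS DECIMAL(9,2)));

-- For a type with a length attribute, the full specification is reported.
SELECT TYPEOF(CAST('abc' AS VARCHAR(10)));
</codeblock>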
        </dd>

      </dlentry>
    </dl>

  </conbody>

</concept>