<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="functions">

  <title>Overview of Impala Functions</title>
  <titlealts audience="PDF"><navtitle>Functions</navtitle></titlealts>
  <prolog>
    <metadata>
      <data name="Category" value="Impala"/>
      <data name="Category" value="Impala Functions"/>
      <data name="Category" value="SQL"/>
      <data name="Category" value="Data Analysts"/>
      <data name="Category" value="Developers"/>
    </metadata>
  </prolog>

  <conbody>

    <p>
      Functions let you apply arithmetic, string, or other computations and transformations to Impala data. You
      typically use them in <codeph>SELECT</codeph> lists and <codeph>WHERE</codeph> clauses to filter and format
      query results, so that the result set needs no further processing on the application side.
    </p>

    <p>
      Scalar functions return a single result for each input row. See <xref href="impala_functions.xml#builtins"/>.
    </p>

<codeblock>[localhost:21000] > select name, population from country where continent = 'North America' order by population desc limit 4;
[localhost:21000] > select upper(name), population from country where continent = 'North America' order by population desc limit 4;
+-------------+------------+
| upper(name) | population |
+-------------+------------+
| USA         | 320000000  |
| MEXICO      | 122000000  |
| CANADA      | 25000000   |
| GUATEMALA   | 16000000   |
+-------------+------------+
</codeblock>
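
    <p>
      Scalar functions can also appear in the <codeph>WHERE</codeph> clause. The following is a minimal sketch that
      reuses the <codeph>country</codeph> table from the previous example. It assumes, purely for illustration, that
      the <codeph>continent</codeph> values might be stored with inconsistent capitalization.
    </p>

<codeblock>-- Normalize the case of CONTINENT before comparing, so that rows match
-- regardless of how the value is capitalized (hypothetical scenario).
select upper(name), population from country
  where lower(continent) = 'north america'
  order by population desc limit 4;
</codeblock>
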
    <p>
      Aggregate functions combine values from multiple rows, producing
      either a single result for the entire table or a separate result for each group of rows.
      Aggregate functions are frequently used in combination with <codeph>GROUP BY</codeph>
      and <codeph>HAVING</codeph> clauses in the <codeph>SELECT</codeph> statement.
      See <xref href="impala_aggregate_functions.xml#aggregate_functions"/>.
    </p>

<codeblock>[localhost:21000] > select continent, <b>sum(population)</b> as howmany from country <b>group by continent</b> order by howmany desc;
+---------------+------------+
| continent     | howmany    |
+---------------+------------+
| Asia          | 4298723000 |
| Africa        | 1110635000 |
| Europe        | 742452000  |
| North America | 565265000  |
| South America | 406740000  |
| Oceania       | 38304000   |
+---------------+------------+
</codeblock>
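
    <p>
      A <codeph>HAVING</codeph> clause filters on the aggregated value after the groups are formed. The following
      sketch extends the previous query; the threshold of one billion is an arbitrary value chosen for illustration.
    </p>

<codeblock>-- Keep only the groups whose total population exceeds an arbitrary threshold.
select continent, sum(population) as howmany from country
  group by continent
  having sum(population) > 1000000000
  order by howmany desc;
</codeblock>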

    <p>
      User-defined functions (UDFs) let you code your own logic. They can be either scalar or aggregate functions.
      UDFs let you implement important business or scientific logic in high-performance code that Impala automatically parallelizes.
      You can also use UDFs to implement convenience functions that simplify reporting or porting SQL from other database systems.
      See <xref href="impala_udf.xml#udfs"/>.
    </p>

<codeblock>[localhost:21000] > select <b>rot13('Hello world!')</b> as 'Weak obfuscation';
+------------------+
| weak obfuscation |
+------------------+
| Uryyb jbeyq!     |
+------------------+
[localhost:21000] > select <b>likelihood_of_new_subatomic_particle(sensor1, sensor2, sensor3)</b> as probability
                  > from experimental_results group by experiment;
</codeblock>

    <p>
      Each function is associated with a specific database. For example, if you issue a <codeph>USE somedb</codeph>
      statement followed by <codeph>CREATE FUNCTION somefunc</codeph>, the new function is created in the
      <codeph>somedb</codeph> database, and you can refer to it through the fully qualified name
      <codeph>somedb.somefunc</codeph>. You can then issue another <codeph>USE</codeph> statement
      and create a function with the same name in a different database.
    </p>
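
    <p>
      The sequence described above might look like the following sketch. The database name
      <codeph>otherdb</codeph>, the HDFS path, the library file name, and the symbol name are hypothetical
      placeholders, and the argument and return types are assumed for illustration only.
    </p>

<codeblock>use somedb;
-- Hypothetical C++ UDF: the library location and symbol are placeholders.
create function somefunc(string) returns string
  location '/user/impala/udfs/libsomefunc.so' symbol='SomeFunc';

-- From any database, call the function through its fully qualified name.
select somedb.somefunc('abc');

-- A function with the same name can coexist in a different database.
use otherdb;
create function somefunc(string) returns string
  location '/user/impala/udfs/libsomefunc.so' symbol='SomeFunc';
</codeblock>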

    <p>
      Impala built-in functions are associated with a special database named <codeph>_impala_builtins</codeph>,
      which lets you refer to them from any database without qualifying the name.
    </p>

<codeblock>[localhost:21000] > show databases;
+-------------------------+
| name                    |
+-------------------------+
| <b>_impala_builtins</b>        |
| analytic_functions      |
| avro_testing            |
| data_file_size          |
...
[localhost:21000] > show functions in _impala_builtins like '*subs*';
+-------------+-----------------------------------+
| return type | signature                         |
+-------------+-----------------------------------+
| STRING      | substr(STRING, BIGINT)            |
| STRING      | substr(STRING, BIGINT, BIGINT)    |
| STRING      | substring(STRING, BIGINT)         |
| STRING      | substring(STRING, BIGINT, BIGINT) |
+-------------+-----------------------------------+
</codeblock>

    <p>
      <b>Related statements:</b> <xref href="impala_create_function.xml#create_function"/>,
      <xref href="impala_drop_function.xml#drop_function"/>
    </p>
  </conbody>
</concept>