String functions are classified as those primarily accepting or returning
The string functions operate mainly on these data types:
Function reference:
Impala supports the following string functions:
Return type:
Return type:
For general information about Base64 encoding, see
Return type:
For general information about Base64 encoding, see
Return type:
The following examples show the default
Syntax: BYTES (byte_expression)
Where:
byte_expression is the byte string for which the number of bytes is to be returned.
The BYTES function is similar to the LENGTH() function, except that it always returns the number of bytes, regardless of whether UTF-8 mode is turned ON or OFF.
The following is the list of supported string data types to be used in byte_expression:
The following example obtains the number of bytes from “cloudera” by applying the BYTES function to the column “cloudera”, which is type VARCHAR.
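A minimal sketch of such a query follows. It casts a literal to VARCHAR instead of reading a real column, so the table-free form and the `VARCHAR(16)` length are illustrative assumptions. The second statement assumes that the `UTF8_MODE` query option is available in the Impala release you run.

```sql
-- 'cloudera' is 8 ASCII characters, so it occupies 8 bytes.
SELECT BYTES(CAST('cloudera' AS VARCHAR(16)));

-- With UTF-8 mode on, LENGTH() counts characters while BYTES() still counts bytes.
-- Assumption: the UTF8_MODE query option exists in your release.
SET UTF8_MODE=true;
SELECT LENGTH('你好') AS num_chars, BYTES('你好') AS num_bytes;
```

In the second query the two columns should differ, because each of the two characters takes three bytes in UTF-8.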
Return type:
When applied to a
Return type:
Usage notes: Can be used as the inverse of the
Return type:
Return type:
Return type:
Return type:
By default, returns a single string covering the whole result set. To include other
columns or values in the result set, or to produce multiple concatenated strings for
subsets of rows, include a
Strictly speaking,
Return type:
Example:
Return type:
If the
The optional third and fourth arguments
let you find instances of the
Return type:
Usage notes:
If the two input strings are identical, the function returns 0.0.
If there is no matching character between the input strings, the function returns 1.0.
If either input string is
If either input string is longer than 255 characters, the function returns an error.
Return type:
Usage notes:
If the two input strings are identical, the function returns 1.0.
If there is no matching character between the input strings, the function returns 0.0.
If either input string is
If either input string is longer than 255 characters, the function returns an error.
Return type:
Usage notes:
If the two input strings are identical, the function returns 0.0.
If there is no matching character between the input strings, the function returns 1.0.
The function returns an error in the following cases:
If either input string is
The default
The prefix weight will only be applied if the Jaro-distance exceeds the optional
Use the Jaro or Jaro-Winkler functions to perform fuzzy matches on relatively short strings, for example, to match user-entered names against the records in the database.
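The following sketch illustrates that kind of fuzzy match. It assumes the functions are exposed as `jaro_winkler_similarity()` and `jaro_winkler_distance()`. The table `customers`, its `name` column, and the 0.9 cutoff are hypothetical and only serve the illustration.

```sql
-- Identical inputs: similarity is 1.0 and distance is 0.0, per the usage notes.
SELECT jaro_winkler_similarity('impala', 'impala') AS sim,
       jaro_winkler_distance('impala', 'impala') AS dist;

-- Hypothetical table customers(name STRING).
-- 0.9 is an arbitrary cutoff; tune it against your own data.
SELECT name
FROM customers
WHERE jaro_winkler_similarity(lower(name), lower('Jon Smith')) >= 0.9;
```

Lowercasing both sides is a design choice for this sketch, so that differences in letter case do not reduce the score.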
Return type:
Usage notes:
If the two input strings are identical, the function returns 1.0.
If there is no matching character between the input strings, the function returns 0.0.
The function returns an error in the following cases:
If either input string is
The default
The prefix weight will only be applied if the Jaro-similarity exceeds the optional
Return type:
When applied to a
Return type:
If the input strings are equal, the function returns 0.
If either input exceeds 255 characters, the function returns an error.
If either input string is
If the length of one input string is zero, the function returns the length of the other string.
Example:
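A minimal sketch of the rules above, assuming the function is exposed as `levenshtein()`:

```sql
-- Equal inputs: returns 0.
SELECT levenshtein('impala', 'impala');

-- One empty input: returns the length of the other string, here 6.
SELECT levenshtein('', 'impala');

-- Two substitutions (k->s, e->i) and one insertion (g): returns 3.
SELECT levenshtein('kitten', 'sitting');
```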
Return type:
Return type:
Return type:
Return type:
Return type:
Usage notes: This function is important for the traditional Hadoop use case
of interpreting web logs. For example, if the web traffic data features raw URLs not
divided into separate table columns, you can count visitors to a particular page by
extracting the
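A sketch of this web-log use case, assuming the function is exposed as `parse_url()` and accepts part names such as `'PATH'`, `'HOST'`, and `'QUERY'`. The table `web_logs` and its `url` column are hypothetical.

```sql
-- Hypothetical table web_logs(url STRING): count visits to one page.
SELECT count(*) AS visits
FROM web_logs
WHERE parse_url(url, 'PATH') = '/docs/index.html';

-- Extract the host: example.com
SELECT parse_url('http://example.com/docs/index.html?user=42', 'HOST');

-- Extract the value of a single query-string key: 42
SELECT parse_url('http://example.com/docs/index.html?user=42', 'QUERY', 'user');
```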
Return type:
Examples:
Return type:
This example shows escaping one of the special characters in RE2.
This example shows escaping all the special characters in RE2.
Return type:
This example shows how group 0 matches the full pattern string, including the
portion outside any
This example shows how group 1 matches just the contents inside the first
Unlike in earlier Impala releases, the regular expression library used in Impala 2.0
and later supports the
The flags that you can include in the optional third argument are:
Return type:
This example shows how
Return type:
These examples show how you can replace parts of a string matching a pattern with
replacement text, which can include backreferences to any
Replace a character pattern with new text:
Replace a character pattern with substitution text that includes the original matching text:
Remove all characters that are not digits:
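The three cases above can be sketched as follows, assuming the function is exposed as `regexp_replace()`. The input strings are illustrative.

```sql
-- Replace a character pattern with new text: aaaxyzaaa
SELECT regexp_replace('aaabbbaaa', 'b+', 'xyz');

-- Wrap the original matching text by using a backreference to group 1: aaa<bbb>aaa
-- The backslash is doubled so that it survives the SQL string literal.
SELECT regexp_replace('aaabbbaaa', '(b+)', '<\\1>');

-- Remove all characters that are not digits: 123456789
SELECT regexp_replace('123-456-789', '[^[:digit:]]', '');
```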
Return type:
Return type:
Because this function does not use any regular expression patterns, it is typically
faster than
If any argument is
Matching is case-sensitive.
If the replacement string contains another instance of the target string, the expansion is only performed once, instead of applying again to the newly constructed string.
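A short sketch of these rules, assuming the function is exposed as `replace()` with the arguments in the order (initial string, target, replacement):

```sql
-- Plain substring substitution: hello impala
SELECT replace('hello world', 'world', 'impala');

-- Matching is case-sensitive, so lowercase 'h' does not match: Hello
SELECT replace('Hello', 'h', 'J');

-- Each original 'a' is expanded once; the new text is not rescanned: aaaaaa
SELECT replace('aaa', 'a', 'aa');
```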
Return type:
Return type:
Return type:
Return type:
The
All matching of the delimiter is done exactly, not using any regular expression patterns.
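A brief sketch of the exact-match behavior, assuming the function is exposed as `split_part()` with a 1-based index as its third argument:

```sql
-- Second field of a comma-separated string: y
SELECT split_part('x,y,z', ',', 2);

-- A multi-character delimiter is matched literally: two
SELECT split_part('one***two***three', '***', 2);

-- '.' is treated as a literal period, not as a regular expression wildcard: c
SELECT split_part('a.b.c', '.', 3);
```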
Return type:
Return type:
Return type:
Return type:
For example:
Return type:
Usage notes:
If
For example:
If
If
If
For example:
Usage notes: Often used in data cleansing operations during the ETL
cycle, when input values might still have surrounding spaces. For a more
general-purpose function that can remove other leading and trailing characters
besides spaces, see
TRIM-FROM syntax is a SQL-standardized wrapper around
Syntax #1:
Syntax #2:
Syntax #3:
Return type:
Return type: