This directory contains Impala test workloads. The directory layout for a workload should follow this structure:

workloads/
   <data set name>/<data set name>_dimensions.csv  <- The test dimension file
   <data set name>/<data set name>_core.csv  <- A test vector file (core exploration strategy)
   <data set name>/<data set name>_pairwise.csv  <- A test vector file (pairwise exploration strategy)
   <data set name>/<data set name>_exhaustive.csv  <- A test vector file (exhaustive exploration strategy)
   <data set name>/queries/<query test>.test <- The queries for this workload
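
Each <query test>.test file groups one or more query cases. As a rough illustration, here is a minimal sketch of what such a file might contain; the table name and expected values are made up, and the "====" separators with QUERY/RESULTS/TYPES sections reflect the convention used by existing .test files, so treat the details below as an assumption rather than a specification:

   ====
   ---- QUERY
   -- One test case: run this query and compare its output to the RESULTS section.
   select count(*) from sample_table
   ---- RESULTS
   42
   ---- TYPES
   BIGINT
   ====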