IMPALA-14092 Part2: Support querying of paimon data table via JNI
This patch mainly implements querying of Paimon data tables
through a JNI-based scanner.
Features implemented:
- Column pruning.
Partition pruning and predicate pushdown will be submitted
as the third part of the patch.
We implemented this by treating the Paimon table as a normal
unpartitioned table. When querying a Paimon table:
- PaimonScanNode decides which Paimon splits need to be scanned,
and then transfers the splits to the BE to do the JNI-based
scan operation.
- We also collect the required columns that need to be scanned,
and pass them to the scanner for column pruning. This is
implemented by passing the field ids of the columns to the BE
instead of column positions, in order to support schema
evolution (a distilled sketch follows this list).
- In the original implementation, PaimonJniScanner directly
passed the Paimon row object to the BE and called the
corresponding Paimon row field accessors, which are Java
methods that convert row fields to Impala row batch tuples.
We found this to be slow due to the overhead of JVM method
calls.
To minimize the overhead, we refashioned the implementation:
PaimonJniScanner now converts the Paimon row batches to an
Arrow record batch, which stores its data in the off-heap
region of the Impala JVM, and passes the memory pointer of
the off-heap Arrow record batch to the BE.
The BE PaimonJniScanNode reads the data directly from the JVM
off-heap region and converts the Arrow record batch to an
Impala row batch.
Benchmarks show the latter implementation is roughly 2x
faster than the original one.
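Two illustrative sketches follow. First, the field-id projection
described above (a hypothetical helper; the real logic is
PaimonScanNode.collectProjectionId() in the diff below). Field ids,
unlike positions, stay stable across schema evolution, so the BE
scanner can resolve columns by id:

  import java.util.List;

  // PaimonColumn is the class added by this patch; the helper itself is a
  // hypothetical distillation of the projection step.
  class ProjectionSketch {
    static int[] projectFieldIds(List<PaimonColumn> requiredColumns) {
      int[] projection = new int[requiredColumns.size()];
      for (int i = 0; i < requiredColumns.size(); i++) {
        projection[i] = requiredColumns.get(i).getFieldId();
      }
      return projection;
    }
  }

Second, a minimal sketch of the FE-side off-heap hand-off using
Arrow's Java C Data interface (class and method names here are
illustrative, not the actual PaimonJniScanner code): the record batch
is exported into ArrowArray/ArrowSchema C structs, and only their
memory addresses cross the JNI boundary.

  import org.apache.arrow.c.ArrowArray;
  import org.apache.arrow.c.ArrowSchema;
  import org.apache.arrow.c.Data;
  import org.apache.arrow.memory.RootAllocator;
  import org.apache.arrow.vector.VectorSchemaRoot;

  // Hypothetical exporter: the BE imports the batch zero-copy from the
  // returned addresses instead of calling back into JVM accessor methods.
  class OffHeapBatchExporter {
    private final RootAllocator allocator = new RootAllocator(Long.MAX_VALUE);

    /** Returns {arrayAddress, schemaAddress} for the BE to import. */
    long[] export(VectorSchemaRoot batch) {
      ArrowArray array = ArrowArray.allocateNew(allocator);
      ArrowSchema schema = ArrowSchema.allocateNew(allocator);
      // Transfers ownership of the batch's buffers into the C structs; the
      // importer becomes responsible for invoking their release callbacks.
      Data.exportVectorSchemaRoot(allocator, batch, /*provider=*/null, array, schema);
      return new long[] {array.memoryAddress(), schema.memoryAddress()};
    }
  }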
The lifecycle of the Arrow record batch is as follows:
the batch is generated in the FE and passed to the BE.
After the record batch is imported to the BE successfully,
the BE is in charge of freeing it.
There are two free paths: the normal path and the
exception path. On the normal path, once the Arrow batch
is fully consumed by the BE, the BE calls into the JNI to
fetch the next Arrow batch; in this case, the previous batch
is freed automatically. The exception path is taken when the
query is cancelled or a memory allocation fails; in these
corner cases, the Arrow batch is freed in the close() method
if it has not been fully consumed by the BE. Both paths are
sketched below.
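A sketch of that ownership contract as seen from the Java side (all
names here are hypothetical; the real code paths live in
PaimonJniScanner and the BE scan node):

  /** Hypothetical wrapper owning one exported off-heap Arrow batch. */
  interface OffHeapBatch {
    long memoryAddress();
    void release();
  }

  // Normal path: fetching the next batch frees the previous one.
  // Exception path (cancellation, allocation failure): close() frees a
  // batch that was never fully consumed.
  class JniBatchSource implements AutoCloseable {
    private OffHeapBatch current;

    long nextBatchAddress() {
      if (current != null) current.release();  // previous batch freed here
      current = produceNextBatch();
      return current == null ? 0 : current.memoryAddress();
    }

    @Override
    public void close() {
      if (current != null) {
        current.release();  // exception-path cleanup
        current = null;
      }
    }

    private OffHeapBatch produceNextBatch() { return null; /* stub */ }
  }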
Currently supported Impala data types for queries include:
- BOOLEAN
- TINYINT
- SMALLINT
- INTEGER
- BIGINT
- FLOAT
- DOUBLE
- STRING
- DECIMAL(P,S)
- TIMESTAMP
- CHAR(N)
- VARCHAR(N)
- BINARY
- DATE
TODO:
- Patches pending submission:
  - Support TPC-DS/TPC-H data loading for Paimon data tables.
  - Virtual column query support for querying Paimon data
    tables.
  - Query support with time travel.
  - Query support for Paimon meta tables.
- WIP:
  - Snapshot incremental read.
  - Complex type query support.
  - Native Paimon table scanner, instead of the JNI-based one.
Testing:
- Created test tables in functional_schema_template.sql.
- Added TestPaimonScannerWithLimit in test_scanners.py.
- Added test_paimon_query in test_paimon.py.
- Already passed the TPC-DS/TPC-H tests for Paimon tables. The
  test table data is currently generated by Spark, because data
  loading for Paimon tables is not supported by Impala yet and
  Hive doesn't support generating Paimon tables with dynamic
  partitioning. We plan to submit a separate patch for
  TPC-DS/TPC-H data loading and the associated TPC-DS/TPC-H
  query tests.
- JVM off-heap memory leak tests: ran looped TPC-H tests for
  1 day; no obvious off-heap memory increase was observed, and
  off-heap memory usage stayed within 10 MB.
Change-Id: Ie679a89a8cc21d52b583422336b9f747bdf37384
Reviewed-on: http://gerrit.cloudera.org:8080/23613
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
fe/pom.xml (119 lines changed)
@@ -558,7 +558,126 @@ under the License.
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.paimon</groupId>
      <artifactId>paimon-arrow</artifactId>
      <version>${paimon.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.arrow</groupId>
      <artifactId>arrow-vector</artifactId>
      <version>${arrow.version}</version>
      <exclusions>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-1.2-api</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-slf4j-impl</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant-launcher</artifactId>
        </exclusion>
        <exclusion>
          <artifactId>flatbuffers-java</artifactId>
          <groupId>com.google.flatbuffers</groupId>
        </exclusion>
      </exclusions>
    </dependency>

    <dependency>
      <groupId>org.apache.arrow</groupId>
      <artifactId>arrow-c-data</artifactId>
      <version>${arrow.version}</version>
      <exclusions>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-1.2-api</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-slf4j-impl</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant-launcher</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <dependency>
      <groupId>org.apache.arrow</groupId>
      <artifactId>arrow-memory-core</artifactId>
      <version>${arrow.version}</version>
      <exclusions>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-1.2-api</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-slf4j-impl</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant-launcher</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <dependency>
      <groupId>org.apache.arrow</groupId>
      <artifactId>arrow-memory-unsafe</artifactId>
      <version>${arrow.version}</version>
      <exclusions>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-1.2-api</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.logging.log4j</groupId>
          <artifactId>log4j-slf4j-impl</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.ant</groupId>
          <artifactId>ant-launcher</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>

  <reporting>
@@ -22,6 +22,8 @@ import java.util.List;

import org.apache.hadoop.hive.metastore.api.ColumnStatisticsData;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.impala.catalog.paimon.PaimonColumn;
import org.apache.impala.catalog.paimon.PaimonStructField;
import org.apache.impala.common.ImpalaRuntimeException;
import org.apache.impala.thrift.TColumn;
import org.apache.impala.thrift.TColumnDescriptor;
@@ -102,6 +104,10 @@ public class Column {
      col = new IcebergColumn(columnDesc.getColumnName(), type, comment, position,
          columnDesc.getIceberg_field_id(), columnDesc.getIceberg_field_map_key_id(),
          columnDesc.getIceberg_field_map_value_id(), columnDesc.isIs_nullable());
    } else if (columnDesc.isIs_paimon_column()) {
      Preconditions.checkState(columnDesc.isSetIceberg_field_id());
      col = new PaimonColumn(columnDesc.getColumnName(), type, comment, position,
          columnDesc.getIceberg_field_id(), columnDesc.isIs_nullable());
    } else if (columnDesc.isIs_hbase_column()) {
      // HBase table column. The HBase column qualifier (column name) is not set for
      // the HBase row key, so it being set in the thrift struct is not a precondition.
@@ -159,6 +165,10 @@ public class Column {
        IcebergColumn iCol = (IcebergColumn) col;
        fields.add(new IcebergStructField(iCol.getName(), iCol.getType(),
            iCol.getComment(), iCol.getFieldId()));
      } else if (col instanceof PaimonColumn) {
        PaimonColumn pCol = (PaimonColumn) col;
        fields.add(new PaimonStructField(pCol.getName(), pCol.getType(),
            pCol.getComment(), pCol.getFieldId(), pCol.isNullable()));
      } else {
        fields.add(new StructField(col.getName(), col.getType(), col.getComment()));
      }

@@ -40,6 +40,8 @@ import org.apache.hadoop.hive.metastore.api.PrincipalType;
import org.apache.impala.analysis.TableName;
import org.apache.impala.catalog.events.InFlightEvents;
import org.apache.impala.catalog.monitor.CatalogMonitor;
import org.apache.impala.catalog.paimon.PaimonColumn;
import org.apache.impala.catalog.paimon.PaimonStructField;
import org.apache.impala.catalog.paimon.PaimonTable;
import org.apache.impala.catalog.paimon.PaimonUtil;
import org.apache.impala.common.ImpalaRuntimeException;
@@ -667,6 +669,10 @@ public abstract class Table extends CatalogObjectImpl implements FeTable {
      IcebergColumn iCol = (IcebergColumn) col;
      return new IcebergStructField(iCol.getName(), iCol.getType(),
          iCol.getComment(), iCol.getFieldId());
    } else if (col instanceof PaimonColumn) {
      PaimonColumn pCol = (PaimonColumn) col;
      return new PaimonStructField(pCol.getName(), pCol.getType(), pCol.getComment(),
          pCol.getFieldId(), pCol.isNullable());
    } else {
      return new StructField(col.getName(), col.getType(), col.getComment());
    }

@@ -25,6 +25,7 @@ import org.apache.impala.analysis.CreateTableStmt;
import org.apache.impala.analysis.Parser;
import org.apache.impala.analysis.StatementBase;
import org.apache.impala.analysis.TypeDef;
import org.apache.impala.catalog.paimon.PaimonStructField;
import org.apache.impala.common.AnalysisException;
import org.apache.impala.common.Pair;
import org.apache.impala.thrift.TColumnType;
@@ -527,9 +528,16 @@ public abstract class Type {
      Pair<Type, Integer> res = fromThrift(col, nodeIdx);
      nodeIdx = res.second.intValue();
      if (thriftField.isSetField_id()) {
        // We create 'IcebergStructField' for Iceberg tables which have field id.
        structFields.add(new IcebergStructField(name, res.first, comment,
            thriftField.getField_id()));
        if (!thriftField.isSetIs_nullable()) {
          // We create 'IcebergStructField' for Iceberg tables which have field id.
          // if nullable is not set.
          structFields.add(new IcebergStructField(
              name, res.first, comment, thriftField.getField_id()));
        } else {
          // nullable is set, it is a PaimonStructField
          structFields.add(new PaimonStructField(name, res.first, comment,
              thriftField.getField_id(), thriftField.isIs_nullable()));
        }
      } else {
        structFields.add(new StructField(name, res.first, comment));
      }

@@ -19,7 +19,7 @@ package org.apache.impala.catalog.local;

import com.google.common.base.Preconditions;

import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.impala.catalog.Column;
import org.apache.impala.catalog.TableLoadingException;
import org.apache.impala.catalog.paimon.FePaimonTable;
import org.apache.impala.catalog.paimon.PaimonUtil;
@@ -29,6 +29,7 @@ import org.apache.log4j.Logger;
import org.apache.paimon.table.Table;

import java.io.IOException;
import java.util.List;
import java.util.Set;

/**
@@ -45,18 +46,28 @@ public class LocalPaimonTable extends LocalTable implements FePaimonTable {
    Preconditions.checkNotNull(msTbl);
    Preconditions.checkNotNull(ref);
    try {
      LocalPaimonTable localPaimonTable = new LocalPaimonTable(db, msTbl, ref);
      Table table = PaimonUtil.createFileStoreTable(msTbl);
      List<Column> paimonColumns = PaimonUtil.toImpalaColumn(table);
      ColumnMap colMap = new ColumnMap(paimonColumns,
          /*numClusteringCols=*/table.partitionKeys().size(),
          db.getName() + "." + msTbl.getTableName(),
          /*isFullAcidSchema=*/false);
      LocalPaimonTable localPaimonTable =
          new LocalPaimonTable(db, msTbl, ref, colMap, table);
      return localPaimonTable;
    } catch (MetaException ex) {
    } catch (Exception ex) {
      throw new TableLoadingException("Failed to load table" + msTbl.getTableName(), ex);
    }
  }

  protected LocalPaimonTable(LocalDb db, org.apache.hadoop.hive.metastore.api.Table msTbl,
      MetaProvider.TableMetaRef ref) throws MetaException {
    super(db, msTbl, ref);
    table_ = PaimonUtil.createFileStoreTable(msTbl);
      MetaProvider.TableMetaRef ref, ColumnMap columnMap, Table table) {
    super(db, msTbl, ref, columnMap);
    table_ = table;
    /// TODO: add virtual column later if it is supported.
    /// addVirtualColumns(ref.getVirtualColumns());
    applyPaimonTableStatsIfPresent();
    applyPaimonColumnStatsIfPresent();
  }

  @Override

@@ -49,6 +49,8 @@ import org.apache.impala.catalog.SystemTable;
import org.apache.impala.catalog.TableLoadingException;
import org.apache.impala.catalog.VirtualColumn;
import org.apache.impala.catalog.local.MetaProvider.TableMetaRef;
import org.apache.impala.catalog.paimon.PaimonColumn;
import org.apache.impala.catalog.paimon.PaimonStructField;
import org.apache.impala.catalog.paimon.PaimonUtil;
import org.apache.impala.common.Pair;
import org.apache.impala.common.RuntimeEnv;
@@ -451,6 +453,10 @@ abstract class LocalTable implements FeTable {
        IcebergColumn iCol = (IcebergColumn) col;
        fields.add(new IcebergStructField(iCol.getName(), iCol.getType(),
            iCol.getComment(), iCol.getFieldId()));
      } else if (col instanceof PaimonColumn) {
        PaimonColumn pCol = (PaimonColumn) col;
        fields.add(new PaimonStructField(pCol.getName(), pCol.getType(),
            pCol.getComment(), pCol.getFieldId(), pCol.isNullable()));
      } else {
        fields.add(new StructField(col.getName(), col.getType(), col.getComment()));
      }

@@ -0,0 +1,66 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package org.apache.impala.catalog.paimon;

import org.apache.impala.catalog.Column;
import org.apache.impala.catalog.Type;
import org.apache.impala.thrift.TColumn;
import org.apache.impala.thrift.TColumnDescriptor;

/**
 * Represents a Paimon column.
 *
 * This class extends Column with the Paimon-specific field id. Field ids are used in
 * schema evolution to uniquely identify columns.
 */
public class PaimonColumn extends Column {
  private final int fieldId_;
  // False for required Paimon field, true for optional Paimon field
  private final boolean isNullable_;

  public PaimonColumn(String name, Type type, String comment, int position, int fieldId,
      boolean isNullable) {
    super(name.toLowerCase(), type, comment, position);
    fieldId_ = fieldId;
    isNullable_ = isNullable;
  }

  public PaimonColumn(String name, Type type, String comment, int position, int fieldId) {
    this(name, type, comment, position, fieldId, true);
  }

  public int getFieldId() { return fieldId_; }

  public boolean isNullable() { return isNullable_; }

  @Override
  public TColumn toThrift() {
    TColumn tcol = super.toThrift();
    tcol.setIs_paimon_column(true);
    tcol.setIceberg_field_id(fieldId_);
    tcol.setIs_nullable(isNullable_);
    return tcol;
  }

  @Override
  public TColumnDescriptor toDescriptor() {
    TColumnDescriptor desc = super.toDescriptor();
    desc.setIcebergFieldId(fieldId_);
    return desc;
  }
}
@@ -63,11 +63,11 @@ import org.apache.paimon.types.VariantType;
 * Utils for paimon and hive Type conversions, the class is from
 * org.apache.paimon.hive.HiveTypeUtils, refactor to fix the
 * following incompatible conversion issue:
 * paimon type LocalZonedTimestampType will convert to
 * org.apache.hadoop.hive.serde2.typeinfo.TimestampLocalTZTypeInfo
 * paimon type ${@link LocalZonedTimestampType} will convert to
 * ${@link org.apache.hadoop.hive.serde2.typeinfo.TimestampLocalTZTypeInfo}
 * however, it is not supported in impala, TableLoadingException
 * will raise while loading the table in method:
 * apache.impala.catalog.FeCatalogUtils#parseColumnType
 * ${@link org.apache.impala.catalog.FeCatalogUtils#parseColumnType}
 * To fix the issue LocalZonedTimestampType will be converted to
 * hive timestamp type.
 */
@@ -206,31 +206,26 @@ public class PaimonHiveTypeUtils {
  }

  static DataType visit(TypeInfo type, HiveToPaimonTypeVisitor visitor) {
    if (!(type instanceof StructTypeInfo)) {
      if (type instanceof MapTypeInfo) {
        MapTypeInfo mapTypeInfo = (MapTypeInfo)type;
        return DataTypes.MAP(visit(mapTypeInfo.getMapKeyTypeInfo(), visitor),
            visit(mapTypeInfo.getMapValueTypeInfo(), visitor));
      } else if (type instanceof ListTypeInfo) {
        ListTypeInfo listTypeInfo = (ListTypeInfo)type;
        return DataTypes.ARRAY(
            visit(listTypeInfo.getListElementTypeInfo(), visitor));
      } else {
        return visitor.atomic(type);
      }
    } else {
    if (type instanceof StructTypeInfo) {
      StructTypeInfo structTypeInfo = (StructTypeInfo)type;
      ArrayList<String> fieldNames = structTypeInfo.getAllStructFieldNames();
      ArrayList<TypeInfo> typeInfos = structTypeInfo
          .getAllStructFieldTypeInfos();
      RowType.Builder builder = RowType.builder();

      for(int i = 0; i < fieldNames.size(); ++i) {
        builder.field((String)fieldNames.get(i),
            visit((TypeInfo)typeInfos.get(i), visitor));
      }

      return builder.build();
    } else if (type instanceof MapTypeInfo) {
      MapTypeInfo mapTypeInfo = (MapTypeInfo) type;
      return DataTypes.MAP(visit(mapTypeInfo.getMapKeyTypeInfo(), visitor),
          visit(mapTypeInfo.getMapValueTypeInfo(), visitor));
    } else if (type instanceof ListTypeInfo) {
      ListTypeInfo listTypeInfo = (ListTypeInfo) type;
      return DataTypes.ARRAY(visit(listTypeInfo.getListElementTypeInfo(), visitor));
    } else {
      return visitor.atomic(type);
    }
  }

@@ -180,8 +180,9 @@ public class PaimonImpalaTypeUtils {
        rowType.getFields()
            .stream()
            .map(dataField
                -> new StructField(
                    dataField.name().toLowerCase(), dataField.type().accept(this)))
                -> new PaimonStructField(dataField.name().toLowerCase(),
                    dataField.type().accept(this), dataField.description(),
                    dataField.id(), dataField.type().isNullable()))
            .collect(Collectors.toList());

    return new StructType(structFields);
@@ -254,12 +255,12 @@ public class PaimonImpalaTypeUtils {
  public static boolean isSupportedPrimitiveType(PrimitiveType primitiveType) {
    Preconditions.checkNotNull(primitiveType);
    switch (primitiveType) {
      case DOUBLE:
      case FLOAT:
      case BIGINT:
      case INT:
      case SMALLINT:
      case TINYINT:
      case DOUBLE:
      case FLOAT:
      case BOOLEAN:
      case STRING:
      case TIMESTAMP:
@@ -267,7 +268,6 @@
      case DATE:
      case BINARY:
      case CHAR:
      case DATETIME:
      case VARCHAR: return true;
      default: return false;
    }

@@ -0,0 +1,87 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package org.apache.impala.catalog.paimon;

import java.util.Objects;

import org.apache.impala.catalog.StructField;
import org.apache.impala.catalog.Type;
import org.apache.impala.thrift.TColumnType;
import org.apache.impala.thrift.TStructField;
import org.apache.impala.thrift.TTypeNode;

/**
 * Represents a Paimon StructField.
 *
 * This class extends StructField with the Paimon-specific field id.
 * Paimon uses field ids for schema evolution and compatibility, similar to Iceberg.
 * We keep the field id in this class, so we can use it to resolve columns on the
 * backend.
 */
public class PaimonStructField extends StructField {
  private final int fieldId_;
  // False for required Paimon field, true for optional Paimon field
  private final boolean isNullable_;

  public PaimonStructField(String name, Type type, String comment, int fieldId) {
    this(name, type, comment, fieldId, true);
  }

  public PaimonStructField(
      String name, Type type, String comment, int fieldId, boolean isNullable) {
    super(name, type, comment);
    fieldId_ = fieldId;
    isNullable_ = isNullable;
  }

  public int getFieldId() { return fieldId_; }

  public boolean isNullable() { return isNullable_; }

  @Override
  public void toThrift(TColumnType container, TTypeNode node) {
    TStructField field = new TStructField();
    field.setName(name_);
    if (comment_ != null) field.setComment(comment_);
    field.setField_id(fieldId_);
    // Paimon-specific metadata - nullable and key properties could be added to
    // extended metadata if the Thrift definition supports it
    field.setIs_nullable(isNullable_);
    node.struct_fields.add(field);
    type_.toThrift(container);
  }

  @Override
  public boolean equals(Object other) {
    if (!(other instanceof PaimonStructField)) return false;
    PaimonStructField otherStructField = (PaimonStructField) other;
    return otherStructField.name_.equals(name_) && otherStructField.type_.equals(type_)
        && otherStructField.fieldId_ == fieldId_
        && otherStructField.isNullable_ == isNullable_;
  }

  @Override
  public int hashCode() {
    return Objects.hash(name_, type_, fieldId_, isNullable_);
  }

  @Override
  public String toString() {
    return String.format("PaimonStructField{name=%s, type=%s, fieldId=%d, nullable=%s}",
        name_, type_, fieldId_, isNullable_);
  }
}
@@ -28,6 +28,7 @@ import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.impala.catalog.Column;
import org.apache.impala.catalog.Db;
import org.apache.impala.catalog.StructType;
import org.apache.impala.catalog.Table;
import org.apache.impala.catalog.TableLoadingException;
import org.apache.impala.catalog.VirtualColumn;
@@ -135,7 +136,8 @@ public class PaimonTable extends Table implements FePaimonTable {
  public void loadSchemaFromPaimon()
      throws TableLoadingException, ImpalaRuntimeException {
    loadSchema();
    addVirtualColumns();
    // TODO: add virtual column later if it is supported.
    // addVirtualColumns();
  }

  /**
@@ -213,6 +215,17 @@
    addVirtualColumn(VirtualColumn.BUCKET_ID);
  }

  @Override
  public void addColumn(Column col) {
    Preconditions.checkState(col instanceof PaimonColumn);
    PaimonColumn pCol = (PaimonColumn) col;
    colsByPos_.add(pCol);
    colsByName_.put(pCol.getName().toLowerCase(), col);
    ((StructType) type_.getItemType())
        .addField(new PaimonStructField(col.getName(), col.getType(), col.getComment(),
            pCol.getFieldId(), pCol.isNullable()));
  }

  /**
   * Loads the metadata of a Paimon table.
   * <p>

@@ -104,7 +104,6 @@ import org.apache.paimon.types.SmallIntType;
import org.apache.paimon.types.TinyIntType;
import org.apache.paimon.utils.InternalRowPartitionComputer;
import org.apache.thrift.TException;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -200,6 +199,17 @@ public class PaimonUtil {
    return ret;
  }

  /**
   * Function to close an AutoCloseable object quietly.
   */
  public static void closeQuitely(AutoCloseable closeable) {
    if (closeable != null) {
      try {
        closeable.close();
      } catch (Exception e) { LOG.warn("Error closing " + closeable, e); }
    }
  }

  /**
   * Converts Paimon schema to an Impala schema.
   */
@@ -209,7 +219,8 @@ public class PaimonUtil {
    int pos = 0;
    for (DataField dataField : schema.getFields()) {
      Type colType = PaimonImpalaTypeUtils.toImpalaType(dataField.type());
      ret.add(new Column(dataField.name().toLowerCase(), colType, pos++));
      ret.add(new PaimonColumn(dataField.name().toLowerCase(), colType,
          dataField.description(), pos++, dataField.id(), dataField.type().isNullable()));
    }
    return ret;
  }
@@ -861,4 +872,43 @@ public class PaimonUtil {
    }
    return result;
  }

  /**
   * Converts a Paimon API table schema to Impala columns.
   */
  public static List<Column> toImpalaColumn(Table table) throws ImpalaRuntimeException {
    RowType rowType = table.rowType();
    List<DataField> dataFields = rowType.getFields();
    List<String> partitionKeys = table.partitionKeys()
        .stream()
        .map(String::toLowerCase)
        .collect(Collectors.toList());
    List<Column> impalaFields = convertToImpalaSchema(rowType);
    List<Column> impalaNonPartitionedFields = Lists.newArrayList();
    List<Column> impalaPartitionedFields = Lists.newArrayList();
    List<Column> columns = Lists.newArrayList();
    // lookup the clustering columns
    for (String name : partitionKeys) {
      int colIndex = PaimonUtil.getFieldIndexByNameIgnoreCase(rowType, name);
      Preconditions.checkArgument(colIndex >= 0);
      impalaPartitionedFields.add(impalaFields.get(colIndex));
    }
    // put non-clustering columns in natural order
    for (int i = 0; i < dataFields.size(); i++) {
      if (!partitionKeys.contains(dataFields.get(i).name().toLowerCase())) {
        impalaNonPartitionedFields.add(impalaFields.get(i));
      }
    }

    int colPos = 0;
    for (Column col : impalaPartitionedFields) {
      col.setPosition(colPos++);
      columns.add(col);
    }
    for (Column col : impalaNonPartitionedFields) {
      col.setPosition(colPos++);
      columns.add(col);
    }
    return columns;
  }
}

fe/src/main/java/org/apache/impala/planner/PaimonScanNode.java (new file, 303 lines)
@@ -0,0 +1,303 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package org.apache.impala.planner;

import com.google.common.base.MoreObjects;
import com.google.common.base.Preconditions;
import com.google.common.collect.Iterables;
import com.google.common.collect.Lists;
import com.google.common.collect.Sets;
import org.apache.commons.lang3.SerializationUtils;
import org.apache.impala.analysis.Analyzer;
import org.apache.impala.analysis.Expr;
import org.apache.impala.analysis.MultiAggregateInfo;
import org.apache.impala.analysis.SlotDescriptor;
import org.apache.impala.analysis.TupleDescriptor;
import org.apache.impala.catalog.paimon.FePaimonTable;
import org.apache.impala.catalog.paimon.PaimonColumn;
import org.apache.impala.common.AnalysisException;
import org.apache.impala.common.ImpalaException;
import org.apache.impala.common.ImpalaRuntimeException;
import org.apache.impala.common.ThriftSerializationCtx;
import org.apache.impala.planner.paimon.PaimonSplit;
import org.apache.impala.thrift.TExplainLevel;
import org.apache.impala.thrift.TNetworkAddress;
import org.apache.impala.thrift.TPaimonScanNode;
import org.apache.impala.thrift.TPlanNode;
import org.apache.impala.thrift.TPlanNodeType;
import org.apache.impala.thrift.TQueryOptions;
import org.apache.impala.thrift.TScanRange;
import org.apache.impala.thrift.TScanRangeLocation;
import org.apache.impala.thrift.TScanRangeLocationList;
import org.apache.impala.thrift.TScanRangeSpec;
import org.apache.impala.util.ExecutorMembershipSnapshot;
import org.apache.paimon.table.Table;
import org.apache.paimon.table.source.DataSplit;
import org.apache.paimon.table.source.ReadBuilder;
import org.apache.paimon.table.source.Split;
import org.apache.paimon.types.DataField;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Jni-based scan node of a single paimon table.
 */
public class PaimonScanNode extends ScanNode {
  private final static Logger LOG = LoggerFactory.getLogger(PaimonScanNode.class);
  private final static long PAIMON_ROW_AVG_SIZE_OVERHEAD = 4L;
  // FeTable object
  private final FePaimonTable table_;

  // paimon table object
  private final Table paimonApiTable_;

  // Indexes for the set of hosts that will be used for the query.
  // From analyzer.getHostIndex().getIndex(address)
  private final Set<Integer> hostIndexSet_ = new HashSet<>();
  // Array of top level field ids used for top level column scan pruning.
  private int[] projection_;
  // top level field id map
  private Set<Integer> fieldIdMap_ = Sets.newHashSet();
  // Splits generated for paimon scanning during planning stage.
  private List<Split> splits_;

  public PaimonScanNode(PlanNodeId id, TupleDescriptor desc, List<Expr> conjuncts,
      MultiAggregateInfo aggInfo, FePaimonTable table) {
    super(id, desc, "SCAN PAIMON");
    conjuncts_ = conjuncts;
    aggInfo_ = aggInfo;
    table_ = table;
    paimonApiTable_ = table.getPaimonApiTable();
  }

  @Override
  public void init(Analyzer analyzer) throws ImpalaException {
    super.init(analyzer);
    conjuncts_ = orderConjunctsByCost(conjuncts_);
    // TODO: implement predicate push down later here.

    // materialize slots in remaining conjuncts_
    analyzer.materializeSlots(conjuncts_);
    collectProjectionId();
    computeMemLayout(analyzer);
    computeScanRangeLocations(analyzer);
    computePaimonStats(analyzer);
  }

  public void computePaimonStats(Analyzer analyzer) {
    computeNumNodes(analyzer);
    // Update the cardinality, hint value will be used when table has no stats.
    inputCardinality_ = cardinality_ = estimateTableRowCount();
    cardinality_ = applyConjunctsSelectivity(cardinality_);
    cardinality_ = capCardinalityAtLimit(cardinality_);
    avgRowSize_ = estimateAvgRowSize();
    if (LOG.isTraceEnabled()) {
      LOG.trace("computeStats paimonScan: cardinality=" + Long.toString(cardinality_));
    }
  }

  /**
   * Collect and analyze top-level columns.
   */
  public void collectProjectionId() throws AnalysisException {
    projection_ = new int[desc_.getSlots().size()];
    for (int i = 0; i < desc_.getSlots().size(); i++) {
      SlotDescriptor sd = desc_.getSlots().get(i);
      if (sd.isVirtualColumn()) {
        throw new AnalysisException("Paimon Scanner doesn't support virtual columns.");
      }
      if (sd.getPath().getRawPath() != null && sd.getPath().getRawPath().size() > 1) {
        throw new AnalysisException("Paimon Scanner doesn't support nested columns.");
      }
      PaimonColumn paimonColumn = (PaimonColumn) desc_.getSlots().get(i).getColumn();
      projection_[i] = paimonColumn.getFieldId();
      fieldIdMap_.add(paimonColumn.getFieldId());
    }
    Preconditions.checkArgument(projection_.length == desc_.getSlots().size());
    LOG.info(String.format("table %s projection fields: %s", table_.getFullName(),
        Arrays.toString(projection_)));
  }

  protected long estimateSplitRowCount(Split s) {
    if (s instanceof DataSplit) {
      DataSplit dataSplit = (DataSplit) s;
      if (dataSplit.mergedRowCountAvailable()) { return dataSplit.mergedRowCount(); }
    }
    return s.rowCount();
  }

  protected long estimateTableRowCount() {
    return splits_.stream()
        .map(this ::estimateSplitRowCount)
        .reduce(Long::sum)
        .orElse(-1L);
  }

  protected long estimateAvgRowSize() {
    List<DataField> dataColumns = paimonApiTable_.rowType().getFields();
    return dataColumns.stream()
        .filter(df -> fieldIdMap_.contains(df.id()))
        .mapToInt(column -> column.type().defaultSize())
        .sum()
        + PAIMON_ROW_AVG_SIZE_OVERHEAD;
  }
  /**
   * Compute the scan range locations for the given table using the scan tokens.
   */
  private void computeScanRangeLocations(Analyzer analyzer)
      throws ImpalaRuntimeException {
    scanRangeSpecs_ = new TScanRangeSpec();
    ReadBuilder readBuilder = paimonApiTable_.newReadBuilder();

    // 2. Plan splits in 'Coordinator'.
    splits_ = readBuilder.newScan().plan().splits();

    if (splits_.size() <= 0) {
      LOG.info("no paimon data available");
      return;
    }

    for (Split split : splits_) {
      List<TScanRangeLocation> locations = new ArrayList<>();
      // TODO: Currently, set to dummy network address for random executor scheduling,
      // don't forget to get actual location for data locality after native table scan
      // is supported.
      //
      {
        TNetworkAddress address = new TNetworkAddress("localhost", 12345);
        // Use the network address to look up the host in the global list
        Integer hostIndex = analyzer.getHostIndex().getOrAddIndex(address);
        locations.add(new TScanRangeLocation(hostIndex));
        hostIndexSet_.add(hostIndex);
      }

      TScanRange scanRange = new TScanRange();
      // TODO: apply predicate push down later.
      PaimonSplit paimonSplit = new PaimonSplit(split, null);
      byte[] split_data_serialized = SerializationUtils.serialize(paimonSplit);
      scanRange.setFile_metadata(split_data_serialized);
      TScanRangeLocationList locs = new TScanRangeLocationList();
      locs.setScan_range(scanRange);
      locs.setLocations(locations);
      scanRangeSpecs_.addToConcrete_ranges(locs);
    }
  }

  @Override
  protected double computeSelectivity() {
    List<Expr> allConjuncts = Lists.newArrayList(Iterables.concat(conjuncts_));
    return computeCombinedSelectivity(allConjuncts);
  }

  /**
   * Estimate the number of impalad nodes that this scan node will execute on (which is
   * ultimately determined by the scheduling done by the backend's Scheduler).
   * As of now, scan ranges are scheduled round-robin, since they have no location
   * information.
   */
  protected void computeNumNodes(Analyzer analyzer) {
    ExecutorMembershipSnapshot cluster = ExecutorMembershipSnapshot.getCluster();
    final int maxInstancesPerNode = getMaxInstancesPerNode(analyzer);
    final int maxPossibleInstances =
        analyzer.numExecutorsForPlanning() * maxInstancesPerNode;
    int totalNodes = 0;
    int totalInstances = 0;
    int numRemoteRanges = splits_.size();

    // The remote ranges are round-robined across all the impalads.
    int numRemoteNodes = Math.min(numRemoteRanges, analyzer.numExecutorsForPlanning());
    // The remote assignments may overlap, but we don't know by how much
    // so conservatively assume no overlap.
    totalNodes = Math.min(numRemoteNodes, analyzer.numExecutorsForPlanning());

    totalInstances = Math.min(numRemoteRanges, totalNodes * maxInstancesPerNode);

    numNodes_ = Math.max(totalNodes, 1);
    numInstances_ = Math.max(totalInstances, 1);
  }

  @Override
  public void computeNodeResourceProfile(TQueryOptions queryOptions) {
    // current batch size is from query options, so estimated bytes
    // is calculated as BATCH_SIZE * average row size * 2
    long batchSize = getRowBatchSize(queryOptions);
    long memSize = batchSize * (long) getAvgRowSize() * 2;
    nodeResourceProfile_ =
        new ResourceProfileBuilder().setMemEstimateBytes(memSize).build();
  }

  @Override
  protected void toThrift(TPlanNode msg, ThriftSerializationCtx serialCtx) {
    toThrift(msg);
  }

  @Override
  protected void toThrift(TPlanNode node) {
    node.node_type = TPlanNodeType.PAIMON_SCAN_NODE;
    node.paimon_table_scan_node = new TPaimonScanNode(desc_.getId().asInt(),
        ByteBuffer.wrap(SerializationUtils.serialize(paimonApiTable_)),
        table_.getFullName());
  }

  @Override
  public void computeProcessingCost(TQueryOptions queryOptions) {
    processingCost_ = computeScanProcessingCost(queryOptions);
  }

  @Override
  protected String getNodeExplainString(
      String prefix, String detailPrefix, TExplainLevel detailLevel) {
    StringBuilder result = new StringBuilder();

    String aliasStr = desc_.hasExplicitAlias() ? " " + desc_.getAlias() : "";
    result.append(String.format("%s%s:%s [%s%s]\n", prefix, id_.toString(), displayName_,
        table_.getFullName(), aliasStr));

    switch (detailLevel) {
      case MINIMAL: break;
      case STANDARD: // Fallthrough intended.
      case EXTENDED: // Fallthrough intended.
      case VERBOSE: {
        if (!conjuncts_.isEmpty()) {
          result.append(detailPrefix
              + "predicates: " + Expr.getExplainString(conjuncts_, detailLevel) + "\n");
        }
        if (!runtimeFilters_.isEmpty()) {
          result.append(detailPrefix + "runtime filters: ");
          result.append(getRuntimeFilterExplainString(false, detailLevel));
        }
      }
    }
    return result.toString();
  }

  @Override
  protected String debugString() {
    MoreObjects.ToStringHelper helper = MoreObjects.toStringHelper(this);
    helper.addValue(super.debugString());
    helper.addValue("paimonTable=" + table_.getFullName());
    return helper.toString();
  }
}
@@ -0,0 +1,64 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package org.apache.impala.planner;

import com.google.common.base.Preconditions;
import org.apache.impala.analysis.Analyzer;
import org.apache.impala.analysis.Expr;
import org.apache.impala.analysis.MultiAggregateInfo;
import org.apache.impala.analysis.TableRef;
import org.apache.impala.catalog.paimon.FePaimonTable;
import org.apache.impala.common.ImpalaException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.List;

/**
 * ScanNode factory class for Paimon; currently, only the JNI-based scan node is
 * supported. A native scan node will be added later.
 */
public class PaimonScanPlanner {
  private static final Logger LOG = LoggerFactory.getLogger(PaimonScanPlanner.class);

  private Analyzer analyzer_;
  private PlannerContext ctx_;
  private TableRef tblRef_;
  private List<Expr> conjuncts_;
  private MultiAggregateInfo aggInfo_;

  private FePaimonTable table_;

  public PaimonScanPlanner(Analyzer analyzer, PlannerContext ctx, TableRef paimonTblRef,
      List<Expr> conjuncts, MultiAggregateInfo aggInfo) throws ImpalaException {
    Preconditions.checkState(paimonTblRef.getTable() instanceof FePaimonTable);
    analyzer_ = analyzer;
    ctx_ = ctx;
    tblRef_ = paimonTblRef;
    conjuncts_ = conjuncts;
    aggInfo_ = aggInfo;
    table_ = (FePaimonTable) paimonTblRef.getTable();
  }

  public PlanNode createPaimonScanPlan() throws ImpalaException {
    PaimonScanNode ret = new PaimonScanNode(
        ctx_.getNextNodeId(), tblRef_.getDesc(), conjuncts_, aggInfo_, table_);
    ret.init(analyzer_);
    return ret;
  }
}
@@ -1910,6 +1910,10 @@ public class SingleNodePlanner implements SingleNodePlannerIntf {
          conjuncts);
      scanNode.init(analyzer);
      return scanNode;
    } else if (table instanceof FePaimonTable) {
      PaimonScanPlanner paimonScanPlanner =
          new PaimonScanPlanner(analyzer, ctx_, tblRef, conjuncts, aggInfo);
      return paimonScanPlanner.createPaimonScanPlan();
    } else if (table instanceof FeHBaseTable) {
      // HBase table
      scanNode = new HBaseScanNode(ctx_.getNextNodeId(), tblRef.getDesc());

@@ -0,0 +1,45 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.impala.planner.paimon;

import org.apache.paimon.predicate.Predicate;
import org.apache.paimon.table.source.Split;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/**
 * Paimon split entity used by paimon jni scanner.
 */
public class PaimonSplit implements Serializable {
  // Paimon split instance to perform scan.
  private final Split split_;
  // predicates that can be pushed to paimon source.
  private final ArrayList<Predicate> predicates_;

  public PaimonSplit(Split split, ArrayList<Predicate> predicates) {
    split_ = split;
    predicates_ = predicates;
  }

  public Split getSplit() { return split_; }

  public List<Predicate> getPredicates() { return predicates_; }
}
@@ -0,0 +1,44 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.impala.util.paimon;

import org.apache.arrow.memory.RootAllocator;

/**
 * Arrow root allocation singleton.
 * Note: all Arrow code should use this as the root allocator.
 */
public class ArrowRootAllocation {
  private static RootAllocator ROOT_ALLOCATOR;

  private ArrowRootAllocation() {}

  public static RootAllocator rootAllocator() {
    synchronized (ArrowRootAllocation.class) {
      if (ROOT_ALLOCATOR == null) {
        ROOT_ALLOCATOR = new RootAllocator(Long.MAX_VALUE);
        Runtime.getRuntime().addShutdownHook(new Thread(() -> ROOT_ALLOCATOR.close()));
      }
    }
    return ROOT_ALLOCATOR;
  }
}
@@ -0,0 +1,40 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.impala.util.paimon;

import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.paimon.arrow.ArrowFieldTypeConversion;
import org.apache.paimon.types.DecimalType;

/**
 * An extension of the {@link ArrowFieldTypeConversion.ArrowFieldTypeVisitor} class
 * that changes the decimal conversion behavior.
 * The Paimon decimal type would normally convert to the Arrow Decimal128 type, which
 * involves byte copying and padding and causes additional overhead when passing data
 * to the BE. To eliminate the overhead, the decimal unscaled bytes are passed to the
 * BE directly, so the Arrow binary type is used instead of the Decimal128 data type.
 */
public class PaimonArrowFieldTypeFactory
    extends ArrowFieldTypeConversion.ArrowFieldTypeVisitor {
  @Override
  public FieldType visit(DecimalType decimalType) {
    return new FieldType(decimalType.isNullable(), new ArrowType.Binary(), null);
  }
}
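To make the unscaled-bytes encoding above concrete, here is a minimal,
self-contained round trip (a hypothetical example, not part of the patch):
the scale is not stored in the binary value, so the reader must already know
it from the column type, and the bytes are the two's-complement big-endian
form produced by Decimal.toUnscaledBytes() and BigInteger.toByteArray().

  import java.math.BigDecimal;
  import java.math.BigInteger;

  class DecimalBytesRoundTrip {
    public static void main(String[] args) {
      BigDecimal original = new BigDecimal("12345.67");  // a DECIMAL(7,2) value
      byte[] unscaled = original.unscaledValue().toByteArray();
      // The reader rebuilds the value from the bytes plus the known scale.
      BigDecimal restored = new BigDecimal(new BigInteger(unscaled), 2);
      System.out.println(restored.equals(original));     // prints true
    }
  }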
@@ -0,0 +1,83 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.impala.util.paimon;

import org.apache.arrow.vector.FieldVector;
import org.apache.arrow.vector.VarBinaryVector;
import org.apache.paimon.arrow.writer.ArrowFieldWriter;
import org.apache.paimon.arrow.writer.ArrowFieldWriterFactory;
import org.apache.paimon.arrow.writer.ArrowFieldWriterFactoryVisitor;
import org.apache.paimon.data.DataGetters;
import org.apache.paimon.data.Decimal;
import org.apache.paimon.data.columnar.ColumnVector;
import org.apache.paimon.data.columnar.DecimalColumnVector;
import org.apache.paimon.types.DecimalType;

import javax.annotation.Nullable;

/**
 * An extension of the {@link ArrowFieldWriterFactoryVisitor} class that changes the
 * decimal field writer behavior: it converts the Paimon decimal type directly to an
 * Arrow binary value.
 */
public class PaimonArrowFieldWriterFactory extends ArrowFieldWriterFactoryVisitor {
  @Override
  public ArrowFieldWriterFactory visit(DecimalType decimalType) {
    return (fieldVector, isNullable)
        -> new DecimalWriter(fieldVector, decimalType.getPrecision(),
            decimalType.getScale(), isNullable);
  }

  public static class DecimalWriter extends ArrowFieldWriter {
    // decimal precision
    private final int precision_;
    // decimal scale
    private final int scale_;

    public DecimalWriter(
        FieldVector fieldVector, int precision, int scale, boolean isNullable) {
      super(fieldVector, isNullable);
      this.precision_ = precision;
      this.scale_ = scale;
    }

    protected void doWrite(ColumnVector columnVector, @Nullable int[] pickedInColumn,
        int startIndex, int batchRows) {
      VarBinaryVector decimalVector = (VarBinaryVector) this.fieldVector;

      for (int i = 0; i < batchRows; ++i) {
        int row = this.getRowNumber(startIndex, i, pickedInColumn);
        if (columnVector.isNullAt(row)) {
          decimalVector.setNull(i);
        } else {
          Decimal value = ((DecimalColumnVector) columnVector)
              .getDecimal(row, this.precision_, this.scale_);
          byte[] bytes = value.toUnscaledBytes();
          decimalVector.setSafe(i, bytes);
        }
      }
    }

    protected void doWrite(int rowIndex, DataGetters getters, int pos) {
      ((VarBinaryVector) this.fieldVector)
          .setSafe(rowIndex,
              getters.getDecimal(pos, this.precision_, this.scale_).toUnscaledBytes());
    }
  }
}
@@ -0,0 +1,94 @@
|
||||
/*
|
||||
* Licensed to the Apache Software Foundation (ASF) under one
|
||||
* or more contributor license agreements. See the NOTICE file
|
||||
* distributed with this work for additional information
|
||||
* regarding copyright ownership. The ASF licenses this file
|
||||
* to you under the Apache License, Version 2.0 (the
|
||||
* "License"); you may not use this file except in compliance
|
||||
* with the License. You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
package org.apache.impala.util.paimon;

import org.apache.arrow.c.ArrowArray;
import org.apache.arrow.c.ArrowSchema;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.paimon.arrow.vector.ArrowCStruct;
import org.apache.paimon.arrow.vector.ArrowFormatCWriter;
import org.apache.paimon.data.InternalRow;
import org.apache.paimon.types.RowType;

/**
 * A wrapper of {@link PaimonArrowFormatWriter} that exposes the JVM off-heap addresses
 * of the written batch to the BE.
 * TODO: this class is based on {@link ArrowFormatCWriter} to allow customization of
 * the field writer. It will be removed if the relevant PR is accepted by the Paimon
 * community. Refer to
 * {@link <a href="https://github.com/apache/paimon/pull/6695">...</a>}
 * for more detail.
 */
public class PaimonArrowFormatNativeWriter implements AutoCloseable {
  // arrow array vector
  private final ArrowArray array_;
  // arrow schema
  private final ArrowSchema schema_;
  // arrow RecordBatch writer.
  private final PaimonArrowFormatWriter realWriter_;

  public PaimonArrowFormatNativeWriter(
      RowType rowType, int writeBatchSize, boolean caseSensitive) {
    this(new PaimonArrowFormatWriter(rowType, writeBatchSize, caseSensitive));
  }

  public PaimonArrowFormatNativeWriter(RowType rowType, int writeBatchSize,
      boolean caseSensitive, BufferAllocator allocator) {
    this(new PaimonArrowFormatWriter(rowType, writeBatchSize, caseSensitive, allocator));
  }

  private PaimonArrowFormatNativeWriter(PaimonArrowFormatWriter arrowFormatWriter) {
    this.realWriter_ = arrowFormatWriter;
    BufferAllocator allocator = realWriter_.getAllocator();
    array_ = ArrowArray.allocateNew(allocator);
    schema_ = ArrowSchema.allocateNew(allocator);
  }

  public boolean write(InternalRow currentRow) { return realWriter_.write(currentRow); }
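
  // Flushes the buffered rows and exports them through the Arrow C Data Interface.
  // The returned struct carries the off-heap schema/array addresses that the BE
  // imports directly, without copying through JNI.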
  public ArrowCStruct flush() {
    realWriter_.flush();
    VectorSchemaRoot vectorSchemaRoot = realWriter_.getVectorSchemaRoot();
    return PaimonArrowUtils.serializeToCStruct(
        vectorSchemaRoot, array_, schema_, realWriter_.getAllocator());
  }

  public void reset() { realWriter_.reset(); }

  public boolean empty() { return realWriter_.empty(); }
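
  // Releases the exported ArrowArray/ArrowSchema C structs without closing the
  // underlying writer, so the writer can be reused for the next batch.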
  public void release() {
    array_.release();
    schema_.release();
  }

  @Override
  public void close() {
    array_.close();
    schema_.close();
    realWriter_.close();
  }

  public VectorSchemaRoot getVectorSchemaRoot() {
    return realWriter_.getVectorSchemaRoot();
  }

  public BufferAllocator getAllocator() { return realWriter_.getAllocator(); }
}
@@ -0,0 +1,126 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.impala.util.paimon;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.util.OversizedAllocationException;
import org.apache.paimon.arrow.vector.ArrowFormatWriter;
import org.apache.paimon.arrow.writer.ArrowFieldWriter;
import org.apache.paimon.arrow.writer.ArrowFieldWriterFactoryVisitor;
import org.apache.paimon.data.InternalRow;
import org.apache.paimon.types.DataType;
import org.apache.paimon.types.RowType;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Writes Paimon internal rows into an Arrow record batch on the Java side.
 * TODO: this class is based on {@link ArrowFormatWriter} to allow customization of
 * the field writer. It will be removed if the relevant PR is accepted by the Paimon
 * community. Refer to
 * {@link <a href="https://github.com/apache/paimon/pull/6695">...</a>}
 * for more detail.
 */
public class PaimonArrowFormatWriter implements AutoCloseable {
  private static final Logger LOG =
      LoggerFactory.getLogger(PaimonArrowFormatWriter.class);
  // writer factory
  private static final ArrowFieldWriterFactoryVisitor FIELD_WRITER_FACTORY =
      new PaimonArrowFieldWriterFactory();
  // field type factory
  private static final PaimonArrowFieldTypeFactory FIELD_TYPE_FACTORY =
      new PaimonArrowFieldTypeFactory();
  // arrow vector schema root
  private final VectorSchemaRoot vectorSchemaRoot_;
  // arrow field writers
  private final ArrowFieldWriter[] fieldWriters_;
  // arrow RecordBatch batch size
  private final int batchSize_;
  // buffer allocator.
  private final BufferAllocator allocator_;
  // row id for the current batch.
  private int rowId_;

  public PaimonArrowFormatWriter(
      RowType rowType, int writeBatchSize, boolean caseSensitive) {
    this(rowType, writeBatchSize, caseSensitive, new RootAllocator());
  }

  public PaimonArrowFormatWriter(RowType rowType, int writeBatchSize,
      boolean caseSensitive, BufferAllocator allocator) {
    this(rowType, writeBatchSize, caseSensitive, allocator, FIELD_WRITER_FACTORY);
  }

  public PaimonArrowFormatWriter(RowType rowType, int writeBatchSize,
      boolean caseSensitive, BufferAllocator allocator,
      ArrowFieldWriterFactoryVisitor fieldWriterFactory) {
    this.allocator_ = allocator;

    vectorSchemaRoot_ = PaimonArrowUtils.createVectorSchemaRoot(
        rowType, allocator, caseSensitive, FIELD_TYPE_FACTORY);

    fieldWriters_ = new ArrowFieldWriter[rowType.getFieldCount()];

    for (int i = 0; i < fieldWriters_.length; i++) {
      DataType type = rowType.getFields().get(i).type();
      fieldWriters_[i] = type.accept(fieldWriterFactory)
                             .create(vectorSchemaRoot_.getVector(i), type.isNullable());
    }

    this.batchSize_ = writeBatchSize;
  }

  public void flush() { vectorSchemaRoot_.setRowCount(rowId_); }
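
  // Appends one row to the current batch. Returns false when the batch is full or a
  // field writer fails (e.g. the allocation grows oversized), signaling the caller to
  // flush the current batch.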
  public boolean write(InternalRow currentRow) {
    if (rowId_ >= batchSize_) { return false; }
    for (int i = 0; i < currentRow.getFieldCount(); i++) {
      try {
        fieldWriters_[i].write(rowId_, currentRow, i);
      } catch (OversizedAllocationException | IndexOutOfBoundsException e) {
        // maybe out of memory
        LOG.warn("Arrow field writer failed while writing", e);
        return false;
      }
    }

    rowId_++;
    return true;
  }

  public boolean empty() { return rowId_ == 0; }

  public void reset() {
    for (ArrowFieldWriter fieldWriter : fieldWriters_) { fieldWriter.reset(); }
    rowId_ = 0;
  }

  @Override
  public void close() {
    vectorSchemaRoot_.close();
    allocator_.close();
  }

  public VectorSchemaRoot getVectorSchemaRoot() { return vectorSchemaRoot_; }

  public BufferAllocator getAllocator() { return allocator_; }
}
@@ -0,0 +1,137 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.impala.util.paimon;

import org.apache.arrow.c.ArrowArray;
import org.apache.arrow.c.ArrowSchema;
import org.apache.arrow.c.Data;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.complex.ListVector;
import org.apache.arrow.vector.complex.MapVector;
import org.apache.arrow.vector.types.Types;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.arrow.vector.types.pojo.Schema;
import org.apache.paimon.arrow.ArrowFieldTypeConversion;
import org.apache.paimon.arrow.ArrowUtils;
import org.apache.paimon.arrow.vector.ArrowCStruct;
import org.apache.paimon.table.SpecialFields;
import org.apache.paimon.types.ArrayType;
import org.apache.paimon.types.DataField;
import org.apache.paimon.types.DataType;
import org.apache.paimon.types.MapType;
import org.apache.paimon.types.RowType;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

import static org.apache.paimon.utils.StringUtils.toLowerCaseIfNeed;

/**
 * Utilities for creating Arrow objects.
 * TODO: this class is based on {@link ArrowUtils} to allow customization of the field
 * writer. It will be removed if the relevant PR is accepted by the Paimon community.
 * Refer to
 * {@link <a href="https://github.com/apache/paimon/pull/6695">...</a>}
 * for more detail.
 */
public class PaimonArrowUtils {
  static final String PARQUET_FIELD_ID = "PARQUET:field_id";

  public static VectorSchemaRoot createVectorSchemaRoot(RowType rowType,
      BufferAllocator allocator, boolean caseSensitive,
      ArrowFieldTypeConversion.ArrowFieldTypeVisitor visitor) {
    List<Field> fields =
        rowType.getFields()
            .stream()
            .map(f
                -> toArrowField(toLowerCaseIfNeed(f.name(), caseSensitive), f.id(),
                    f.type(), visitor, 0))
            .collect(Collectors.toList());
    return VectorSchemaRoot.create(new Schema(fields), allocator);
  }
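
  // Converts a Paimon field into an Arrow field, attaching the Paimon field id as
  // "PARQUET:field_id" metadata. The BE matches columns by these field ids rather
  // than by position, which keeps column pruning correct under schema evolution;
  // nested array/map element ids are derived from the parent id via SpecialFields.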
  public static Field toArrowField(String fieldName, int fieldId, DataType dataType,
      ArrowFieldTypeConversion.ArrowFieldTypeVisitor visitor, int depth) {
    FieldType fieldType = dataType.accept(visitor);
    fieldType = new FieldType(fieldType.isNullable(), fieldType.getType(),
        fieldType.getDictionary(),
        Collections.singletonMap(PARQUET_FIELD_ID, String.valueOf(fieldId)));
    List<Field> children = null;
    if (dataType instanceof ArrayType) {
      Field field = toArrowField(ListVector.DATA_VECTOR_NAME, fieldId,
          ((ArrayType) dataType).getElementType(), visitor, depth + 1);
      FieldType typeInner = field.getFieldType();
      field = new Field(field.getName(),
          new FieldType(typeInner.isNullable(), typeInner.getType(),
              typeInner.getDictionary(),
              Collections.singletonMap(PARQUET_FIELD_ID,
                  String.valueOf(
                      SpecialFields.getArrayElementFieldId(fieldId, depth + 1)))),
          field.getChildren());
      children = Collections.singletonList(field);
    } else if (dataType instanceof MapType) {
      MapType mapType = (MapType) dataType;

      Field keyField = toArrowField(MapVector.KEY_NAME, fieldId,
          mapType.getKeyType().notNull(), visitor, depth + 1);
      FieldType keyType = keyField.getFieldType();
      keyField = new Field(keyField.getName(),
          new FieldType(keyType.isNullable(), keyType.getType(), keyType.getDictionary(),
              Collections.singletonMap(PARQUET_FIELD_ID,
                  String.valueOf(SpecialFields.getMapKeyFieldId(fieldId, depth + 1)))),
          keyField.getChildren());

      Field valueField = toArrowField(MapVector.VALUE_NAME, fieldId,
          mapType.getValueType().notNull(), visitor, depth + 1);
      FieldType valueType = valueField.getFieldType();
      valueField = new Field(valueField.getName(),
          new FieldType(valueType.isNullable(), valueType.getType(),
              valueType.getDictionary(),
              Collections.singletonMap(PARQUET_FIELD_ID,
                  String.valueOf(SpecialFields.getMapValueFieldId(fieldId, depth + 1)))),
          valueField.getChildren());

      FieldType structType = new FieldType(false, Types.MinorType.STRUCT.getType(), null,
          Collections.singletonMap(PARQUET_FIELD_ID, String.valueOf(fieldId)));
      Field mapField = new Field(MapVector.DATA_VECTOR_NAME,
          // data vector, key vector and value vector CANNOT be null
          structType, Arrays.asList(keyField, valueField));

      children = Collections.singletonList(mapField);
    } else if (dataType instanceof RowType) {
      RowType rowType = (RowType) dataType;
      children = new ArrayList<>();
      for (DataField field : rowType.getFields()) {
        children.add(toArrowField(field.name(), field.id(), field.type(), visitor, 0));
      }
    }
    return new Field(fieldName, fieldType, children);
  }
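
  // Exports the VectorSchemaRoot through the Arrow C Data Interface so the BE can
  // import the batch zero-copy from the returned schema/array addresses.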
  public static ArrowCStruct serializeToCStruct(VectorSchemaRoot vsr, ArrowArray array,
      ArrowSchema schema, BufferAllocator bufferAllocator) {
    Data.exportVectorSchemaRoot(bufferAllocator, vsr, null, array, schema);
    return ArrowCStruct.of(array, schema);
  }
}
@@ -0,0 +1,221 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package org.apache.impala.util.paimon;

import com.google.common.collect.Lists;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.commons.lang3.SerializationUtils;
import org.apache.impala.catalog.paimon.PaimonUtil;
import org.apache.impala.common.ImpalaException;
import org.apache.impala.common.JniUtil;
import org.apache.impala.planner.paimon.PaimonSplit;
import org.apache.impala.thrift.TPaimonJniScanParam;
import org.apache.paimon.arrow.vector.ArrowCStruct;
import org.apache.paimon.data.InternalRow;
import org.apache.paimon.predicate.Predicate;
import org.apache.paimon.reader.RecordReader;
import org.apache.paimon.reader.RecordReaderIterator;
import org.apache.paimon.table.Table;
import org.apache.paimon.table.source.ReadBuilder;
import org.apache.paimon.table.source.Split;
import org.apache.paimon.types.DataField;
import org.apache.paimon.types.RowType;
import org.apache.thrift.protocol.TBinaryProtocol;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/**
 * The FE-side Paimon JNI scanner, used by the BE PaimonJniScanNode.
 */
public class PaimonJniScanner implements AutoCloseable {
  private final static Logger LOG = LoggerFactory.getLogger(PaimonJniScanner.class);

  public final static int DEFAULT_ROWBATCH_SIZE = 1024;
  public final static long DEFAULT_INITIAL_RESERVATION = 32 * 1024;

  private final static TBinaryProtocol.Factory protocolFactory_ =
      new TBinaryProtocol.Factory();
  // Paimon api table.
  private Table table_ = null;
  // Paimon splits assigned to the scanner.
  private List<PaimonSplit> splits_ = null;
  // Paimon schema after projection.
  private RowType projectedSchema_;
  // Paimon data record iterator.
  private RecordReaderIterator<InternalRow> iterator_;
  // batch size.
  int batchSize_;
  // paimon to arrow RecordBatch writer.
  private PaimonArrowFormatNativeWriter writer_;
  // arrow off-heap allocator.
  private BufferAllocator bufferAllocator_;
  // total rows metric.
  private long totalRows_ = 0;
  // upper bound mem limit; -1 means no limit.
  private long allocator_mem_limit_ = -1;

  /**
   * Constructor for PaimonJniScanner; called from the Open method of the BE
   * PaimonJniScanNode.
   * @param jni_scan_param_thrift thrift-serialized paimon scan parameters.
   */
  public PaimonJniScanner(byte[] jni_scan_param_thrift) {
    TPaimonJniScanParam paimonJniScanParam = new TPaimonJniScanParam();
    try {
      JniUtil.deserializeThrift(
          protocolFactory_, paimonJniScanParam, jni_scan_param_thrift);
    } catch (ImpalaException ex) {
      LOG.error("Failed to deserialize paimon jni scan param", ex);
    }
    // table
    table_ = SerializationUtils.deserialize(paimonJniScanParam.getPaimon_table_obj());
    // splits
    splits_ = Lists.newArrayList();
    for (ByteBuffer split_data : paimonJniScanParam.getSplits()) {
      ByteBuffer split_data_serialized = split_data.compact();
      splits_.add(SerializationUtils.deserialize(split_data_serialized.array()));
    }
    // projection field ids
    int[] projectionFieldIds =
        paimonJniScanParam.getProjection().stream().mapToInt(Integer::intValue).toArray();
    // projected fields and schema
    DataField[] projectedFields =
        Arrays.stream(projectionFieldIds)
            .mapToObj(fieldId -> table_.rowType().getField(fieldId))
            .toArray(DataField[]::new);
    projectedSchema_ = RowType.of(projectedFields);
    // get batch size
    batchSize_ = paimonJniScanParam.getBatch_size();
    if (batchSize_ <= 0) { batchSize_ = DEFAULT_ROWBATCH_SIZE; }
    // get mem limit
    allocator_mem_limit_ = paimonJniScanParam.getMem_limit_bytes();
    String allocatorName =
        "paimonscan_" + table_.uuid() + paimonJniScanParam.getFragment_id().toString();
    // create allocator; cap it at the BE-provided mem limit if one is set.
    if (allocator_mem_limit_ > 0) {
      bufferAllocator_ = ArrowRootAllocation.rootAllocator().newChildAllocator(
          allocatorName, DEFAULT_INITIAL_RESERVATION, allocator_mem_limit_);
    } else {
      bufferAllocator_ = ArrowRootAllocation.rootAllocator().newChildAllocator(
          allocatorName, DEFAULT_INITIAL_RESERVATION, Long.MAX_VALUE);
    }
    LOG.info(String.format("Open with mem_limit: %d bytes, batch_size:%d rows, "
            + "Projection field ids:%s",
        allocator_mem_limit_, batchSize_, Arrays.toString(projectionFieldIds)));
  }

  /**
   * Prepares the scan of the assigned table splits; called from the Open method of
   * the BE PaimonJniScanNode.
   */
  public void ScanTable() {
    // If we are on a stack frame that was created through JNI we need to set the
    // context class loader, as Paimon might use reflection to dynamically load
    // classes and methods.
    JniUtil.setContextClassLoaderForThisThread(this.getClass().getClassLoader());
    writer_ = new PaimonArrowFormatNativeWriter(
        projectedSchema_, batchSize_, false, bufferAllocator_);
    // Create the reader over the assigned data splits.
    initReader();
  }

  /**
   * Gets the next arrow batch; called from the GetNext method of the BE
   * PaimonJniScanNode.
   * @param address output array returning three long values to the BE:
   *     address[0]: schema address of the arrow batch.
   *     address[1]: array (vector) address of the arrow batch.
   *     address[2]: off-heap memory consumption for the current batch.
   */
  public long GetNextBatch(long[] address) {
    if (!writer_.empty()) { writer_.reset(); }
    int rows = 0;
    for (int i = 0; i < batchSize_; i++) {
      if (iterator_.hasNext()) {
        boolean result = writer_.write(iterator_.next());
        if (result) { rows++; }
      } else {
        break;
      }
    }
    totalRows_ += rows;
    if (rows > 0) {
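      // Export the completed batch through the Arrow C Data Interface. Once the BE
      // imports it successfully, the BE owns the batch and is in charge of freeing it.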
      ArrowCStruct cStruct = writer_.flush();
      address[0] = cStruct.schemaAddress();
      address[1] = cStruct.arrayAddress();
      address[2] = bufferAllocator_.getAllocatedMemory();
      return rows;
    } else {
      return 0;
    }
  }

  protected boolean initReader() {
    try {
      ReadBuilder readBuilder = table_.newReadBuilder().withReadType(projectedSchema_);
      // Apply pushed-down predicates if present. Currently the predicates are always
      // null/empty; all conjuncts are evaluated by the C++ scanner.
      List<Predicate> predicates = splits_.get(0).getPredicates();
      if (predicates != null && !predicates.isEmpty()) {
        readBuilder.withFilter(predicates);
      }
      // Create the iterator for the given splits.
      List<Split> splits =
          splits_.stream().map(PaimonSplit::getSplit).collect(Collectors.toList());
      RecordReader<InternalRow> reader = readBuilder.newRead().createReader(splits);
      iterator_ = new RecordReaderIterator<>(reader);
      LOG.info(
          String.format("Reading %d splits for %s", splits.size(), table_.fullName()));
      return true;
    } catch (Exception ex) {
      LOG.error("Failed to init reader for " + table_.fullName(), ex);
      return false;
    }
  }

  /**
   * Performs cleanup; called from the Close method of the BE PaimonJniScanNode.
   */
  @Override
  public void close() throws Exception {
    // release writer resources.
    PaimonUtil.closeQuitely(writer_);

    // release arrow allocator resources owned by the current scanner.
    PaimonUtil.closeQuitely(bufferAllocator_);
    // used to check mem leaks in more detail if arrow allocation debug is turned on.
    if (bufferAllocator_.getAllocatedMemory() > 0) {
      LOG.error(
          String.format("Leaked memory for %s is %d bytes, dump:%s", table_.fullName(),
              bufferAllocator_.getAllocatedMemory(), bufferAllocator_.toVerboseString()));
    }
    LOG.info(String.format("Peak memory for %s is %d bytes, total rows: %d",
        table_.fullName(), bufferAllocator_.getPeakMemoryAllocation(), totalRows_));

    // release iterator resources.
    PaimonUtil.closeQuitely(iterator_);
  }
}
@@ -97,11 +97,11 @@ public class ImpalaTypeUtilsTest {
     // Test row type
     RowType rowType = new RowType(Arrays.asList(new DataField(0, "id", new IntType()),
         new DataField(1, "name", DataTypes.STRING())));
-    StructType expectedStructType = new StructType(Arrays.asList(
-        new StructField("id", Type.INT,
-            rowType.getField(0).description()),
-        new StructField("name", Type.STRING,
-            rowType.getField(1).description())));
+    StructType expectedStructType = new StructType(
+        Arrays.asList(new PaimonStructField("id", Type.INT,
+            rowType.getField(0).description(), rowType.getField(0).id()),
+            new PaimonStructField("name", Type.STRING, rowType.getField(1).description(),
+                rowType.getField(1).id())));
     assertEquals(expectedStructType, PaimonImpalaTypeUtils.toImpalaType(rowType));

     // doesn't support time