This change stops including some boost library header files
that pull in other, unnecessary boost library headers.
This reduces the amount of cross-compiled code which needs
to be materialized during codegen.
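For illustration, the include-narrowing pattern looks like the following
(the specific headers touched by the patch may differ):

  // Prefer the minimal header that defines only the needed types...
  #include <boost/date_time/posix_time/posix_time_types.hpp>
  // ...over the umbrella header, which also pulls in formatters, IO, etc:
  // #include <boost/date_time/posix_time/posix_time.hpp>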
This change also removes some UDFs' Prepare() and Close()
functions, and the UDF functions fromUtc(), toUtc() and uuid(),
from cross-compilation, as they won't benefit from it.
With this change, the bitcode module shrinks from 2.12 MB to 1.86 MB.
Change-Id: I543809c69da0b4085a0e299b91cd550b274c46af
Reviewed-on: http://gerrit.cloudera.org:8080/3793
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Internal Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace the existing license text with the ASF license text given
on the website, or add it if the file has none.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
This patch refactors hdfs-parquet-scanner.cc into several files.
The new responsibilities of each file/component are roughly as follows:
hdfs-parquet-scanner.h/cc
- Creates column readers and uses them to materialize row batches.
- Evaluates runtime filters and conjuncts, populates row batch queue.
parquet-metadata-utils.h/cc
- Contains utilities for validating Parquet file metadata.
- Parses the schema of a Parquet file into our internal schema
representation.
- Resolves SchemaPaths (e.g. from a table descriptor) against
the internal representation of the Parquet file schema.
parquet-column-readers.h/cc
- Contains the per-column data reading, parsing and value
materialization logic.
Testing: A private core/hdfs run passed.
Change-Id: I4c5fd46f9c1a0ff2a4c30ea5a712fbae17c68f92
Reviewed-on: http://gerrit.cloudera.org:8080/3596
Tested-by: Internal Jenkins
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
gen_build_version.sh previously had a --noclean option which did not
overwrite the version information if it was already populated. Since
--noclean was the default option, the version information was
effectively never updated.
This patch modifies gen_build_version.py to generate a
common/version.cc instead of a common/version.h. Now,
common/version.h will be a part of the git repo and will not need to
be modified on every build. It declares the functions that will return
the build information. These functions will be defined in
common/version.cc and the build information will change on every new
build.
Since only the .cc file changes on every build, we will not incur a
highly noticeable change in build times.
Also changed the function names from GetImpalaBuild...() to
GetImpaladBuild...() so as to avoid naming confusion between the
Impala-lzo and the Impala functions.
There is an accompanying change in the Impala-lzo library too.
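A rough sketch of the split, using an illustrative accessor name (the
actual names and signatures may differ):

  // common/version.h: checked into git, unchanged across builds; only
  // declares the accessors.
  const char* GetImpaladBuildVersion();

  // common/version.cc: regenerated by gen_build_version.py on every
  // build; defines the accessors with the current build information.
  const char* GetImpaladBuildVersion() { return "<version string>"; }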
Change-Id: Ie461110b6f8ca545f04ea33b7b502aea550b8551
Reviewed-on: http://gerrit.cloudera.org:8080/2651
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Sailesh Mukil <sailesh@cloudera.com>
This patch adds two query options for runtime filters:
RUNTIME_FILTER_MAX_SIZE
RUNTIME_FILTER_MIN_SIZE
These options bound the minimum and maximum size of a filter,
regardless of the estimates produced by the planner. Filter sizes
are rounded up to the nearest power of two.
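A minimal sketch of the resulting sizing logic, with hypothetical names
(the actual code may differ):

  #include <algorithm>
  #include <cstdint>

  // Round the planner's estimate up to a power of two, then clamp it to
  // the [min, max] range given by the query options.
  int64_t ClampFilterSize(int64_t planner_estimate, int64_t min_size,
                          int64_t max_size) {
    int64_t size = 1;
    while (size < planner_estimate) size <<= 1;
    return std::max(min_size, std::min(size, max_size));
  }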
Change-Id: I5c13c200a0f1855f38a5da50ca34a737e741868b
Reviewed-on: http://gerrit.cloudera.org:8080/2966
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This patch doesn't use 'auto' for the loop index type, as it's not clear
yet where the savings in typing outweigh the cost of eliding the type.
Change-Id: Iae1ca36313e3562311b6418478bf54b6d9b0bf7d
Reviewed-on: http://gerrit.cloudera.org:8080/2890
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This patch helps reduce compile times when modifying function
implementations in decimal-value.h, e.g. when tuning the implementations
of decimal operators. decimal-value.h was included in many places that
only need to know about the layout of DecimalValue, not the
implementation of decimal operations. It was included indirectly in many
files, e.g. via runtime-state.h.
The patch moves those functions to decimal-value.inline.h and is able to
avoid including decimal-value.inline.h in most headers. We also need to
do the same thing for raw-value.h and runtime-filter.h, because some
of the inline functions in raw-value.h referenced inline functions in
decimal-value.h, and functions in runtime-filter.h referenced inline
functions in raw-value.h.
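A minimal sketch of the header split (illustrative, not the actual
class definition):

  // decimal-value.h: only the layout, cheap to include everywhere.
  template <typename T>
  class DecimalValue {
   public:
    T value() const;  // declared here, defined in the .inline.h
   private:
    T value_;
  };

  // decimal-value.inline.h: definitions, included only by code that
  // actually calls these functions.
  template <typename T>
  inline T DecimalValue<T>::value() const { return value_; }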
It also moves timestamp parsing logic from .h to .cc file. This slightly
reduces the size of the llvm bitcode module and will slightly reduce
compile times. The functions are too large to benefit from inlining in
generated code.
Change-Id: Ic7a2f388cd14a4427c43af2724340a2ffe8fae3d
Reviewed-on: http://gerrit.cloudera.org:8080/2485
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This patch has the FE include only materialized slots in the tuple
descriptors shipped to the BE. This simplifies BE code which had to
skip over unmaterialized slots (which aren't used anywhere outside the
FE).
Change-Id: I2f69078a391e38d30fa129fba12185208375b7c9
Reviewed-on: http://gerrit.cloudera.org:8080/1764
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
Also renames ArrayVal to CollectionVal and related variable
names. CollectionValues represent both arrays and maps, so ArrayValue
was a misleading name.
Change-Id: I5b482e4dafcffda7c6e8f3e71f7b9fa34125f5c4
Reviewed-on: http://gerrit.cloudera.org:8080/1266
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
This patch adds ArrayValue, the in-memory representation of arrays and
maps (which are treated as an array of key/value structs). It also
adds ArrayValueBuilder, which is a helper class for creating
ArrayValues.
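A rough sketch of the representation (the field names are assumptions,
not necessarily the actual members):

  #include <cstdint>

  // An array value is a buffer of consecutive tuples plus a count; a map
  // is the same thing where each tuple is a key/value struct.
  struct ArrayValue {
    uint8_t* ptr = nullptr;  // buffer holding 'num_tuples' tuples
    int num_tuples = 0;
  };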
Change-Id: Iba0348d1a25876bbed452c93d2c4ed90a701e9d3
Reviewed-on: http://gerrit.cloudera.org:8080/487
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
This patch removes all occurrences of "using namespace std" and "using
namespace boost(.*)" from the codebase. However, there are still cases
where namespace directives are used (e.g. for rapidjson, thrift,
gutil). These have to be tackled in subsequent patches.
To reduce the patch size, this patch introduces a new header file called
"names.h" that provides using-declarations for many of our most
frequently used symbols, each taking effect only if the corresponding
header was already included. For example, the declarations for map /
string / vector apply only if the matching standard header was already
included. This requires "common/names.h" to be the last include. After
including `names.h`, a new block contains a sorted list of
using-declarations. (This patch does not fix namespace directives for
namespaces other than std / boost.)
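A sketch of the pattern (the guard macros shown are libstdc++'s; the
actual header may key off different macros):

  // common/names.h: each using-declaration takes effect only if the
  // corresponding standard header was already included, detected via
  // its include guard.
  #ifdef _GLIBCXX_VECTOR
  using std::vector;
  #endif
  #ifdef _GLIBCXX_STRING
  using std::string;
  #endif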
Change-Id: Iebe4c054670d655bc355347e381dae90999cfddf
Reviewed-on: http://gerrit.cloudera.org:8080/338
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
These are the backend changes necessary for reading structs in Parquet
files. I wrote this against Alex's preliminary frontend work, and
ad-hoc tables containing structs work. We won't be able to add
automated tests until the FE changes are in as well, but I'd like to
get these changes in so we can at least get coverage of our existing
workloads.
The bulk of the changes are in the Parquet scanner. The rest is around
changing the column index of a slot descriptor to a column path, in
order to support nested columns.
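For example, where a slot previously carried a single column index, it
now carries a path of indexes into the nested schema tree (a sketch;
the actual typedef may differ):

  #include <vector>

  // {2, 0} would mean: third top-level column, then its first child.
  typedef std::vector<int> SchemaPath;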
Change-Id: Ifbd865b52c2b4679d81643184b1f36bf539ffcfd
Reviewed-on: http://gerrit.cloudera.org:8080/62
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
This patch reworks a lot of the metrics subsystem, laying much of the
groundwork for unifying runtime profiles and metrics in the future, as
well as enabling better rendering of metric data in our webpages, and
richer integration with thirdparty monitoring tools like CM.
There are lots of changes. The most significant are below.
TODO (incomplete list):
* Add descriptions for all metrics
* Settle on a standard hierarchy for process-wide metric groups
* Add path-based resolution for searching for metrics (i.e. resolve
"group1.group2.metric_name")
* Add a histogram metric type
Improvements for all metrics:
** New 'description' field, which allows a human-readable description to
be provided for each metric.
** Metrics must serialise themselves to JSON via the RapidJson
library (all by-hand JSON serialisation has been removed).
** Metrics are contained in MetricGroups (replacing the old 'Metrics'
class), which are hierarchically arranged to make grouping metrics
into smaller subsystems more natural (see the sketch below).
** Metrics are rendered via the new webserver templating engine,
replacing the old /metrics endpoint. The old /jsonmetrics endpoint is
retained for backwards compatibility.
Improvements for 'simple' metrics:
** SimpleMetric replaces the old PrimitiveMetric class (reusing much of
the same code); these are metrics whose value does not itself have
relevant structure (as opposed to sets, lists, etc).
** SimpleMetrics have 'kinds' (counter, gauge, property etc)
** ... and units (from TCounterType), to make pretty-printing easier.
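A self-contained sketch of the hierarchical grouping idea (not Impala's
actual API):

  #include <map>
  #include <memory>
  #include <string>

  // Each group owns its child groups, forming the tree that replaces
  // the old flat 'Metrics' class.
  class MetricGroup {
   public:
    explicit MetricGroup(std::string name) : name_(std::move(name)) {}
    MetricGroup* GetOrCreateChild(const std::string& name) {
      auto& child = children_[name];
      if (child == nullptr) child.reset(new MetricGroup(name));
      return child.get();
    }
   private:
    std::string name_;
    std::map<std::string, std::unique_ptr<MetricGroup>> children_;
  };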
Change-Id: Ida1d125172d8572dfe9541b4271604eff95cfea6
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5722
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
Currently we do not support per record compression for SEQUENCEFILE; we do support no
compression and block compression. Per record compression is typically very slow
(since the compressor is invoked per record in the table) and not widely used.
We chose to add support for per record compression as part of our effort to use Impala
for all of our testdata loading infrastructure. We have per record compressed tables
in testdata, so even though there is no customer demand for per record compression,
we need it to migrate our data loading off of Hive.
Change-Id: I6ea98ae0d31cceff8236b4b006c3a9fc00f64131
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5302
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f62a76f8d00b8dbc2846deb36ee5f65031ad846e)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5322
Introduces support for writing tables stored as Avro files. This supports writing all
data types except TIMESTAMP. Supports the following COMPRESSION_CODECs: NONE, DEFLATE,
SNAPPY.
Change-Id: Ica62063a4f172533c30dd1e8b0a11856da452467
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3863
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 15c6066d05d5077bee0d5123d26777b0715eb9c6)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4056
The runtime profile as we present it is not very useful, and I think
its structure makes it hard to consume. This patch adds a new
client-facing, schematized set of counters that are collected from the
runtime profiles. For example, with this structure it would be easy to
have the shell get the stats of a running query and print a useful
progress report, or to check the most relevant metrics for diagnosing
issues.
Here's an example of the output for one of the tpch queries:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
------------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE 1 79.738us 79.738us 5 5 0 -1.00 B UNPARTITIONED
05:TOP-N 3 84.693us 88.810us 5 5 12.00 KB 120.00 B
04:AGGREGATE 3 5.263ms 6.432ms 5 5 44.00 KB 10.00 MB MERGE FINALIZE
08:AGGREGATE 3 16.659ms 27.444ms 52.52K 600.12K 3.20 MB 15.11 MB MERGE
07:EXCHANGE 3 2.644ms 5.1ms 52.52K 600.12K 0 0 HASH(o_orderpriority)
03:AGGREGATE 3 342.913ms 966.291ms 52.52K 600.12K 10.80 MB 15.11 MB
02:HASH JOIN 3 2s165ms 2s171ms 144.87K 600.12K 13.63 MB 941.01 KB INNER JOIN, BROADCAST
|--06:EXCHANGE 3 8.296ms 8.692ms 57.22K 15.00K 0 0 BROADCAST
| 01:SCAN HDFS 2 1s412ms 1s978ms 57.22K 15.00K 24.21 MB 176.00 MB tpch.orders o
00:SCAN HDFS 3 8s032ms 8s558ms 3.79M 600.12K 32.29 MB 264.00 MB tpch.lineitem l
Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
- A few places didn't have a total timer at the beginning.
- The async build thread for blocking join nodes really messed things up
(the sum of the children's times was more than the time in the join
node).
Change-Id: I9176ce37cf22f2bcebea21b117e45cce066dbc1d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2276
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
This patch cleans up analysis and execution of scalar and aggregate functions
so that there is no difference between how builtins and user functions are
handled. The only difference is that the catalog is populated with the builtins
all the time.
The BE always gets a TFunction object and just executes it (builtins will have
an empty hdfs file location).
This removes the opcode registry; all of its functionality is subsumed
by the catalog, where most of it was already duplicated anyway.
This also introduces the concept of a system database: a database that
the user cannot modify and that is populated automatically on startup.
Change-Id: Iaa3f84dad0a1a57691f5c7d8df7305faf01d70ed
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1386
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1577
This patch incorporates the gutil library in thirdparty/gutil into the
build, and uses strings::Substitute in one place as a proof-of-concept.
Some other cleanups for $IMPALA_HOME/CMakeLists.txt are included.
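For reference, the proof-of-concept API is used like this (a usage
example, not the actual call site in the patch):

  #include <string>
  #include "gutil/strings/substitute.h"

  // Positional $N placeholders are replaced by the corresponding
  // argument.
  std::string msg =
      strings::Substitute("Failed to read $0 of $1 bytes", 512, 1024);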
Change-Id: I851bf6f130c2f4039f3df3c6d60f842a5661e5da
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1026
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
This patch fixes a deficiency in a previous attempt to keep legacy
compatibility with CM4 when it comes to query IDs sent to the debug page
for cancellation. Those query IDs are sent as <decimal-int>
<decimal-int>, whereas going forward we want to accept <hex-int>:<hex-int>.
Change-Id: I4a3611d1e0c613198861b2c8052aa48ef7bc8714
Reviewed-on: http://gerrit.ent.cloudera.com:8080/950
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
This patch goes some way to improving recovery after an INSERT
fails. Inserts now write intermediate results to
<table_dir>/.impala_insert_staging. After execution completes, either
successfully or not, the query-specific directory under that directory
is deleted.
This doesn't complete the job for better cleanup (although this goes as
far as IMPALA-449 suggests). Two things to do in the future:
* Have each backend delete its own staging files on error. The
difficulty getting there now is that backends don't know if they are
cancelled in error or because a LIMIT was reached.
* If the operation to move files to their final destinations should
fail during FinalizeQuery(), the coordinator should perform
compensating actions and delete the files that did get moved.
Note: We also considered a query-wide and impalad-wide option to change
the staging dir. There are advantages to this (all intermediate results
go to a known location which is easy to clean up on failure), but also
security and other operational concerns. Worth revisiting in the future.
Change-Id: Ia54cf36db6a382e359877f87d7d40aad7fdb77be
Reviewed-on: http://gerrit.ent.cloudera.com:8080/670
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface that allows
impalads to connect directly and execute their DDL operations.
The CatalogService has two main components: a C++ server that implements
StateStore integration, the Thrift service implementation, and the export
of the debug webpage/metrics; and the Java Catalog, which manages the
caching and updating of all the metadata. For each StateStore heartbeat,
a delta of all metadata updates is broadcast to the rest of the cluster.
Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this, all catalog
objects (Tables/Views, Databases, UDFs) have a thrift struct to represent
them. These are sent with each statestore delta update.
* The existing Catalog class has been separated into two separate
subclasses, ImpaladCatalog and CatalogServiceCatalog. See the comments on
those classes for more details.
What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog
version that contains the change. An impalad will wait for the statestore
heartbeat that contains this version before returning from the DDL
command (see the sketch after this list).
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
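A hypothetical sketch of that version wait (not the actual
implementation):

  #include <condition_variable>
  #include <cstdint>
  #include <mutex>

  std::mutex catalog_lock;
  std::condition_variable catalog_cv;  // signalled on statestore updates
  int64_t current_catalog_version = 0;

  // Block the DDL response until the heartbeat carrying the version
  // returned by the Catalog Service has been applied locally.
  void WaitForCatalogVersion(int64_t target_version) {
    std::unique_lock<std::mutex> l(catalog_lock);
    catalog_cv.wait(l,
        [&] { return current_catalog_version >= target_version; });
  }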
Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
same JAR.
Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
This patch also adds a number of improvements to NativeUdfExpr. Highlights include:
* Correctly handling the lowering of AnyVal struct types (required for ABI compatibility)
* A rudimentary library cache for reusing handles produced by dlopen
(sketched below)
* More complicated test cases
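A hypothetical sketch of the library cache idea (not the actual
implementation):

  #include <dlfcn.h>
  #include <map>
  #include <string>

  // Reuse one dlopen handle per library path instead of reopening the
  // library for every expression that references it.
  std::map<std::string, void*> lib_cache;

  void* GetLibHandle(const std::string& path) {
    auto it = lib_cache.find(path);
    if (it != lib_cache.end()) return it->second;
    void* handle = dlopen(path.c_str(), RTLD_NOW);
    if (handle != nullptr) lib_cache[path] = handle;
    return handle;
  }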
Change-Id: Iab9acdd7d7c4308e5d7ee3210f21b033fda5a195
Reviewed-on: http://gerrit.ent.cloudera.com:8080/540
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
This change adds support for audit event logging in Impala. This feature
is disabled by default and is enabled by setting the -audit_event_log_dir
flag. When auditing is enabled, details on each query that Impala executes
will be saved to the audit log along with the current session state. This
includes information such as the statement type, catalog objects accessed
by the query, and whether the operation passed authorization.
Change-Id: I39b78664c971124ec79c5fcee998065dd53fd32e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/142
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
This patch adds a counter for the average number of active scanner
threads (i.e. those that are not blocked by IO).
In the hdfs-scan-node, whenever a thread is started, it will increment the
active_scanner_thread_counter_. When a scanner thread enters the
scan-range-context's GetRawBytes or GetBytes, the counter will be
decremented. A new sampling thread is created to sample the value of
active_scanner_thread_counter_ and compute the average.
A bucket counting of HdfsReadThreadConcurrency is also added.
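A simplified sketch of the sampling logic (hypothetical names; the real
code hooks into the profile counters):

  #include <atomic>
  #include <chrono>
  #include <thread>

  std::atomic<int> active_scanner_thread_counter{0};

  // Periodically sample the counter and keep a running average, reported
  // as AverageScannerThreadConcurrency.
  void SampleScannerConcurrency(std::atomic<bool>& done,
                                double& average_out) {
    int64_t num_samples = 0;
    double sum = 0;
    while (!done.load()) {
      sum += active_scanner_thread_counter.load();
      average_out = sum / ++num_samples;
      std::this_thread::sleep_for(std::chrono::milliseconds(500));
    }
  }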
The output of the hdfs-scan-node profile is also updated. Here's the new output
for hdfs-scan-node after running count(*) from tpch.lineitem.
HDFS_SCAN_NODE (id=0):(10s254ms 99.75%)
File Formats: TEXT/NONE:12
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/351.21M
(351208888) 1:6/402.65M (402653184)
- AverageHdfsReadThreadConcurrency: 1.95
- HdfsReadThreadConcurrencyCountPercentage=0: 0.00
- HdfsReadThreadConcurrencyCountPercentage=1: 5.00
- HdfsReadThreadConcurrencyCountPercentage=2: 95.00
- HdfsReadThreadConcurrencyCountPercentage=3: 0.00
- AverageScannerThreadConcurrency: 0.15
- BytesRead: 718.94 MB
- MemoryUsed: 0.00
- NumDisksAccessed: 2
- PerReadThreadRawHdfsThroughput: 36.75 MB/sec
- RowsReturned: 6.00M (6001215)
- RowsReturnedRate: 585.25 K/sec
- ScanRangesComplete: 12
- ScannerThreadsInvoluntaryContextSwitches: 168
- ScannerThreadsTotalWallClockTime: 1m40s
- DelimiterParseTime: 2s128ms
- MaterializeTupleTime: 723.0us
- ScannerThreadsSysTime: 10.0ms
- ScannerThreadsUserTime: 2s090ms
- ScannerThreadsVoluntaryContextSwitches: 99
- TotalRawHdfsReadTime: 19s561ms
- TotalReadThroughput: 68.69 MB/sec
Also includes
- Cleanup of HdfsScanNode/IoMgr interaction
- Rename of ScanRangeContext to ScannerContext
- Removed files that were no longer being used
Rdtsc is not accurate, due to changes in cpu frequency. Very often, the
time reported in the profile is even longer than the time reported by the
shell. This patch replaces Rdtsc with CLOCK_MONOTONIC. It is as fast as
Rdtsc and accurate: it is not affected by cpu frequency changes, nor by
the user setting the system clock.
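For example, a monotonic timer primitive of the kind this patch switches
to (a sketch, not the exact Impala code):

  #include <cstdint>
  #include <ctime>

  // CLOCK_MONOTONIC is unaffected by cpu frequency scaling and by the
  // user setting the system clock.
  int64_t MonotonicNanos() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
  }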
Note that the new profile report will always report time, rather than
clock cycles. Here's the new profile:
Averaged Fragment 1:(68.241ms 0.00%)
completion times: min:69ms max:69ms mean: 69ms stddev:0
execution rates: min:91.60 KB/sec max:91.60 KB/sec mean:91.60 KB/sec
stddev:0.00 /sec
split sizes: min: 6.32 KB, max: 6.32 KB, avg: 6.32 KB, stddev: 0.00
- RowsProduced: 1
CodeGen:
- CodegenTime: 566.104us <--* reporting in microsec instead of
clock cycle
- CompileTime: 33.202ms
- LoadTime: 2.671ms
- ModuleFileSize: 44.61 KB
DataStreamSender:
- BytesSent: 16.00 B
- DataSinkTime: 50.719us
- SerializeBatchTime: 18.365us
- ThriftTransmitTime: 145.945us
AGGREGATION_NODE (id=1):(68.384ms 15.50%)
- BuildBuckets: 1.02K
- BuildTime: 13.734us
- GetResultsTime: 6.650us
- MemoryUsed: 32.01 KB
- RowsReturned: 1
- RowsReturnedRate: 14.00 /sec
HDFS_SCAN_NODE (id=0):(57.808ms 84.71%)
- BytesRead: 6.32 KB
- DelimiterParseTime: 62.370us
- MaterializeTupleTime: 767ns
- MemoryUsed: 0.00
- PerDiskReadThroughput: 9.32 MB/sec
- RowsReturned: 100
- RowsReturnedRate: 1.73 K/sec
- ScanRangesComplete: 4
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsReadTime: 662.431us
- ScannerThreadsSysTime: 0
- ScannerThreadsTotalWallClockTime: 25ms
- ScannerThreadsUserTime: 0
- ScannerThreadsVoluntaryContextSwitches: 4
- TotalReadThroughput: 0.00 /sec