mirror of https://github.com/apache/impala.git synced 2026-01-27 06:10:53 -05:00

Alex Behm 2db16efda8 Nested Types: Plan generation for correlated and child table refs with Subplans.

Plan generation is heuristic. A SubplanNode is placed as low as possible in the
plan tree, that is, as soon as its required parent tuple ids are materialized.
This approach is simple to understand and implement, but not always optimal. For
example, it may be better to place a Subplan after a selective join, but today we
place it below the join whenever it is correct to do so.
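
The placement rule described above can be illustrated with a small sketch. This is not the planner's actual code (the Impala frontend is written in Java); it is a toy model that assumes a plan is a bottom-up list of steps, each materializing some tuple ids, and that each subplan declares the parent tuple ids it requires. The function name `place_subplans` and the step and subplan labels are hypothetical.

```python
from typing import List, Set, Tuple

Step = Tuple[str, Set[int]]      # (node label, tuple ids it materializes)
Subplan = Tuple[str, Set[int]]   # (node label, parent tuple ids it requires)


def place_subplans(steps: List[Step], subplans: List[Subplan]) -> List[str]:
    """Return node labels bottom-up, with each subplan placed as low as possible.

    A subplan is emitted right after the first step at which all of its
    required parent tuple ids have been materialized (toy model, assumed).
    """
    plan: List[str] = []
    materialized: Set[int] = set()
    pending = list(subplans)
    for label, tuple_ids in steps:
        plan.append(label)
        materialized |= tuple_ids
        still_pending = []
        for sp_label, required in pending:
            if required <= materialized:
                # All parent tuples are available: place the subplan here.
                plan.append(sp_label)
            else:
                still_pending.append((sp_label, required))
        pending = still_pending
    if pending:
        raise ValueError("subplans with unmaterialized parents: %r" % pending)
    return plan


if __name__ == "__main__":
    steps = [("scan customers", {0}), ("join nations", {1})]
    subplans = [("subplan over c.orders", {0})]
    # The subplan only needs tuple 0, so it lands below the join,
    # even if the join were selective.
    print(place_subplans(steps, subplans))
```

In this model the subplan over `c.orders` is placed directly above the scan and below the join, which mirrors the non-optimal case the commit message mentions.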

For such scenarios, the straight_join hint can be used to manually tune the join
and Subplan order. If straight_join is used, correlated and child table refs are placed
into the same SubplanNode if they are adjacent in the FROM clause.
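
As a rough illustration of the straight_join behavior described above, the sketch below groups adjacent correlated or child table refs, in FROM-clause order, into one group per run. It is a simplified model under stated assumptions, not the planner's implementation: each table ref is represented as a hypothetical `(alias, is_nested)` pair, and the helper name `group_adjacent_refs` is made up for this example.

```python
from itertools import groupby
from typing import List, Tuple

TableRef = Tuple[str, bool]  # (alias, True if correlated or child table ref)


def group_adjacent_refs(from_clause: List[TableRef]) -> List[Tuple[str, List[str]]]:
    """Group runs of adjacent nested refs; keep FROM-clause order throughout.

    Each run of adjacent correlated/child refs becomes one "subplan" group.
    Every other table ref stays on its own as a "table" entry.
    """
    groups: List[Tuple[str, List[str]]] = []
    for is_nested, run in groupby(from_clause, key=lambda ref: ref[1]):
        aliases = [alias for alias, _ in run]
        if is_nested:
            groups.append(("subplan", aliases))
        else:
            groups.extend(("table", [alias]) for alias in aliases)
    return groups


if __name__ == "__main__":
    refs = [
        ("c", False),
        ("c.orders", True),
        ("c.addresses", True),   # adjacent to c.orders: same group
        ("d", False),
        ("c.phones", True),      # separated by d: its own group
    ]
    for kind, aliases in group_adjacent_refs(refs):
        print(kind, aliases)
```

Here `c.orders` and `c.addresses` share one group because they are adjacent, while `c.phones` gets a separate group because `d` sits between them in the FROM clause.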

Change-Id: I53e4623eb58f8b7ad3d02be15ad8726769f6f8c9
Reviewed-on: http://gerrit.cloudera.org:8080/401
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins

2015-08-19 18:37:02 +00:00

IMPALA-2015: Add support for nested loop join

2015-08-19 08:40:14 +00:00

bin

Addendum: Quietly resolve FE dependencies in Jenkins runs.

2015-08-14 00:23:41 +00:00

cmake_modules

Making CMake modules more modular for non-toolchain build

2015-07-22 02:01:34 +00:00

common

IMPALA-2015: Add support for nested loop join

2015-08-19 08:40:14 +00:00

ext-data-source

Upgrade a few important mvn plugins.

2015-05-20 03:12:57 +00:00

Nested Types: Plan generation for correlated and child table refs with Subplans.

2015-08-19 18:37:02 +00:00

infra/python

Python: Bootstrap a virtualenv and add impala-python command

2015-08-01 01:30:12 +00:00

llvm-ir

Move IR cross compile output to a better folder for packaging.

2012-06-01 13:14:18 -07:00

shell

IMPALA-1975: Automatically reconnect failed connections from the shell

2015-08-05 01:00:54 +00:00

ssh_keys

Move ssh keys from bin directory to fix packaging build break

2014-01-08 10:44:12 -08:00

testdata

Nested Types: Plan generation for correlated and child table refs with Subplans.

2015-08-19 18:37:02 +00:00

tests

IMPALA-2015: Add support for nested loop join

2015-08-19 08:40:14 +00:00

www

Add HdrHistogram and HistogramMetric

2015-05-26 00:39:00 +00:00

.gitignore

Add MetricDefs, static definitions of metric metadata generated from json

2015-05-14 21:27:28 +00:00

buildall.sh

Clean stale python object files and cached directories in buildall.

2015-06-25 00:09:23 +00:00

CMakeLists.txt

Optional Impala Toolchain

2015-06-13 03:11:44 +00:00

LICENSE.txt

Add text of Apache license

2014-05-08 11:16:53 -07:00

NOTICE.txt

Add NOTICE.txt file to Impala repo

2014-07-02 15:23:24 -07:00

README.md

Fix link syntax for README.md

2015-03-23 20:32:23 +00:00

README.md

Welcome to Impala

Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters.

Impala is a modern, massively distributed, massively parallel C++ query engine that lets you analyze, transform and combine data from a variety of data sources:

* Best-of-breed performance and scalability.
* Support for data stored in HDFS, Apache HBase and Amazon S3.
* Wide analytic SQL support, including window functions and subqueries.
* On-the-fly code generation using LLVM to generate CPU-efficient code tailored specifically to each individual query.
* Support for the most commonly used Hadoop file formats, including the Apache Parquet (incubating) project.
* Apache-licensed, 100% open source.

More about Impala

To learn more about Impala as a business user, or to try Impala live or in a VM, please visit the Impala homepage.

If you are interested in contributing to Impala as a developer, or learning more about Impala's internals and architecture, visit the Impala wiki.

Languages

C++ 49.2%

Java 30.5%

Python 14.5%

JavaScript 1.3%

C 1.2%

Other 3.2%