John Russell eaefbb90ce IMPALA-1654: [DOCS] DDL for multiple partitions
Syntax and usage notes for ALTER TABLE,
COMPUTE STATS, and SHOW FILES.

Mixed in a little bit with new Kudu syntax for
ALTER TABLE. Didn't include all new Kudu info
in this CR, the better to minimize merge conflicts.

Added note about performance/scalability of IMPALA-1654.

Added new Known Issue item for IMPALA-4106 under Performance category.

Change-Id: I2060552d5081e5f93b1b1f398414c52fa03f215b
Reviewed-on: http://gerrit.cloudera.org:8080/5726
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-01-25 18:39:12 +00:00

Welcome to Impala

Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters.

Impala is a modern, massively-distributed, massively-parallel, C++ query engine that lets you analyze, transform and combine data from a variety of data sources:

  • Best of breed performance and scalability.
  • Support for data stored in HDFS, Apache HBase and Amazon S3.
  • Wide analytic SQL support, including window functions and subqueries.
  • On-the-fly code generation using LLVM to generate CPU-efficient code tailored specifically to each individual query.
  • Support for the most commonly-used Hadoop file formats, including the Apache Parquet (incubating) project.
  • Apache-licensed, 100% open source.

More about Impala

To learn more about Impala as a business user, or to try Impala live or in a VM, please visit the Impala homepage.

If you are interested in contributing to Impala as a developer, or learning more about Impala's internals and architecture, visit the Impala wiki.

Supported Platforms

Impala only supports Linux at the moment.

Build Instructions

See bin/bootstrap_build.sh.

Export Control Notice

This distribution uses cryptographic software and may be subject to export controls. Please refer to EXPORT_CONTROL.md for more information.

Description
Apache Impala
Readme 256 MiB
Languages
C++ 49.6%
Java 29.9%
Python 14.6%
JavaScript 1.4%
C 1.2%
Other 3.2%