Mirror of https://github.com/apache/impala.git, synced 2025-12-25 02:03:09 -05:00
c9cb00f4a19a3eaf9fe93ffe69ec5ccab8996843
We should only need to recreate the Sentry Policy DB when formatting a cluster. Previously buildall.sh always tried to create the database regardless of whether it was needed. E.g. if a machine was just building Impala without running tests, there is no need to create any of the test databases.

This fixes a regression when running buildall.sh on a machine without postgres set up.

Change-Id: I35bb1cb275bb4da3f91f496010a7f6ee4daa2792
Reviewed-on: http://gerrit.cloudera.org:8080/1782
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
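The gating described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual buildall.sh: the variable `FORMAT_CLUSTER` and the functions `create_sentry_policy_db` and `maybe_create_policy_db` are hypothetical names, and the database step is stubbed out with an `echo` so the sketch runs on a machine without postgres.

```shell
#!/bin/sh
# Hypothetical sketch: recreate the policy DB only when the cluster is formatted.
set -eu

# Assumed flag (not taken from buildall.sh): 1 means "format the cluster".
FORMAT_CLUSTER="${FORMAT_CLUSTER:-0}"

# Stub standing in for the real database setup, which would require postgres.
create_sentry_policy_db() {
  echo "recreating Sentry Policy DB"
}

# Run the database step only when formatting was requested; otherwise skip it.
maybe_create_policy_db() {
  if [ "$1" -eq 1 ]; then
    create_sentry_policy_db
  else
    echo "skipping Sentry Policy DB"
  fi
}

maybe_create_policy_db "${FORMAT_CLUSTER}"
```

With the flag left at its default, a build-only run never touches the database step, which is the behavior the commit message describes for machines that build Impala without running tests.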
Welcome to Impala
Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters.
Impala is a modern, massively distributed, massively parallel C++ query engine that lets you analyze, transform, and combine data from a variety of data sources:
- Best-of-breed performance and scalability.
- Support for data stored in HDFS, Apache HBase and Amazon S3.
- Wide analytic SQL support, including window functions and subqueries.
- On-the-fly code generation using LLVM, producing CPU-efficient code tailored to each query.
- Support for the most commonly used Hadoop file formats, including Apache Parquet.
- Apache-licensed, 100% open source.
More about Impala
To learn more about Impala as a business user, or to try Impala live or in a VM, please visit the Impala homepage.
If you are interested in contributing to Impala as a developer, or learning more about Impala's internals and architecture, visit the Impala wiki.
Languages
- C++: 49.6%
- Java: 29.9%
- Python: 14.6%
- JavaScript: 1.4%
- C: 1.2%
- Other: 3.2%