Commit Graph

354 Commits

Author SHA1 Message Date
Lenni Kuff
7584312540 IMPALA-167: Impala should gracefully handle unsupported Hive table types 2014-01-08 10:48:56 -08:00
Lenni Kuff
ca0d23a844 IMPALA-157: Support CREATE TABLE LIKE DDL 2014-01-08 10:48:55 -08:00
Henry Robinson
8d87972695 Improve parser coverage
This patch adds support for the following SQL constructs:

  - Unary + operator
  - The ALL keyword, in SELECT ALL and SELECT aggregate_func(ALL *)
  - REAL and INTEGER as type synonyms for DOUBLE and INT respectively
  - The AS keyword after a table spec. e.g. SELECT * FROM tbl AS t0
2014-01-08 10:48:54 -08:00
Elliott Clark
ade4453a17 Allow impala hbase serdeproperties to have newlines 2014-01-08 10:48:54 -08:00
Alex Behm
be03e6c21c IMPALA-138: Error messages for unknown column types are particularly bad. 2014-01-08 10:48:53 -08:00
Alex Behm
b72c6ab71b IMPALA-152: SELECT ... ORDER BY without specifying a table name fails with IllegalStateException 2014-01-08 10:48:52 -08:00
Alex Behm
a01573af63 IMPALA-65: Add MySQL-style string literals with escaping. 2014-01-08 10:48:51 -08:00
Nong Li
0df9476be1 Parquet data loading. 2014-01-08 10:48:48 -08:00
Alex Behm
f4d961a241 IMPALA-151: bad DCHECK on "select count(foo, bar)" queries 2014-01-08 10:48:47 -08:00
Lenni Kuff
6a7b7ea2e0 IMPALA-149: Serialize Hive Metastore metadata loading
This change serializes the Hive Metastore metadata loading code to work around HIVE-3521
2014-01-08 10:48:45 -08:00
Nong Li
6e293090e6 Parquet writer.
Change-Id: I7117b545e3d3a7803a219234ad992040a6c7c4ec
2014-01-08 10:48:44 -08:00
Alexander Behm
39e443407b IMPALA-136: GROUP BY float/double. 2014-01-08 10:48:43 -08:00
Nong Li
62b4afbde4 Catalog loading now accesses volume id internals to get at disk id. 2014-01-08 10:48:42 -08:00
Marcel Kornacker
d7bfe6c68d IMPALA-144: partition pruning for arbitrary predicates that are fully bound by partition columns
This makes partition pruning more effective by extending it to predicates that are fully bound by the partition column,
e.g., '<col> IN (1, 2, 3)' will also be used to prune partitions, in addition to equality and binary comparisons.
2014-01-08 10:48:41 -08:00
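The pruning rule described in d7bfe6c68d can be illustrated with a minimal sketch. This is not Impala's planner code: the function name `prune_partitions`, the dictionary representation of a partition, and the `(columns, function)` predicate pairs are all assumptions made for illustration. A predicate is used for pruning only when every column it references is a partition column.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical representation: a partition maps partition-column names to values.
Partition = Dict[str, int]
# Hypothetical predicate: (columns it references, function evaluating it).
Predicate = Tuple[Iterable[str], Callable[[Dict[str, int]], bool]]


def prune_partitions(
    partitions: List[Partition],
    partition_cols: Iterable[str],
    predicates: List[Predicate],
) -> List[Partition]:
    """Keep only partitions that satisfy every fully bound predicate."""
    part_cols = set(partition_cols)
    # A predicate is "fully bound" when all of its columns are partition columns.
    bound = [fn for cols, fn in predicates if set(cols) <= part_cols]
    # Predicates touching non-partition columns are skipped here; they would
    # still have to be evaluated against rows at scan time.
    return [p for p in partitions if all(fn(p) for fn in bound)]


partitions = [{"year": 2013, "month": m} for m in range(1, 13)]
predicates = [
    (["month"], lambda p: p["month"] in (1, 2, 3)),  # IN predicate, fully bound
    (["amount"], lambda r: r["amount"] > 10),        # not bound, ignored for pruning
]
kept = prune_partitions(partitions, ["year", "month"], predicates)
```

In this sketch the `IN` predicate prunes partitions in the same way an equality or binary comparison would, which is the extension the commit message describes.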
Lenni Kuff
d57440e87d Allow column comments for CREATE TABLE and DESCRIBE <table> statements 2014-01-08 10:48:37 -08:00
Lenni Kuff
f1fc449e93 Fix analysis error message when no matching function is found with the given args 2014-01-08 10:48:36 -08:00
Alex Behm
8453d0aa5c IMPALA-83: Cast of string literal to boolean fails analysis. 2014-01-08 10:48:36 -08:00
Lenni Kuff
9f71374875 IMPALA-102: Add support for CREATE TABLE ... PARTITIONED BY (col1, col2) 2014-01-08 10:48:35 -08:00
Henry Robinson
71e6d81d1b IMP-261: Clean up network address handling 2014-01-08 10:48:33 -08:00
Marcel Kornacker
77f4fc8cf9 Adding memory limits
- new class MemLimit
- new query flag MEM_LIMIT
- implementation of impalad flag mem_limit

Still missing:
- parsing a mem limit spec that contains "M/G", as in: 1.25G
2014-01-08 10:48:33 -08:00
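As a rough illustration of the item listed as still missing in 77f4fc8cf9, a mem limit spec such as "1.25G" could be parsed along the following lines. This sketch is not taken from the Impala source. The function name `parse_mem_limit`, the use of binary multipliers (1 M = 1024 * 1024 bytes), and the treatment of a bare number as bytes are all assumptions.

```python
def parse_mem_limit(spec: str) -> int:
    """Parse a spec such as '1.25G', '512M' or '1024' into a byte count.

    Assumptions (not confirmed by the commit): M and G are binary
    multipliers, and a spec without a suffix is already in bytes.
    """
    multipliers = {"M": 1024 ** 2, "G": 1024 ** 3}
    text = spec.strip()
    if not text:
        raise ValueError("empty mem limit spec")
    suffix = text[-1].upper()
    if suffix in multipliers:
        number, factor = text[:-1], multipliers[suffix]
    else:
        number, factor = text, 1
    value = float(number)  # raises ValueError on malformed input
    if value < 0:
        raise ValueError("mem limit must not be negative")
    return int(value * factor)
```

Under these assumptions, `parse_mem_limit("1.25G")` yields 1342177280 bytes.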
Lenni Kuff
87d8f79efe Add support for CREATE TABLE ... STORED AS PARQUETFILE 2014-01-08 10:48:32 -08:00
Lenni Kuff
8184622364 Fix expr-test break due to "IF" becoming a keyword 2014-01-08 10:48:30 -08:00
Lenni Kuff
1cd847c856 IMPALA-81: Add support for CREATE/DROP DATABASE/TABLE
This adds Impala support for CREATE/DROP DATABASE/TABLE. With this change, Impala
supports creating tables in the metastore stored as text, sequence, and rc file format.
It currently only supports creating unpartitioned tables and tables stored in HDFS.
2014-01-08 10:48:30 -08:00
Marcel Kornacker
c02d25baa8 IMPALA-20: Limit clause in inline view not handled correctly by planner
- this adds a SelectNode that evaluates conjuncts and enforces the limit
- all limits are now distributed: enforced both by the child plan fragment and
  by the merging ExchangeNode
- all limits w/ Order By are now distributed: enforced both by the child plan fragment and
  by the merging TopN node
2014-01-08 10:48:29 -08:00
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Alan Choi
9c11c0ce2d HiveServer2 clean up
This patch:

1. uses boost uuid
2. adds a unit test for HiveServer2 metadata operations
3. adds a JDBC metadata unit test
4. implements the remaining HiveServer2 operations: GetFunctions and GetTableTypes
5. removes the in-process impala server from fe-support
2014-01-08 10:48:06 -08:00
Skye Wanderman-Milne
8b87099998 IMPALA-2: Support for Avro data files
Adds HdfsAvroScanner, as well as modifies the sequence scanners to be more general.
2014-01-08 10:48:05 -08:00
Alan Choi
73c8ee3d96 IMPALA-18 Ignore hidden files prefixed with . or _ 2014-01-08 10:48:00 -08:00
Alan Choi
fee1b47e33 IMPALA-26 Check HDFS direct read and block location tracking 2014-01-08 10:47:59 -08:00
Lenni Kuff
d2e4776731 Support passing snapshot file to buildall, add script to run all tests, remove old tests 2014-01-08 10:47:59 -08:00
Lenni Kuff
57a8603150 Fix FE build break due to change in Catalog ctor signature 2014-01-08 10:47:57 -08:00
Henry Robinson
84e35d591c IMPALA-74: Read fs name and port from Hadoop's configuration 2014-01-08 10:47:56 -08:00
Lenni Kuff
557b59a80d IMPALA-58: Impala should retry connection to metastore instead of dying 2014-01-08 10:47:56 -08:00
Alan Choi
251a8a2bf1 IMP-57: rename fe_port to beeswax_port 2014-01-08 10:47:53 -08:00
Nong Li
7001fb103e Move Impala to CDH4.2 RC2 2014-01-08 10:47:50 -08:00
Lenni Kuff
900ddb5cbf IMP-48: Use separate HiveMetaStoreClient connections across requests
This change modifies the Catalog to create HiveMetaStoreClient connections on a
per-request basis. This resolves an issue when Impala is talking over Thrift to a Hive
Metastore Service. The Hive Thrift client is not thread safe so concurrent metadata loads
would fail.

To reduce the overhead associated with creating a new connection each time, a simple
connection pool was added. The pool is initialized with a fixed number of connections
and new connections are added on an as-needed basis.
2014-01-08 10:47:38 -08:00
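The pooling behaviour described in 900ddb5cbf (a fixed number of connections created up front, with more added on demand) can be sketched as follows. This is a generic illustration and not the Catalog's actual Java implementation. The class name `ConnectionPool`, its methods, and the `factory` callable are hypothetical.

```python
import queue
from typing import Any, Callable


class ConnectionPool:
    """Hands each request its own connection so no client is shared concurrently."""

    def __init__(self, factory: Callable[[], Any], initial_size: int) -> None:
        self._factory = factory
        self._idle: "queue.Queue[Any]" = queue.Queue()  # thread-safe FIFO
        # Pre-create a fixed number of connections.
        for _ in range(initial_size):
            self._idle.put(factory())

    def acquire(self) -> Any:
        """Return an idle connection, or create a new one if none is idle."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return self._factory()

    def release(self, conn: Any) -> None:
        """Return a connection to the pool for reuse by a later request."""
        self._idle.put(conn)
```

A caller would pair each `acquire()` with a `release()` once the metadata request finishes. Because a connection is held by only one request at a time, the non-thread-safe client is never used concurrently.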
Marcel Kornacker
63e3cd0279 Adding query option DEBUG_ACTION 2014-01-08 10:47:37 -08:00
Lenni Kuff
1896701399 IMPALA-44: Database names are case sensitive 2014-01-08 10:47:34 -08:00
Henry Robinson
2c0d10dd15 IMP-722: Fix normal Findbugs warnings 2014-01-08 10:47:31 -08:00
Lenni Kuff
42eef84200 IMPALA-30: Make Catalog Db and Table caches thread safe 2014-01-08 10:47:30 -08:00
Lenni Kuff
1a2695781d Add support for targeting JDBC via run-workload and add Impala Jdbc Client tool 2014-01-08 10:47:29 -08:00
Alan Choi
be98df19c8 HiveServer2
This patch implements the HiveServer2 API.

We have tested it with Lenni's patch against the tpch workload. It has also
been tested manually against Hive's beeline with queries and metadata operations.

All of the HiveServer2 code is implemented in impala-hs2-server.cc. Beeswax
code is refactored to impala-beeswax-server.cc.

HiveServer2 has a few more metadata operations. These operations go through
impala-hs2-server to ddl-executor and then to FE. The logic is implemented in
fe/src/main/java/com/cloudera/impala/service/MetadataOp.java.

Because of the Thrift union issue, I had to modify the generated C++ files.
Therefore, all of the HiveServer2 Thrift-generated C++ code is checked into
be/src/service/hiveserver2/. Once the Thrift issue is resolved, I'll remove
these files.

Change-Id: I9a8fe5a09bf250ddc43584249bdc87b6da5a5881
2014-01-08 10:47:24 -08:00
Henry Robinson
7ba437a52e Code changes to build against thrift 0.9.0 in thirdparty/ 2014-01-08 10:47:22 -08:00
Marcel Kornacker
b1b2e659af IMP-573: Adding block location cache to HdfsTable in order to speed up getBlockMetadata().
- we're now pre-computing and caching the result of HdfsTable.getBlockMetadata() on a per-partition basis
- to make the cache more compact, we're collecting pools of unique strings:
  file names are collected per partition and ip/port strings are collected per table
2014-01-08 10:47:19 -08:00
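The "pools of unique strings" idea in b1b2e659af can be sketched as a simple interning table: each distinct string is stored once, and cached block entries refer to it by index. This is an illustration only and not HdfsTable's actual data structure. The class name `StringPool` and its methods are hypothetical, and the file and host values in the example are made up.

```python
from typing import Dict, List


class StringPool:
    """Stores each distinct string once and hands out small integer ids."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._strings: List[str] = []

    def add(self, value: str) -> int:
        """Return the id for value, registering it on first sight."""
        idx = self._ids.get(value)
        if idx is None:
            idx = len(self._strings)
            self._ids[value] = idx
            self._strings.append(value)
        return idx

    def lookup(self, idx: int) -> str:
        """Return the string previously registered under idx."""
        return self._strings[idx]


# Hypothetical usage: file names pooled per partition, ip/port strings per table.
file_names = StringPool()  # one of these per partition
hosts = StringPool()       # one of these per table
cached_blocks = [
    (file_names.add("part-00000"), hosts.add("10.0.0.1:50010")),
    (file_names.add("part-00000"), hosts.add("10.0.0.2:50010")),
    (file_names.add("part-00001"), hosts.add("10.0.0.1:50010")),
]
```

Each cached entry then holds two small integers rather than two strings, which is what keeps the cache compact when many blocks share the same file name or host.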
Skye Wanderman-Milne
0387bb92bf IMPALA-14: Files with .gz extension reported as 'not supported'
Fix logic in HdfsTable. Rename TestUnsupportedFormats to TestCompressedFormats and refactor with new tests.
2014-01-08 10:47:13 -08:00
Nong Li
02c329b97a Update RC files to use io mgr and remove scanner support for non-io mgr. 2014-01-08 10:47:11 -08:00
Marcel Kornacker
bf56c21c1b IMP-618
Adding DEFAULT_ORDER_BY_LIMIT query option.
Also removing deprecated PARTITION_AGG query option.
2014-01-08 10:47:04 -08:00
Marcel Kornacker
e0525b515c extra debug output to diagnose planner slowdown on table with large number of partitions 2014-01-08 10:47:04 -08:00
Nong Li
b575b08357 Fix planner to reject compressed text formats. 2014-01-08 10:47:01 -08:00
Henry Robinson
0dbd8317a9 Improve debug webpage presentation 2014-01-08 10:47:00 -08:00