impala

mirror of https://github.com/apache/impala.git synced 2026-01-02 03:00:32 -05:00

Author	SHA1	Message	Date
casey	2351266d0e	Replace single process mini-dfs with multiple processes This should allow individual service components, such as a single nodemanager, to be shut down for failure testing. The mini-cluster bundled with hadoop is a single process that does not expose the ability to control individual roles. Now each role can be controlled and configured independently of the others. Change-Id: Ic1d42e024226c6867e79916464d184fce886d783 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1432 Tested-by: Casey Ching <casey@cloudera.com> Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2297 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-04-23 18:24:05 -07:00
Nong Li	87295a4e06	Decimal implementation. This patch implements decimal support for text based formats. Change-Id: I8e2c9e512ed149fe965216a72cb21fffd4f18e75 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1669 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2238 Tested-by: jenkins	2014-04-14 21:07:32 -07:00
Lenni Kuff	aa0b7a35f5	IMPALA-880: COMPUTE STATS should update partitions in batches When updating partition metadata as part of COMPUTE STATS we would previously attempt to update all partitions at once. This could lead to HMS socket timeouts and also could run into issues if there were > 32K partitions. In this change we now update the partitions in batches, with a max size of 500 partitions per batch. We also compare whether the row count has changed and only update partitions that have been modified. Change-Id: If7bfcc30f86fc2fdd79855b981067ac29a47b5e1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1913 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1918	2014-03-14 19:20:12 -07:00
ishaan	9e043e862c	Fix run-hbase.sh to correctly pick up the classpath. We run wait-for-hbase-master.py after starting hbase to account for a race between the master and region server. This script has not been working for some time. It caused no ill effects since the said race was absent. However, the race has manifested itself again, so the script needs to be fixed. Setting the correct classpath does so. Change-Id: I783a7473cfd24a9cb66711f5428f7052ceb96282 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1756 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-03-05 01:04:56 -08:00
ishaan	00724a47da	Prefix the path to the local core-site to the classpath used by minillama With a recent upstream change, a core-site.xml was introduced in a YARN test jar pulled in by thirdparty. This causes MiniLlama to ignore options set in fe/src/test/resources/core-site.xml. The problem manifests itself with the MiniDfsCluster starting on an arbitrary port, but it would have also caused a lot of tests to fail as none of the compression codecs are pulled in. This change prepends the classpath used by minillama with the path to the internal core-site. Change-Id: Iee267fe12e02301baec059a1f7469288c038d6fa Reviewed-on: http://gerrit.ent.cloudera.com:8080/1739 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-03-04 09:59:50 -08:00
Lenni Kuff	bf16b5cd0d	IMPALA-749: Fetch partitions in batches, rather than all at once. This updates how Impala fetches partition metadata from the Hive Metastore to fetch partitions in batches, rather than all at once. This helps reduce the load on the HMS and also lets Impala scale to above 32K partitions. The downside is that it may require additional RPCs to get all the partitions. This is done by first querying the metastore to get all the partition names that exist, then splitting the list of names into separate batches to get the actual partition metadata. Impala uses a default size of 1000 partitions per batch, but it can be configured by setting the 'hive.metastore.batch.retrieve.table.partition.max' parameter in the hive-site.xml config file. Change-Id: Ide0ec30ef8a9e00f79c26551aa8e5e7814c73034 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1662 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1698	2014-02-28 22:30:45 -08:00
Alex Behm	9cabee4a71	Wait for the Metastore to come up before starting HiveServer2. Change-Id: Ic8e29efe63f6745e1ff44248657cbd7882bb16d9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1626 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1670 Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-02-25 21:05:33 -08:00
Alex Behm	8223e1e44b	Avoid Hive replication bug (CDH-17414) by 'warming up' HiveServer2 after it starts. The purpose of this patch is to avoid CDH-17414 which causes data files loaded with Hive to incorrectly have a replication factor of 1. When using beeline this problem only appears to occur immediately after creating the first HBase table since starting HiveServer2, i.e., subsequent loads seem to function correctly. This patch adds a new script that creates an external HBase table in Hive to 'warm up' HiveServer2 immediately after it is started. Subsequent loads should assign a correct replication factor. Change-Id: Ic54c9401b67b748a8848d19f82b8e7df9535e845 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1640 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-02-25 17:33:53 -08:00
Lenni Kuff	b4f5c1edcf	Enable lazy loading of table metadata for the CatalogService/Impalad This change adds support for lazy loading of table metadata to the CatalogService/Impalad. The way this works is that the CatalogService initially sends out an update with only the databases and table names (wrapped as IncompleteTables). When an Impalad encounters one of these tables, it will contact the catalog service to get the metadata, possibly triggering a metadata load if the catalog server has not yet loaded this table. With these changes the catalog server starts up in just seconds, even for large metastores since it only needs to call into the metastore to get the list of tables and databases. The performance of "invalidate metadata" also improves for the same reason. I also picked up the catalog cleanup patch I had to make the APIs a bit more consistent and remove the need for using a LoadingCache for databases. This also fixes up the FE tests to run in a more realistic fashion. The FE tests now run against catalog object received from the catalog server. This actually turned up some bugs in our previous test configuration where we were not running with the correct column stats (we were always running with avgSerializedSize = slotSize). This changed some plans so the planner tests needed to be updated. Still TODO: This does not include the changes to perform background metadata loading. I will send that out as a separate patch on top of this. Change-Id: Ied16f8a7f3a3393e89d6bfea78f0ba708d0ddd0e Saving changes Change-Id: I48c34408826b7396004177f5fc61a9523e664acc Reviewed-on: http://gerrit.ent.cloudera.com:8080/1328 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/1338 Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-21 21:43:29 -08:00
Nong Li	04b501d3a1	[CDH5] Collect metadata for cached blocks. Change-Id: I81026de2f9a08553dc15e07090b8297120aa7462 (cherry picked from commit 69414f67b20016e49b739a46d6e2b4b57e1d1a3c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1252 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-15 15:12:20 -08:00
Nong Li	53d7bbb97a	[CDH5] Impala changes for updated thirdparty components. Changes include: - version changes in impala-config - version changes in various loading scripts - hbase jars are no longer in hive/lib - mini-llama script changes - updates due to sentry api changes - JDBC tests disabled - unsupported types tests disabled. Change-Id: If8cf1b7ad8e22aa4d23094b9a4b1047f7e9d93ee	2014-01-15 15:12:13 -08:00
Alex Behm	c70905628b	Using MiniLlama's --write-hdfs-conf to dump the MiniDfs conf for our test setup. Change-Id: I238f375bda4ef95fa3d5ae9a29bd1dfc2aa3e401	2014-01-15 15:12:06 -08:00
Alex Behm	760750af27	Enforcing reserved memory resources via mem limits. Fixed codepath with rm disabled. Set enable_rm to false by default. Change-Id: I3bf2d0525d91243ec3c0ea048b0c03680befcda2 Conflicts: be/src/runtime/runtime-state.cc	2014-01-15 15:12:05 -08:00
Alex Behm	dc7b398bd3	Impala reserves resources from YARN via Llama. Impala reserves resources from YARN via Llama and handles resources preemptions by cancelling affected queries. Adds the Impala Resource Broker for interacting with Llama. Refactors scheduler and coordinator to move fragment-to-host assignment logic into scheduler. Local test setup uses MiniLlama. Change-Id: Ic7b0fe43de52d30f4207b4e65cce7e6a294e54e1	2014-01-15 15:12:04 -08:00
Alex Behm	fc6ecd39e5	[CDH5] Fixed issue with data loading using JDK7 and Hive (HIVE-5068). Fixed missing dependency in testdata for HBase region splitting. Change-Id: Iab002f652bc1b1c2f8ce60b7505f592eedcb9cc0	2014-01-15 15:11:32 -08:00
Alex Behm	60003ad211	[CDH5] Changes to make Impala work on CDH5. Mostly fixing up dependency versions. Minor code changes to address HBase API changes. Change-Id: Icbbeb13eefa29e38286328d45600117a383cd106	2014-01-15 15:11:23 -08:00
Skye Wanderman-Milne	561da008c7	IMPALA-729: fix resource management in Parquet scanner for multiple row groups We weren't attaching resources to the row batch when starting a new row group, so it was possible for string data to be overwritten. This patch removes CloseStreams() and merges its functionality with AttachCompletedResources() so it's not possible to destroy streams without transferring the resources first. It also merges and removes ScannerContext::Close(). Also adds test cases for IMPALA-720. Change-Id: Ia8f40c7d39d8702716f1d337fe797e2696bd0fcb	2014-01-08 10:56:26 -08:00
Lenni Kuff	fbe79fc47b	Use separate log files for each of our mini-cluster services Also adds a bit more logging on which individual services are starting. Change-Id: I53f12e1825fbf738e2fb8325874c3126e55f3f44 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1147 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:54:37 -08:00
Alex Behm	c6397ca1e3	Revert "Revert to FROM-clause order if any table is lacking stats." This reverts commit 7e84cbe3bab9bf30a57ac58d9ef525ebc10a7b7a. Change-Id: I89d55ca2bcb8eb6eddc244d3e7b005074d04c26a Reviewed-on: http://gerrit.ent.cloudera.com:8080/1104 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:29 -08:00
Alex Behm	df0b28d163	Revert to FROM-clause order if any table is lacking stats. Change-Id: I7d09c0f393e2bfeefa386845fc6bbba4ab6c8812 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1095 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:28 -08:00
Skye Wanderman-Milne	9e17042185	Allow zero bit width dict/RLE decoders. This allows us to read single-value dictionary-encoded columns generated by parquet-mr. Change-Id: I80903d910d0cc3a3e4ebf02e34212d868e94feb4 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1098 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:27 -08:00
Skye Wanderman-Milne	de531e15bd	IMPALA-694: Allow Impala to read files produced by parquet-mr version <= 1.2.8 parquet-mr had a bug where it didn't include the dictionary page's header in the total column size. We now compensate for this by detecting these files and padding the scan range length. This required changing how the scanner detects when it's finished: it now counts the number of rows rather than checking eosr (since the scan range may be longer than the column). Change-Id: Id9933808b965003c0c3b3aa78c32fe29a0c4bcbe Reviewed-on: http://gerrit.ent.cloudera.com:8080/1097 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:27 -08:00
Lenni Kuff	e63cc59a94	Add partitioned tpcds planner tests (SQL-92 style joins) Adds the TPCDS queries as planner tests and fixes a few small issues with the Planner test file parser. This adds the TPC-DS queries using SQL-92 style joins that have a hand optimized (although not perfect) join order. Change-Id: I2d81e66af740b2d826b8ebd0c5ba8553b5faf0a2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1019 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:26 -08:00
Skye Wanderman-Milne	acdc792355	IMPALA-695: Use the local path of Hive UDF jars in the FE. The FE was creating class loaders with the HDFS locations of Hive UDF libs, rather than the local locations created by the BE. Our tests still passed since we only used UDFs already on the classpath (e.g. Hive builtins). Change-Id: Idbe9c98ad6adb84b70cb44efbf9ad0afc53366ca Reviewed-on: http://gerrit.ent.cloudera.com:8080/1081 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:25 -08:00
Skye Wanderman-Milne	b54d16dabd	IMPALA-679: Append hash of HDFS path to filename in CopyHdfsFile() to avoid collisions. Change-Id: Ia84fa81fe043a9604248d66ed963ef3f91b0601e Reviewed-on: http://gerrit.ent.cloudera.com:8080/1018 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:22 -08:00
Lenni Kuff	0bae3978c9	Update compute-stats.py to execute using Impala Updates our compute stats script to execute using Impala. This allows us to easily compute stats on all tables in a database or all tables in the metastore. The updated stats caused one of the TPCH plans to change so this also updates the TPCH planner test results. Change-Id: I17e5dcd1036a35e40eb4eb2c8e4a20702db9049c Reviewed-on: http://gerrit.ent.cloudera.com:8080/1024 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:18 -08:00
Lenni Kuff	76fa3b2ded	Update DDL to support 'STORED AS PARQUET' and 'STORED AS AVRO' syntax This change updates our DDL syntax support to allow for using 'STORED AS PARQUET' as well as 'STORED AS PARQUETFILE'. Moving forward we should prefer the new syntax, but continue to support the old. I made the same change for 'AVROFILE', but since we have not yet documented the 'AVROFILE' syntax I left out support for the old syntax. Change-Id: I10c73a71a94ee488c9ae205485777b58ab8957c9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1053 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:18 -08:00
Nong Li	c1a64d6863	Add kill-mini-llama to CDH4 branch. This makes it easier to switch between our branches and is a no-op for those of us staying on CDH4. Change-Id: Ic07eb8a7ba7e48db118c06c221aabe5e124f3bfb Reviewed-on: http://gerrit.ent.cloudera.com:8080/1033 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:54:17 -08:00
ishaan	fcdcf1a9d8	Parallelize data loaded through Impala to speed up data loading. Currently, we execute all the queries involved in data loading serially. This change creates a separate .sql file for each file format, compression codec and compression scheme combination, and executes all the files in parallel. Additionally, we now store all the .sql files (independent of workload) in $IMPALA_HOME/data_load_files/<dataset_name>. Note that only data loaded through Impala is parallelized, data loaded through hive and hbase remains serial. On our build machines, the time taken to load all the data from snapshot was on the order of 15 minutes. Change-Id: If8a862c43f0e75b506ca05d83eacdc05621cbbf8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/804 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:53 -08:00
Lenni Kuff	498c2529d4	Test CR: Change spacing in run-all.sh Change-Id: I2362799213a7faca3892e38fb874bfbbd0c1718f Reviewed-on: http://gerrit.ent.cloudera.com:8080/803 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:53:50 -08:00
Lenni Kuff	8b2acf5c22	IMPALA-425: Detect read-only tables and disable INSERT/LOAD operations on these tables With this change we now detect if a table is read-only and disable INSERT/LOAD operations on these tables. A table is read-only if Impala does not have write permission on the HDFS base directory of the table or any one of the partition directories (if the table is partitioned). Change-Id: I25515b2d0ffb7fe297359437fd937a3d6e0406a0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/713 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:53:37 -08:00
Alex Behm	51e914e911	Use hive-exec instead of hive-builtin because hive-builtin does not exist in CDH5 Hive. Change-Id: I11993c7eebc9f5f07f112810d7e81d07ce157193 Reviewed-on: http://gerrit.ent.cloudera.com:8080/715 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com>	2014-01-08 10:53:33 -08:00
Lenni Kuff	72e211ca4a	Use Hive Metastore Service instead of HiveServer 1 in test infrastructure Change-Id: I4e2ba02b2101bae95d196ab13f9453e1b3a9d7be Reviewed-on: http://gerrit.ent.cloudera.com:8080/689 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:26 -08:00
Nong Li	4800995d44	Add execution for Hive UDFs. Change-Id: I6a5ad96fed77e2b8a2701f21a917a8eb7a11d500 Reviewed-on: http://gerrit.ent.cloudera.com:8080/458 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:25 -08:00
Nong Li	6b9a7de02e	Add symbol resolution during analysis for create function stmts. Before this, we had to specify the entire mangled symbol. This can be quite long and quite tedious (take a look at some of the create UDA test cases that specify all the symbols). This patch adds some code to convert from the user function signature to the mangled name. This means the user can specify the unmangled name and we can do the symbol lookup. The mangling rules are pretty convoluted but if it is messed up, the user can always specify the full symbol. Some other minor cleanup in: - JNI from FE to BE - UDFs/UDAs that are loaded as test data Change-Id: I733dbf3a72cb7b06221c27e622d161bcca0d74a8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/624 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:20 -08:00
ishaan	8a43426879	Sleep after starting the hiveserver2 service to guard against it not starting on time. Change-Id: I9a0de1cc63089cba2f9b59942ee45abc44b8662e Reviewed-on: http://gerrit.ent.cloudera.com:8080/643 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:53:17 -08:00
Lenni Kuff	b07a3ccfd6	Use an external Hive Metastore Service for local test runs Using an external Hive Metastore Service for local test runs has a number of benefits. Some of the benefits are that it helps separate the metastore logs from the impala logs, and that it is more representative of what is on real cluster environments. It also may help with some of the concurrency issues that we have been seeing when running directly against the backend database since we no longer spin up an in-process metastore server for each client connection. The metastore is started by running "run-hive-server.sh" which is invoked as part of "run-all.sh". Change-Id: If60fa97aa38e4ad5cf578b9b409eeea1e0e29375 Reviewed-on: http://gerrit.ent.cloudera.com:8080/628 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:15 -08:00
Skye Wanderman-Milne	b7f83bcd73	Add support for LLVM IR UDFs. This patch also adds a number of improvements to NativeUdfExpr. Highlights include: * Correctly handling the lowering of AnyVal struct types (required for ABI compatibility) * A rudimentary library cache for reusing handles produced by dlopen * More complicated test cases Change-Id: Iab9acdd7d7c4308e5d7ee3210f21b033fda5a195 Reviewed-on: http://gerrit.ent.cloudera.com:8080/540 Tested-by: jenkins Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:53:03 -08:00
Nong Li	e5ed8e4105	Move minicluster_xml_conf to HADOOP_CONF_DIR. The current location gets deleted if you rebuild, making you have to restart mini dfs. Change-Id: If71b144534255fa8df2bfa187c0814ffdf28463e Reviewed-on: http://gerrit.ent.cloudera.com:8080/550 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:03 -08:00
Lenni Kuff	79cdeac3d6	Consolidate test cluster under IMPALA_HOME/cluster_logs + store logs during data loading Change-Id: I8f6239e4ccb0515c85bf80193a475788fb18dedb Reviewed-on: http://gerrit.ent.cloudera.com:8080/518 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-01-08 10:52:56 -08:00
Skye Wanderman-Milne	fd99db0300	First pass at UdfExpr. Change-Id: I517bf56541749b5c2459554821c7bf838239fdf0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/439 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:52:50 -08:00
Henry Robinson	a46276325c	IMPALA-415: Don't delete hidden files in the root directory for INSERT OVERWRITE INSERT OVERWRITE into an unpartitioned table is supposed to remove all data files from the root. This should not include hidden files or directories. This patch excludes hidden files from deletion, and adds a test case. Partition directories are still removed in their entirety: the cost of statting a large number of files and directories rather than issuing a single "rm -rf" outweighs the benefits of preserving hidden files for now. Hive does not preserve hidden files in either configuration. Change-Id: Ia73e55e011c26c88f14745075210cf359764e3c1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/418 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:52:50 -08:00
Lenni Kuff	a1f2f72f49	Add Impala DDL support for creation of AVRO tables + support for CREATE/ALTER SERDEPROPERTIES This change adds Impala DDL support for creation of AVRO tables. Additionally, it adds Impala support for CREATE and ALTER SERDEPROPERTIES which are used when creating Avro backed tables. This syntax is not exactly the same as the Hive support since it introduces a new fileformat (AVROFILE) that implies the needed Serialization library, input format, and output format. Change-Id: I5047e419198a89599e9d014fdedfee1a20437a7d Reviewed-on: http://gerrit.ent.cloudera.com:8080/464 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:52:48 -08:00
Lenni Kuff	d6d1557fe7	Capture cluster logs with each test run / don't use mvn for starting cluster services Change-Id: I708b547e49d035c5f029ea86119cc844ccbc5643 Reviewed-on: http://gerrit.ent.cloudera.com:8080/404 Tested-by: jenkins Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-01-08 10:52:40 -08:00
Lenni Kuff	9f54242941	Add retry loop around split-hbase to fix build breaks Change-Id: I539407ce05d705b6b4e88d0791fc4ec236c79c80 Reviewed-on: http://gerrit.ent.cloudera.com:8080/399 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:52:39 -08:00
ishaan	6735e3983f	Fix build failure because of hbase data loading. Change-Id: I796656332c3733a1ffdc338d206009efa6c451ac Reviewed-on: http://gerrit.ent.cloudera.com:8080/360 Tested-by: jenkins Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:52:37 -08:00
ishaan	53cd9eadab	Treat HBase as a file format for functional tests Change-Id: Ia01181a1e10eb108419122d347e9d869a69e8922 Reviewed-on: http://gerrit.ent.cloudera.com:8080/102 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:52:36 -08:00
Lenni Kuff	f264db1647	Automatically force load partitioned tables to ensure valid partition metadata Change-Id: Ief91102f30d4669503d473299256a74a50d8fe3c Reviewed-on: http://gerrit.ent.cloudera.com:8080/261 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:52:17 -08:00
Lenni Kuff	17ed6ea177	Partition TPC-DS dataset and add additional TPC-DS workload queries Change-Id: I5410e68fdfd818a8287e0974332c3e36c344c300 Reviewed-on: http://gerrit.ent.cloudera.com:8080/99 Tested-by: jenkins <kitchen-build@cloudera.com> Reviewed-by: Marcel Kornacker <marcel@cloudera.com>	2014-01-08 10:52:13 -08:00
Skye Wanderman-Milne	6e7406df8b	IMPALA-502: Impala does not return NULL for case where table has extra string column and data does not (it returns an empty string) Change-Id: I0cfe5ce5fc279d46610a3cc191a501ccbc335296 Reviewed-on: http://gerrit.ent.cloudera.com:8080/127 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:52:02 -08:00


176 Commits