impala

mirror of https://github.com/apache/impala.git synced 2026-01-04 00:00:56 -05:00

Author	SHA1	Message	Date
casey	2351266d0e	Replace single process mini-dfs with multiple processes This should allow individual service components, such as a single nodemanager, to be shutdown for failure testing. The mini-cluster bundled with hadoop is a single process that does not expose the ability to control individual roles. Now each role can be controlled and configured independently of the others. Change-Id: Ic1d42e024226c6867e79916464d184fce886d783 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1432 Tested-by: Casey Ching <casey@cloudera.com> Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2297 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-04-23 18:24:05 -07:00
Nong Li	85be9a5050	Update bin/make* -notests to include other artifacts for packages. Change-Id: I95e95f0a2e2131875b95d6676620bec7117b7f8a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2250 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-04-16 00:37:00 -07:00
Nong Li	f9dd32724c	Cleanup build scripts. Consolidated our build scripts and added the -notests option which skips building the BE tests. Change-Id: Ida6aa064b7fe47e535c142b9af92b7c158e83c32 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2043 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2201	2014-04-13 17:11:39 -07:00
Lenni Kuff	d101ef86e2	[CDH5] Bump version to 1.4.0-cdh5-INTERNAL Change-Id: I0a0334084e444c948f1718133afb2d7246dde414 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2193 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-04-11 16:03:09 -07:00
Henry Robinson	8e5848eaf8	RM fixes to get tests passing * One last NotifyThreadUsageChange() mismatched pair * Don't set resource in plan fragment params if there isn't a resource available. This fixes the problem where if no fragment with resources was assigned to the same node as the coordinator, the coordinator would have a dummy resource allocation which didn't work with expansion. * Substitute #ID in all impalad arguments to start-impala-cluster.py with the 0-indexed ID of the impalad being started. This is required to have different Impala processes use different cgroups. Change-Id: If8c8fd8bef0809bdaf16115a45a9695fc2bf3e1b (cherry picked from commit c71ce45e97570b8c09900eb5ae2e26984d3306a4) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2060 Tested-by: jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-03-24 15:07:45 -07:00
Lenni Kuff	3d82c9a5d6	Bump version from 1.3.0-INTERNAL to cdh5-1.3.0 Change-Id: Ib7a37b190091a3f9eb6d6f0f560dd40aed23e231 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2031 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-20 17:22:11 -07:00
Lenni Kuff	86f69fb96f	IMP-1306: Fix build scripts to properly generate Impala version info for packaging builds The problem was that we were deleting the version.info file because the default of gen_build_version.py recently changed from --noclean to --clean. Also fixed a bug in the shell version generation and made debugging a bit easier by dumping the contents of version.info whenever it is generated. Change-Id: I764d01c9e46eed1bd39de79bf076c15afa599486 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1901 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com> (cherry picked from commit fa673b4d3342fc825ee7fa942bd254234d222906) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1910 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-14 08:45:16 -07:00
Matthew Jacobs	a283d72cdd	[cdh5] Add latest cdh5 hadoop, hbase, and hive snapshots to thirdparty Change-Id: I60c93b259a26e86aca60f2b3b5b6226eabc0b5eb	2014-03-05 01:06:09 -08:00
Lenni Kuff	8a16709265	Perform prioritized load requests for missing tables in HS2 metadata ops The HS2 metadata operations do not go through analysis() so the prioritized loading will not happen for them. Most of the HS2 metadata ops work purely on table/db names, but GetColumns() requires loading the table metadata. This patch updates MetadataOp to collect a set of missing tables and request these tables be loaded from the catalog server. The operation will wait until the tables are loaded in the local catalog before proceeding. Change-Id: I070f2a0d9194d3317f09431971be9a8dffbc7386 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1542 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1557	2014-03-01 17:16:50 -08:00
Alex Behm	3d764619f7	Run Hive data loading through beeline instead of the Hive shell. Fixes our log configuration to put the Hive logs in cluster_logs/hive. Change-Id: I5d98581e35325f2173e4b3170e36bec42d33f8f3 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1497 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1615 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-02-20 15:43:31 -08:00
Alex Behm	62338694e4	Skip generation of version and impala-ir .cc files in buildall.sh if -noclean is specified. Before this patch the -noclean option had almost no effect on the BE build time because some source files were re-generated with .py scripts regardless. This change allows ./buildall -skiptests -noclean to do a true incremental rebuild. Change-Id: Ib3af85db05bdc96a2279a22c1d49d735f2cabd4e Reviewed-on: http://gerrit.ent.cloudera.com:8080/1394 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1415	2014-01-31 13:57:13 -08:00
ishaan	01ef3ef4c1	load-data.py should exit if a bash command returns a non-zero error code. Change-Id: I2f732a276a42d2697fa55bce0f18ac89e9a6f0a1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1397 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1408 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-30 15:47:13 -08:00
Henry Robinson	5535a8a128	[CDH5] Set CDH major version to 5 Change-Id: Ibc36ed435dd36d3489d27a977bf1726bbf2927a1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1306 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-17 14:34:01 -08:00
Henry Robinson	241270044b	Add CDH_MAJOR_VERSION environment variable CDH_MAJOR_VERSION controls where HDFS data is written. In the future, we can use its value to parameterise Jenkins jobs so that the right code is run / data is generated. Change-Id: Id2957df6d708bc6c50faf7a8a609aff5f9571662 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1293 Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1305 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-17 14:33:18 -08:00
Nong Li	53d7bbb97a	[CDH5] Impala changes for updated thirdparty components. Changes include: - version changes in impala-config - version changes in various loading scripts - hbase jars are no longer in hive/lib - mini-llama script changes - updates due to sentry api changes - JDBC tests disabled - unsupported types tests disabled. Change-Id: If8cf1b7ad8e22aa4d23094b9a4b1047f7e9d93ee	2014-01-15 15:12:13 -08:00
Alex Behm	e6299b684b	Setting proper cgroup CPU shares based on the reserved resource. Change-Id: I58992b11e71ed7ad7ea7639050d74fd3eaa4d1d1	2014-01-15 15:12:07 -08:00
Alex Behm	5cfbec9139	Impala creates and manages its own CGroups instead of using the Yarn-NM provided ones. Change-Id: Id09ba2641ad33fbc109eea2dd6fe80b1863b5cac	2014-01-15 15:12:06 -08:00
Alex Behm	760750af27	Enforcing reserved memory resources via mem limits. Fixed codepath with rm disabled. Set enable_rm to false by default. Change-Id: I3bf2d0525d91243ec3c0ea048b0c03680befcda2 Conflicts: be/src/runtime/runtime-state.cc	2014-01-15 15:12:05 -08:00
Alex Behm	dc7b398bd3	Impala reserves resources from YARN via Llama. Impala reserves resources from YARN via Llama and handles resource preemptions by cancelling affected queries. Adds the Impala Resource Broker for interacting with Llama. Refactors scheduler and coordinator to move fragment-to-host assignment logic into scheduler. Local test setup uses MiniLlama. Change-Id: Ic7b0fe43de52d30f4207b4e65cce7e6a294e54e1	2014-01-15 15:12:04 -08:00
Alex Behm	fc6ecd39e5	[CDH5] Fixed issue with data loading using JDK7 and Hive (HIVE-5068). Fixed missing dependency in testdata for HBase region splitting. Change-Id: Iab002f652bc1b1c2f8ce60b7505f592eedcb9cc0	2014-01-15 15:11:32 -08:00
Alex Behm	60003ad211	[CDH5] Changes to make Impala work on CDH5. Mostly fixing up dependency versions. Minor code changes to address HBase API changes. Change-Id: Icbbeb13eefa29e38286328d45600117a383cd106	2014-01-15 15:11:23 -08:00
Nong Li	752b8e3ee4	[CDH5] Added CDH5 beta2 versions of Hadoop, Hive, HBase and Llama to thirdparty. Change-Id: Id033c0246c0ffdffd0c7703eaff9600086912380	2014-01-15 15:11:13 -08:00
Lenni Kuff	8571920753	Bump version to v1.3.0-INTERNAL Change-Id: I32bae4daf093794b09f4ca85b9abdc686791aee8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1281 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-14 21:33:22 -08:00
ishaan	4e9913b52f	Fix race in data loading by creating text tables first. While loading parquet, there are a few table creation queries that use the 'like' keyword; this ends up opening a small race window when all the table formats are created concurrently. With this change, we create the text tables first before attempting to parallelize the rest of the data loading. Change-Id: Ib84cf0e5120b3588d3f0503d7119ca055e08e53f Reviewed-on: http://gerrit.ent.cloudera.com:8080/1241 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-10 15:01:59 -08:00
Nong Li	056c7d94d6	Remove compute stats option from bin/load-data.py This option is not implemented in this script and doesn't make it obvious that it doesn't do anything. Change-Id: I1a1eff38460fd181c486cfca2840108a58e21603 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1059 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-10 14:01:35 -08:00
Henry Robinson	9a0dc18700	Remove a couple of unused files * upload_codereview.py is no longer used since Rietveld is long gone * runplanservice is deprecated as there is no longer a separate PlanService * README only mentions a single internal wiki page. Change-Id: Iba61a3d62381deb882c4168f142574f2492e0969 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1249 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-09 09:56:05 -08:00
Alex Behm	6483f53581	Additional options for JVM debugging in impala startup scripts. Enables JVM debugging by default for the catalogd and impalads created via bin/start-impala-cluster.py. Adds a -jvm_args command line option for passing additional JVM args to the catalogd and impalads. Change-Id: I68e901661bd1fd7eefa05ba84dbacf29dd124685 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1213 Tested-by: jenkins Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-01-08 10:54:40 -08:00
ishaan	0ed1781323	Invalidate metadata before loading parquet data through Impala. During a full data load, we load all the data (except parquet) via hive, and then load the parquet data via Impala. The catalog service does not update the metadata of tables changed outside Impala, so we need to explicitly invalidate the metadata before loading parquet data. Change-Id: Iec39db9ea46e4a11b17589881732629a56444120 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1207 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:39 -08:00
Lenni Kuff	baf79f8185	Call 'invalidate metadata' after loading test data instead of before Instead of calling 'invalidate metadata' before loading each workload we should call it once, after loading all test data. This will allow us to pickup data inserted by Hive. The only reason this worked before is because we restart Impala before running the tests. This will also be a bit faster if loading multiple workloads. Change-Id: I28d42bbf5d7a24b5fde687d67a4b41472ec4b897 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1153 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:37 -08:00
Henry Robinson	177b9ba3b1	Remove nonblocking server (and dependencies) from build Goodnight, sweet non-blocking prince. We didn't support, or test, this configuration, and it doesn't work with security or sessions and brings in some annoying dependencies that are a pain to build. We have other RPC-stack options to investigate; we may wind up re-adding the non-blocking server but only in a way that supports all required features more regularly. Change-Id: Ifbcabc5014441f6d31c342c4e288dd7fc6201443	2014-01-08 10:54:35 -08:00
ishaan	7e520f8f23	Make workload runner logging more concise and readable. This patch makes the workload runner's logging concise and more informative. Specifically, it - logs the time taken for each iteration of a query. - changes the default log level to INFO. - makes the output less verbose. Change-Id: I5f964cf76269fd64ce127b9e4c51fe1deafd1d1b Reviewed-on: http://gerrit.ent.cloudera.com:8080/1076 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:54:35 -08:00
Henry Robinson	0440f26f3e	Add -gdb flag to start-impalad.sh to start Impala under gdb Change-Id: I19f027680cfbf6a7cbc4b311e07f244d67ff683d Reviewed-on: http://gerrit.ent.cloudera.com:8080/1125 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:33 -08:00
Nong Li	1c2e767b89	Bump version to 1.2.3-INTERNAL. Change-Id: I2baf2aa41587ccf24331da7cba399cedb296a2e0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1132 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:32 -08:00
Nong Li	2489e211f0	Update version to 1.2.2. Change-Id: Id70f4af930050075a41b1953fc4c5c935bb5b671	2014-01-08 10:54:30 -08:00
Henry Robinson	6d9a7e290d	Build Openldap as a thirdparty package Change-Id: Ifbb0f468a23186f4160fceb462953bc321469c27 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1049 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-08 10:54:20 -08:00
Henry Robinson	cb965d259a	Build changes to use cyrus-sasl-2.1.23 Change-Id: Ie87e35945b6a415b0383cb75ffcae2fe35755623 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1047 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-08 10:54:19 -08:00
Nong Li	b225477ae9	Bump version to 1.2.2-INTERNAL. Change-Id: I256ef47b6e957a2723422e606d1b87f4e800bbf9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1032 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:54:17 -08:00
Lenni Kuff	01660374c6	Additional fe and testdata pom.xml cleanup This change cleans up our FE pom.xml file by removing unneeded dependencies and system dependencies (system dependencies are now pulled in from the Maven release repository). The upside is that our pom is cleaner and it will also help reduce the likelihood of broken dependencies since Maven will pull in the right versions. The downside is that we now pull in quite a few more JARs. Note: I was unable to find release artifacts for Sentry and Parquet so I am leaving those as "system" for now. Change-Id: I0b917b09a02243d78d89747591ab6bccacf7cf38 Saving changes Change-Id: I3697a7b44884c40e077b3e354fef76625e1b881d Reviewed-on: http://gerrit.ent.cloudera.com:8080/1011 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:17 -08:00
Lenni Kuff	e86ca62ec7	Do not append any JARs from thirdparty/ to the classpath Change-Id: Id68c1bc118a1b8efebb6d035ca94a41cf1c4ded1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1005 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:16 -08:00
Henry Robinson	ce2781c48d	Remove bad quotes from thrift configure script Change-Id: Id671f5366813378ead9362f67b082b7af705b005 Reviewed-on: http://gerrit.ent.cloudera.com:8080/994 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-08 10:54:14 -08:00
Sean Mackrory	2b313a9782	IMP-1147. Impala build fails: PIC_LIB_PATH: unbound variable Change-Id: Ifb173b553b9a52392b5d7caf3630032b89e89c2d Reviewed-on: http://gerrit.ent.cloudera.com:8080/992 Reviewed-by: Sean Mackrory <sean@cloudera.com> Tested-by: Sean Mackrory <sean@cloudera.com>	2014-01-08 10:54:14 -08:00
Sean Mackrory	bb39e33101	IMP-1106. Allow libevent location to be overridden in Thrift dependency build Change-Id: Ia4d92bb4bdfcb7ba29a36904afdb9fd5e398307d Reviewed-on: http://gerrit.ent.cloudera.com:8080/968 Reviewed-by: Henry Robinson <henry@cloudera.com> Reviewed-by: Sean Mackrory <sean@cloudera.com> Tested-by: Sean Mackrory <sean@cloudera.com>	2014-01-08 10:54:14 -08:00
ishaan	287953e87c	Better error logging while loading data. Change-Id: I67cbd9fd1d915ea043a731b7951f29fec25fc446 Reviewed-on: http://gerrit.ent.cloudera.com:8080/982 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:13 -08:00
Lenni Kuff	6e09b90ea3	Properly set timeout in start-impala-cluster Change-Id: I8cedf484d0ce9d2752e3970883f419ab51a82c3b Reviewed-on: http://gerrit.ent.cloudera.com:8080/980 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:13 -08:00
Lenni Kuff	e2b9b4a735	Bump version to v1.2.1 Change-Id: I8f1c9ae1fd0ad195fa7817d324d192c2386eac09 Reviewed-on: http://gerrit.ent.cloudera.com:8080/974 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:54:12 -08:00
ishaan	81b80c702c	Upgrade thirdparty to use CDH4.5 bits. The following changes have been made: -- Update hbase -- Update hive -- Update hadoop -- Update the parquet version to 1.2.5 Change-Id: Id6ceaef0e9eebab27ffd408160116fa84ed300fb	2014-01-08 10:54:09 -08:00
Lenni Kuff	6282d364a8	IMP-1134: DoAsUser and impersonator are reversed in audit logs The audit logs currently have the "impersonator" field set to what we call the doAsUser and the "user" field set as the connected user. They should be reversed. Added basic tests to validate the correct event gets audited. Change-Id: Idfa0aaa6c88debedc4993bd0489dbd3f696fcf17 Reviewed-on: http://gerrit.ent.cloudera.com:8080/958 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:03 -08:00
ishaan	bf5359be8d	Cleanup Impala connections after data is loaded. Change-Id: I152b09808740d5344462bcbaf4df4b71d88504cc Reviewed-on: http://gerrit.ent.cloudera.com:8080/953 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:02 -08:00
Lenni Kuff	6c25e78715	Add option to start-impala-cluster to only restart impalad This helps speed up the restart time because we don't need to restart the catalog server and reload the table metadata. This is useful if you want to restart the impalad with a different command line parameter or if you are making changes to only the impalad binary. Change-Id: I0b714afaf7e508c450a353a53d67d95165de3486 Reviewed-on: http://gerrit.ent.cloudera.com:8080/897 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:59 -08:00
Lenni Kuff	f579ee8b25	Fix logging in load-data to print the query being executed Change-Id: I4332e8d3a340f11e1bbb1f6c5126b0b9b4a2ad8e Reviewed-on: http://gerrit.ent.cloudera.com:8080/949 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:53:58 -08:00

312 Commits