impala

mirror of https://github.com/apache/impala.git synced 2025-12-30 21:02:41 -05:00

Author	SHA1	Message	Date
ishaan	3bed0be1df	Refactor the performance framework and change its execution strategy. This patch introduces new abstractions and changes the way queries are run via the workload runner. A new class 'Workload' is introduced, which represents the notion of a workload in the performance framework (i.e., a set of query names mapped to query strings). The new workflow is: - run-workload acts as a driver. It accepts user parameters for which queries to run and their execution strategy. It generates workload objects and passes them to the workload-runner. - The workload runner takes a workload and its execution parameters, and generates a set of test vectors over which the workload is run iteratively. - A workload is executed by initializing a QueryExecutor for each query being run in a test vector. The workload executor is then responsible for execution and gathering results. - The execution details of every query being executed are stored and returned to the driver (run-workload). Change-Id: Ia16360140d65e6733e534e823bc5d5614622ab5f Reviewed-on: http://gerrit.ent.cloudera.com:8080/3616 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins	2014-07-25 18:17:11 -07:00
ishaan	0d0614765d	Only use nproc to determine functional test concurrency when it's available in the os. Some operating systems don't ship with nproc, which causes impala-config.sh to fail. This change alleviates the problem by checking if nproc exists, and setting a reasonable default if it does not. Change-Id: Ic6e4d0fbce57eedc82163cfa17f71bdccbc38b51 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3208 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-06-20 12:52:08 -07:00
ishaan	f92c9a9335	Run local tests at lower concurrency. Currently, we launch #nproc processes to run tests locally. This patch changes the default to #nproc/2, to not overload the system. Change-Id: I8bca23eb7462a0c497df93f82a60d85835bedbe9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2972 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-06-19 12:48:29 -07:00
Paden Tomasello	0326f17bb3	Adding Lz4 Codec. Change-Id: I037d4e0de3b2cd2b8582caea058c8e1f2f880ff3 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3027 Reviewed-by: Paden Tomasello <paden.tomasello@cloudera.com> Tested-by: jenkins	2014-06-16 14:20:34 -07:00
Lenni Kuff	d5a9ada976	[CDH5] Bump version to v1.5.0-cdh5 Change-Id: I80bf635d37a9c98d51acf6dc35527a21c6b88d76 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2983 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-06-11 22:14:00 -07:00
Lenni Kuff	fa98766ceb	IMPALA-1038: Abort test run if any test fails This behavior regressed recently, this fixes the regression. Change-Id: I80939131953fc1838da0690c3e7e7bf455bd6180 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2968 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins (cherry picked from commit b6f8b7f4679c82ca2fb443224fcd88402c3a4136) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2975 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-06-11 14:54:06 -07:00
Srinath Shankar	5755b0bdee	Order by without limit for Impala. Enable order-by without limit. Added BufferedBlockMgr to allocate buffers and spill to disk. Added Sorter for the external sort implementation. Added new SortNode execution node that completely sorts its input. Changes to enable writing in IoMgr went in a separate patch. Reviewed-on: http://gerrit.ent.cloudera.com:8080/1539 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins Conflicts: testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test Change-Id: I3ece32affe5b006f53bbdfcc03ded01471e818ac Reviewed-on: http://gerrit.ent.cloudera.com:8080/2900 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins	2014-06-09 16:58:08 -07:00
Henry Robinson	3e7e7ed0dc	Fix impala-config.sh when JAVA_HOME not set Change-Id: Iaefda2039de1a5aafc782bca582d3007abcf6eff Reviewed-on: http://gerrit.ent.cloudera.com:8080/2803 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit 48db5de6825cba8b6a1c1c658ff79a9641341dca) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2814 Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-06-03 19:48:57 -07:00
Lenni Kuff	f34a0507bf	[CDH5] Add support for Sentry Service to Impala This change adds support for authorizing based on policy metadata read from the Sentry Service. Authorization is role based and roles are granted to user groups. Each role can have zero or more privileges associated with it, granting fine grained access to specific catalog objects at server, URI, database, or table scope. This patch only adds support to authorize against metadata read from the Sentry Policy Service, it does not add support for GRANT/REVOKE statements in Impala. The authorization metadata is read by the catalog server from the Sentry Service and propagated to all nodes in the cluster in the "catalog-update" statestore topic. To enable the Catalog Server to read policy metadata, the --sentry_config must be set to a valid sentry-site.xml config file. On the impalad side, we continue to support authorization based on a file-based provider. To enable file based authorization set the --authorization_policy_file to a non-empty value. If --authorization_policy_file is not set, authorization will be done based on cached policy metadata received from the Catalog Server (via the statestore). TODO: There are still some issues with the Sentry Service that require disabling some of the authorization tests and adding some workarounds. I have added comments in the code where these workarounds are needed. Change-Id: I3765748d2cdbe00f59eefa3c971558efede38eb1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2552 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-06-03 07:19:52 -07:00
Nong Li	5d80942d42	[CDH5] IMPALA-1019: Fix cancellation path in io mgr for cached reads. Change-Id: I11efd65d1efa900f79afe88b781262a44ac5006a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2703 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-30 19:14:39 -07:00
Lenni Kuff	c45e9a70d9	[CDH5] Add DDL support for HDFS caching This change adds DDL support for HDFS caching. The DDL allows the user to indicate a table or partition should be cached and which pool to cache the data into: * Create a cached table: CREATE TABLE ... CACHED IN 'poolName' * Cache a table/partition: ALTER TABLE ... [partitionSpec] SET CACHED IN 'poolName' * Uncache a table/partition: ALTER TABLE ... [partitionSpec] SET UNCACHED When a table/partition is marked as cached, a new HDFS caching request is submitted to cache the location (HDFS path) of the table/partition and the ID of that request is stored within the table metadata (in the table properties). This is stored as: 'cache_directive_id'='<requestId>'. The cache requests and IDs are managed by HDFS and persisted across HDFS restarts. When a cached table or partition is dropped it is important to uncache the cached data (drop the associated cache request). For partitioned tables, this means dropping all cache requests from all cached partitions in the table. Likewise, if a partitioned table is created as cached, new partitions should be marked as cached by default. It is desirable to know which cache pools exist early on (in analysis) so the query will fail without hitting HDFS/CatalogServer if a non-existent pool is specified. To support this, a new cache pool catalog object type was introduced. The catalog server caches the known pools (periodically refreshing the cache) and sends the known pools out in catalog updates. This allows impalads to perform analysis checks on cache pool existence without going to HDFS. It would be easy to use this to add basic cache pool management in the future (ADD/DROP/SHOW CACHE POOL). Waiting for the table/partition to become cached may take a long time. Instead of blocking the user from accessing the table during this period we will wait for the cache requests to complete in the background and once they have finished the table metadata will be automatically refreshed. Change-Id: I1de9c6e25b2a3bdc09edebda5510206eda3dd89b Reviewed-on: http://gerrit.ent.cloudera.com:8080/2310 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-05-27 16:47:15 -07:00
Lenni Kuff	79d43e1e41	Handle cases where environment variables are not defined in impala-config.sh Change-Id: Iee2800cb02299a9ed26da6fd079e3a72fe2a2482 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2537 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2539	2014-05-13 08:22:42 -07:00
Lenni Kuff	f1d9c0f58b	[CDH5] Update Impala's Sentry dependency to Sentry v1.3 (from v1.2) This updates Impala to use Sentry v1.3 instead of Sentry v1.2. No major functionality changed between Sentry versions, but some Sentry classes were moved and APIs changed. Change-Id: I3765748d2cdbe00f59eefa3c971558efede38ebd Reviewed-on: http://gerrit.ent.cloudera.com:8080/2319 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-05-13 02:57:07 -07:00
Lenni Kuff	13c794db91	[CDH5] Update dependency versions to CDH5.1.0 This just updates the versions, it doesn't touch anything in /thirdparty. Change parquet version to append SNAPSHOT Added hadoop-hbase-compat jar in AUX_CLASSPATH and mapreduce/*.jar to HDFS Change-Id: I4471ef4476997371cf49a9d54cfa63f2fda126e4	2014-05-07 15:10:40 -07:00
casey	192d52c258	Testing: Generate queries and compare results against other databases This is the initial commit and is a work in progress. See the README for a list of possible improvements. As an overview of how the files are related: model.py: This is the base upon which the other files are built. It contains something like a grammar for queries. query_generator.py: Generates random permutations of the model. model_translator.py: Produces SQL based on the model. discrepancy_searcher.py: Uses the above to generate, run, and compare query results. Change-Id: Iaca6277766f5a86568eaa3f05b99c832942ab38b Reviewed-on: http://gerrit.ent.cloudera.com:8080/1648 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Casey Ching <casey@cloudera.com>	2014-05-01 14:20:35 -07:00
ishaan	88ec1e0a83	Increase default_pool_max_requests in run-all-tests. Temporarily increase the cap on max requested queries while running tests to unblock builds. Currently, the exhaustive runs always fail, and there are some intermittent failures in the core runs. Change-Id: I26b9ce343d72bab7687e49f7dbd7bf3bf655a294 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2323 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2395	2014-04-28 23:47:24 -07:00
Skye Wanderman-Milne	60db4d4d82	CDH-18416: Don't inline ReadWriteUtil::ReadZLong() For wide Avro tables, ReadZLong() would get inlined many times into a single function body, causing LLVM to crash. Not inlining doesn't seem to have a performance impact on narrow tables, and helps with wide tables. This change also adds tests over wide (i.e. many-column) tables. The test tables are produced by specifying shell commands to generate test tables in functional_schema_template.sql, which are executed in generate-schema-statements.py. In the SQL templates, sections starting with a ` are treated as shell commands. The output of the shell command is then used as the section text. This is only a starting point; it isn't currently implemented for all sections, and may have to be tweaked if we use this mechanism for all tables. Change-Id: Ife0d857d19b21534167a34c8bc06bc70bef34910 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2206 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com> (cherry picked from commit 1c5951e3cce25a048208ab9bb3a3aed95e41cf67) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2353 Tested-by: jenkins	2014-04-28 15:58:15 -07:00
casey	2351266d0e	Replace single process mini-dfs with multiple processes This should allow individual service components, such as a single nodemanager, to be shutdown for failure testing. The mini-cluster bundled with hadoop is a single process that does not expose the ability to control individual roles. Now each role can be controlled and configured independently of the others. Change-Id: Ic1d42e024226c6867e79916464d184fce886d783 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1432 Tested-by: Casey Ching <casey@cloudera.com> Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2297 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-04-23 18:24:05 -07:00
Nong Li	85be9a5050	Update bin/make* -notests to include other artifacts for packages. Change-Id: I95e95f0a2e2131875b95d6676620bec7117b7f8a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2250 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-04-16 00:37:00 -07:00
Nong Li	f9dd32724c	Cleanup build scripts. Consolidated our build scripts and added the -notests option which skips building the BE tests. Change-Id: Ida6aa064b7fe47e535c142b9af92b7c158e83c32 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2043 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2201	2014-04-13 17:11:39 -07:00
Lenni Kuff	d101ef86e2	[CDH5] Bump version to 1.4.0-cdh5-INTERNAL Change-Id: I0a0334084e444c948f1718133afb2d7246dde414 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2193 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-04-11 16:03:09 -07:00
Henry Robinson	8e5848eaf8	RM fixes to get tests passing * One last NotifyThreadUsageChange() mismatched pair * Don't set resource in plan fragment params if there isn't a resource available. This fixes the problem where if no fragment with resources was assigned to the same node as the coordinator, the coordinator would have a dummy resource allocation which didn't work with expansion. * Substitute #ID in all impalad arguments to start-impala-cluster.py with the 0-indexed ID of the impalad being started. This is required to have different Impala processes use different cgroups. Change-Id: If8c8fd8bef0809bdaf16115a45a9695fc2bf3e1b (cherry picked from commit c71ce45e97570b8c09900eb5ae2e26984d3306a4) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2060 Tested-by: jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-03-24 15:07:45 -07:00
Lenni Kuff	3d82c9a5d6	Bump version from 1.3.0-INTERNAL to cdh5-1.3.0 Change-Id: Ib7a37b190091a3f9eb6d6f0f560dd40aed23e231 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2031 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-20 17:22:11 -07:00
Lenni Kuff	86f69fb96f	IMP-1306: Fix build scripts to properly generate Impala version info for packaging builds The problem was that we were deleting the version.info file because the default of gen_build_version.py recently changed from --noclean to --clean. Also fixed a bug in the shell version generation and made debugging a bit easier by dumping the contents of version.info whenever it is generated. Change-Id: I764d01c9e46eed1bd39de79bf076c15afa599486 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1901 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com> (cherry picked from commit fa673b4d3342fc825ee7fa942bd254234d222906) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1910 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-14 08:45:16 -07:00
Matthew Jacobs	a283d72cdd	[cdh5] Add latest cdh5 hadoop, hbase, and hive snapshots to thirdparty Change-Id: I60c93b259a26e86aca60f2b3b5b6226eabc0b5eb	2014-03-05 01:06:09 -08:00
Lenni Kuff	8a16709265	Perform prioritized load requests for missing tables in HS2 metadata ops The HS2 metadata operations do not go through analysis() so the prioritized loading will not happen for them. Most of the HS2 metadata ops work purely on table/db names, but GetColumns() requires loading the table metadata. This patch updates MetadataOp to collect a set of missing tables and request these tables be loaded from the catalog server. The operation will wait until the tables are loaded in the local catalog before proceeding. Change-Id: I070f2a0d9194d3317f09431971be9a8dffbc7386 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1542 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1557	2014-03-01 17:16:50 -08:00
Alex Behm	3d764619f7	Run Hive data loading through beeline instead of the Hive shell. Fixes our log configuration to put the Hive logs in cluster_logs/hive. Change-Id: I5d98581e35325f2173e4b3170e36bec42d33f8f3 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1497 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1615 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-02-20 15:43:31 -08:00
Alex Behm	62338694e4	Skip generation of version and impala-ir .cc files in buildall.sh if -noclean is specified. Before this patch the -noclean option had almost no effect on the BE build time because some source files were re-generated with .py scripts regardless. This change allows ./buildall -skiptests -noclean to do a true incremental rebuild. Change-Id: Ib3af85db05bdc96a2279a22c1d49d735f2cabd4e Reviewed-on: http://gerrit.ent.cloudera.com:8080/1394 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1415	2014-01-31 13:57:13 -08:00
ishaan	01ef3ef4c1	load-data.py should exit if a bash command returns a non-zero error code. Change-Id: I2f732a276a42d2697fa55bce0f18ac89e9a6f0a1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1397 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1408 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-30 15:47:13 -08:00
Henry Robinson	5535a8a128	[CDH5] Set CDH major version to 5 Change-Id: Ibc36ed435dd36d3489d27a977bf1726bbf2927a1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1306 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-17 14:34:01 -08:00
Henry Robinson	241270044b	Add CDH_MAJOR_VERSION environment variable CDH_MAJOR_VERSION controls where HDFS data is written. In the future, we can use its value to parameterise Jenkins jobs so that the right code is run / data is generated. Change-Id: Id2957df6d708bc6c50faf7a8a609aff5f9571662 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1293 Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1305 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>	2014-01-17 14:33:18 -08:00
Nong Li	53d7bbb97a	[CDH5] Impala changes for updated thirdparty components. Changes include: - version changes in impala-config - version changes in various loading scripts - hbase jars are no longer in hive/lib - mini-llama script changes - updates due to sentry api changes - JDBC tests disabled - unsupported types tests disabled. Change-Id: If8cf1b7ad8e22aa4d23094b9a4b1047f7e9d93ee	2014-01-15 15:12:13 -08:00
Alex Behm	e6299b684b	Setting proper cgroup CPU shares based on the reserved resource. Change-Id: I58992b11e71ed7ad7ea7639050d74fd3eaa4d1d1	2014-01-15 15:12:07 -08:00
Alex Behm	5cfbec9139	Impala creates and manages its own CGroups instead of using the Yarn-NM provided ones. Change-Id: Id09ba2641ad33fbc109eea2dd6fe80b1863b5cac	2014-01-15 15:12:06 -08:00
Alex Behm	760750af27	Enforcing reserved memory resources via mem limits. Fixed codepath with rm disabled. Set enable_rm to false by default. Change-Id: I3bf2d0525d91243ec3c0ea048b0c03680befcda2 Conflicts: be/src/runtime/runtime-state.cc	2014-01-15 15:12:05 -08:00
Alex Behm	dc7b398bd3	Impala reserves resources from YARN via Llama. Impala reserves resources from YARN via Llama and handles resource preemptions by cancelling affected queries. Adds the Impala Resource Broker for interacting with Llama. Refactors scheduler and coordinator to move fragment-to-host assignment logic into scheduler. Local test setup uses MiniLLama. Change-Id: Ic7b0fe43de52d30f4207b4e65cce7e6a294e54e1	2014-01-15 15:12:04 -08:00
Alex Behm	fc6ecd39e5	[CDH5] Fixed issue with data loading using JDK7 and Hive (HIVE-5068). Fixed missing dependency in testdata for HBase region splitting. Change-Id: Iab002f652bc1b1c2f8ce60b7505f592eedcb9cc0	2014-01-15 15:11:32 -08:00
Alex Behm	60003ad211	[CDH5] Changes to make Impala work on CDH5. Mostly fixing up dependency versions. Minor code changes to address HBase API changes. Change-Id: Icbbeb13eefa29e38286328d45600117a383cd106	2014-01-15 15:11:23 -08:00
Nong Li	752b8e3ee4	[CDH5] Added CDH5 beta2 versions of Hadoop, Hive, HBase and Llama to thirdparty. Change-Id: Id033c0246c0ffdffd0c7703eaff9600086912380	2014-01-15 15:11:13 -08:00
Lenni Kuff	8571920753	Bump version to v1.3.0-INTERNAL Change-Id: I32bae4daf093794b09f4ca85b9abdc686791aee8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1281 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-14 21:33:22 -08:00
ishaan	4e9913b52f	Fix race in data loading by creating text tables first. While loading parquet, there are a few table creation queries that use the 'like' keyword; this ends up opening a small race window when all the table formats are created concurrently. With this change, we create the text tables first before attempting to parallelize the rest of the data loading. Change-Id: Ib84cf0e5120b3588d3f0503d7119ca055e08e53f Reviewed-on: http://gerrit.ent.cloudera.com:8080/1241 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-10 15:01:59 -08:00
Nong Li	056c7d94d6	Remove compute stats option from bin/load-data.py This option is not implemented in this script and doesn't make it obvious that it doesn't do anything. Change-Id: I1a1eff38460fd181c486cfca2840108a58e21603 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1059 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-10 14:01:35 -08:00
Henry Robinson	9a0dc18700	Remove a couple of unused files * upload_codereview.py is no longer used since Rietveld is long gone * runplanservice is deprecated as there is no longer a separate PlanService * README only mentions a single internal wiki page. Change-Id: Iba61a3d62381deb882c4168f142574f2492e0969 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1249 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-09 09:56:05 -08:00
Alex Behm	6483f53581	Additional options for JVM debugging in impala startup scripts. Enables JVM debugging by default for the catalogd and impalads created via bin/start-impala-cluster.py. Adds a -jvm_args command line option for passing additional JVM args to the catalogd and impalads. Change-Id: I68e901661bd1fd7eefa05ba84dbacf29dd124685 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1213 Tested-by: jenkins Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-01-08 10:54:40 -08:00
ishaan	0ed1781323	Invalidate metadata before loading parquet data through Impala. During a full data load, we load all the data (except parquet) via hive, and then load the parquet data via Impala. The catalog service does not update the metadata of tables changed outside Impala, so we need to explicitly invalidate the metadata before loading parquet data. Change-Id: Iec39db9ea46e4a11b17589881732629a56444120 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1207 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:39 -08:00
Lenni Kuff	baf79f8185	Call 'invalidate metadata' after loading test data instead of before Instead of calling 'invalidate metadata' before loading each workload we should call it once, after loading all test data. This will allow us to pickup data inserted by Hive. The only reason this worked before is because we restart Impala before running the tests. This will also be a bit faster if loading multiple workloads. Change-Id: I28d42bbf5d7a24b5fde687d67a4b41472ec4b897 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1153 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:37 -08:00
Henry Robinson	177b9ba3b1	Remove nonblocking server (and dependencies) from build Goodnight, sweet non-blocking prince. We didn't support, or test, this configuration, and it doesn't work with security or sessions and brings in some annoying dependencies that are a pain to build. We have other RPC-stack options to investigate; we may wind up re-adding the non-blocking server but only in a way that supports all required features more regularly. Change-Id: Ifbcabc5014441f6d31c342c4e288dd7fc6201443	2014-01-08 10:54:35 -08:00
ishaan	7e520f8f23	Make workload runner logging more concise and readable. This patch makes the workload runner's logging concise and more informative. Specifically, it - logs the time taken for each iteration of a query. - changes the default log level to INFO. - makes the output less verbose. Change-Id: I5f964cf76269fd64ce127b9e4c51fe1deafd1d1b Reviewed-on: http://gerrit.ent.cloudera.com:8080/1076 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:54:35 -08:00
Henry Robinson	0440f26f3e	Add -gdb flag to start-impalad.sh to start Impala under gdb Change-Id: I19f027680cfbf6a7cbc4b311e07f244d67ff683d Reviewed-on: http://gerrit.ent.cloudera.com:8080/1125 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:33 -08:00
Nong Li	1c2e767b89	Bump version to 1.2.3-INTERNAL. Change-Id: I2baf2aa41587ccf24331da7cba399cedb296a2e0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1132 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:32 -08:00

329 Commits