Files
impala/testdata/workloads/functional-query/queries/QueryTest
Dimitris Tsirogiannis b78b37cbdc IMPALA-1480: Slow DDL statements with large number of partitions
This commit improves the performance of DDL statements on tables with
large number of partitions. Previously, the catalog would force-reload
the entire table metadata during the execution of DDL and insert
statements, causing significant delays for tables with large number of
partitions. With this commit the catalog is reusing any cached table
entries to partially reload table metadata for only those partitions
that have been modified. With this change we've improved the performance
of some DDL and insert statements by at least 4-5X.

This commit also adds basic table-level locking to protect table
metadata from concurrent DDL operations.

Preliminary performance measurements
-----------------------------------
Workload: insert into table partition () select ... limit 10
Iterations: 10

Num partitions  OLD (avg time sec)	NEW (avg time sec)
1K		1.15			0.45
5K		3.65			0.9
10K		5.75			1.38
15K		10.1			2.02
30K		25.4			4.46

Workload: alter table partition() set location...
Iterations: 10

Num partitions	OLD (avg time sec)	NEW (avg time sec)
1K		0.8			0.47
5K		4.3			0.71
10K		7.1			1.2
15K		13.2			1.8
30K		26.8			3.4

Change-Id: I4da7fb6df0a71162b0cb60e6025a4019cb9572bf
Reviewed-on: http://gerrit.cloudera.org:8080/1706
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
2016-01-28 09:18:57 +00:00
..
2014-05-16 22:26:11 -07:00
2014-06-11 03:10:11 -07:00
2014-06-11 03:10:11 -07:00
2014-06-11 03:10:11 -07:00
2014-06-11 03:10:11 -07:00
2014-06-24 02:14:27 -07:00
2015-03-11 16:39:39 -07:00
2014-01-08 10:48:09 -08:00
2014-06-20 13:35:10 -07:00