mirror of
https://github.com/apache/impala.git
synced 2026-01-28 09:03:52 -05:00
This patch fixes slow performance in the Impala shell, especially for
large queries, by replacing all calls to
sqlparse.format(sql_string, strip_comments=True) with a custom
comment-stripping implementation that does not use grouping. The code
that strips leading comments was also refactored to not use grouping.

Benchmark: running a query with 12K columns

Before the patch:
$ time impala-shell.sh -f large.sql --quiet
real 2m4.154s
user 2m0.536s
sys 0m0.088s

After the patch:
$ time impala-shell.sh -f large.sql --quiet
real 0m3.885s
user 0m1.516s
sys 0m0.048s

Testing:
- Added a new test for Impala shell performance
- Ran all shell tests on Python 2.6 and Python 2.7

Change-Id: Idac9f3caed7c44846a8c922dbe5ca3bf3b095b81
Reviewed-on: http://gerrit.cloudera.org:8080/10939
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
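
To illustrate why avoiding grouping helps, here is a minimal sketch of comment stripping as a single linear pass over the query text. It is not the code from this patch: the function name strip_sql_comments is hypothetical, it uses only the Python standard library rather than sqlparse, and it assumes backslash escapes inside quoted literals. It also does not special-case optimizer hints written as block comments.

```python
def strip_sql_comments(sql):
    """Remove -- line comments and /* */ block comments in one linear pass.

    Hypothetical helper for illustration; not the function added by the patch.
    """
    out = []
    i = 0
    n = len(sql)
    while i < n:
        ch = sql[i]
        if ch in ("'", '"', '`'):
            # Copy a quoted literal verbatim so comment markers inside it survive.
            j = i + 1
            while j < n:
                if sql[j] == '\\' and j + 1 < n:
                    j += 2  # skip the escaped character (assumed escape style)
                    continue
                if sql[j] == ch:
                    j += 1
                    break
                j += 1
            out.append(sql[i:j])
            i = j
        elif sql.startswith('--', i):
            # Line comment: drop up to, but not including, the newline.
            j = sql.find('\n', i)
            i = n if j == -1 else j
        elif sql.startswith('/*', i):
            # Block comment: drop through the closing marker (or to end of input).
            j = sql.find('*/', i + 2)
            i = n if j == -1 else j + 2
        else:
            out.append(ch)
            i += 1
    return ''.join(out)


if __name__ == '__main__':
    # A wide query in the spirit of the 12K-column benchmark above.
    columns = ', '.join('c%d /* note */' % k for k in range(12000))
    print(len(strip_sql_comments('select ' + columns + ' from t -- tail')))
```

Each character is visited a bounded number of times, so the cost grows linearly with query length; no nested token tree is built along the way.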