The .proto file extension is used for the Google Protocol Buffers
language. Impala uses this language to specify the format of messages
used by KRPC. Add support for this language to .clang-format so that we
can have consistent formatting.
The proposed support is:
Language: Proto
BasedOnStyle: Google
ColumnLimit: 90
This produces only a few diffs when run against the existing Impala
code. I’m not proposing to make any changes to .proto files, this is
just to show what clang-format will do. Apart from wrapping comments and
code at 90 chars, the diffs are mostly of the form
-syntax="proto2";
+syntax = "proto2";
- message Certificate {};
+ message Certificate {
+ };
- optional bool client_timeout_defined = 4 [ default = false ];
+ optional bool client_timeout_defined = 4 [default = false];
- UNKNOWN = 999;
- NEGOTIATE = 1;
- SASL_SUCCESS = 0;
- SASL_INITIATE = 2;
+ UNKNOWN = 999;
+ NEGOTIATE = 1;
+ SASL_SUCCESS = 0;
+ SASL_INITIATE = 2;
This last change can be configured using “AlignConsecutiveAssignments:
true” but that creates a different set of diffs.
Change-Id: I0c2dcfc21fc8f9206adb64166fbd05580516791f
Reviewed-on: http://gerrit.cloudera.org:8080/12165
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In order to attempt to get code like
double VeryLongFunctionNames(double x1, double x2, double x3,
double x4) {
return 1.0;
}
rather than
double VeryLongFunctionNames(
double x1, double x2, double x3, double x4) {
return 1.0;
}
I wrote a small set of programs to infer which .clang-format params
fit the current Impala codebase most closely; this patch is the
result.
This patch is the best the inferencer found (while maintaining certain
enforced parameters, like 90-character lines). It is about 10% closer
to Impala's current code base than the .clang-format that is checked
in at the moment, as measured by number of lines in the diff.
Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Reviewed-on: http://gerrit.cloudera.org:8080/4590
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Jim Apple <jbapple@cloudera.com>
The main outcome of this patch is to split out PhjBuilder from
PartitionedHashJoinNode, which manages the build partitions for the
join and implements the DataSink interface.
A lot of this work is fairly mechanical refactoring: dividing the state
between the two classes and duplicating it where appropriate. One major
change is that probe partitions need to be separated from build
partitions.
This required some significant algorithmic changes to memory management
for probe partition buffers: memory management for probe partitions was
entangled with the build partitions: small buffers were allocated for
the probe partitions even for non-spilling joins and the join could
spill additional partitions during probing if the probe partitions needed
to switch from small to I/O buffers. The changes made were:
- Probe partitions are only initialized after the build is partitioned, and
only for spilled build partitions.
- Probe partitions never use small buffers: once the initial write
buffer is allocated, appending to the probe partition never fails.
- All probe partition allocation is done after partitioning the build
and before processing the probe input during the same phase as hash
table building. (Aside from NAAJ partitions which are allocated
upfront).
The probe partition changes necessitated a change in
BufferedTupleStream: allocation of write blocks is now explicit via the
PrepareForWrite() API.
Testing:
Ran exhaustive build and local stress test.
Memory Usage:
Ran stress test binary search locally for TPC-DS SF-1 and TPC-H SF-20.
No regressions on TPC-DS. TPC-H either stayed the same or improved in
min memory requirement without spilling, but the min memory requirement
with spilling regressed in some cases. I investigated each of the
significant regressions on TPC-H and determined that they were all due
to exec nodes racing for spillable or non-spillable memory. None of them
were cases where exec nodes got their minimum reservation and failed to
execute the spilling algorithm correctly.
Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Reviewed-on: http://gerrit.cloudera.org:8080/3873
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
The description of DerivePointerAlignment, which is set to true in the
Google style, is
If true, analyze the formatted file for the most common alignment
of & and *. PointerAlignment is then used only as fallback.
(PointerAlignment is Left, meaning "char* x".)
The virtue of setting DerivePointerAlignment to false is that it will
prevent new contributors from setting the pointers in the wrong place
in a new file or in a file they are modifying heavily.
Change-Id: I713cdd68af741c7cba9e2c5704f20b8901c9e5f6
Reviewed-on: http://gerrit.cloudera.org:8080/4127
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins