Maven 3.9.x offers a new dependency resolver, HttpClient, which allows
downloading project dependencies in parallel.
This patch bumps the Maven version installed by bootstrap_system.sh to
v3.9.2, and adds the flags enabling the new resolver to download
dependencies (including POM files) in parallel. Parallelism is set to
10 threads.
The flags are added to a project-specific Maven setting file in the
newly created java/.mvn directory. The settings file is added to the
RAT exclusion list in bin/rat_exclude_files.txt.
The --show-version flag is added for debugging purposes.
The same flags are added to the JAMM subproject as well.
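As an illustration only, a project-specific settings file of this kind could be generated as sketched below. The property names come from the Maven Resolver configuration reference, but the exact set of flags used by this patch is not quoted in this message, so the selection here is an assumption, and `write_maven_config` is a hypothetical helper name.

```shell
# Write a hypothetical .mvn/maven.config under the given project directory.
# Maven 3.9 reads one command-line argument per line from this file.
write_maven_config() {
  local project_dir="$1"
  mkdir -p "${project_dir}/.mvn"
  cat > "${project_dir}/.mvn/maven.config" <<'EOF'
--show-version
-Daether.dependencyCollector.impl=bf
-Daether.dependencyCollector.bf.threads=10
-Daether.connector.basic.threads=10
EOF
}
```

Calling `write_maven_config java` would create java/.mvn/maven.config; older Maven versions pass the unknown -D properties through as plain system properties.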
The new resolver in Maven 3.9 has also changed the warning message
emitted for missing component checksums, so the new warning string
is added to the filter in bin/mvn-quiet.sh.
Unfortunately Maven 3.9 has also changed the way it responds to missing
checksum files: the resolver now emits a stack trace when checksums
cannot be determined, and missing checksums are not explicitly ignored.
Detailed documentation for the new Maven resolver in Maven 3.9.0+ is
located at:
https://maven.apache.org/guides/mini/guide-resolver-transport.html
Resolver configuration reference:
https://maven.apache.org/resolver/configuration.html
Tests:
- verified in a core-mode test run with Maven 3.9.2 installed
- verified in a local build using an earlier version of Maven
to confirm that the new default setting does not cause regressions
with the old dependency resolver.
Change-Id: I75d05215effc724f5bd471646fb352f37443e185
Reviewed-on: http://gerrit.cloudera.org:8080/20142
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Some Impala dependencies come from repositories that don't have
checksums available. During the build, this produces a large
number of messages like:
[WARNING] Checksum validation failed, no checksums available from the repository for ...
or:
[WARNING] Checksum validation failed, could not read expected checksum ...
These messages are not very useful, and they make it harder to search
the console output for failed tests. This filters them out of the maven
output. Different versions of maven structure the messages differently,
so this filters all the "Checksum validation failed" messages that happen
at WARNING level.
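A minimal sketch of such a filter, assuming the warnings start with the literal "[WARNING]" prefix shown above. The function name `filter_checksum_warnings` is hypothetical and the real script may structure its grep differently.

```shell
# Drop WARNING-level "Checksum validation failed" lines and pass
# everything else through unchanged. "|| true" keeps the exit status
# at zero when grep filters out every input line.
filter_checksum_warnings() {
  grep -v -E '^\[WARNING\].*Checksum validation failed' || true
}
```

Because the pattern is anchored to the WARNING prefix, the same text at ERROR level still reaches the console.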
Testing:
- Ran core tests, verified the messages are gone
Change-Id: I19afbd157533e52ef3157730c7ec5159241749bc
Reviewed-on: http://gerrit.cloudera.org:8080/15775
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Anurag Mantripragada <anurag@cloudera.com>
The frontend build uses the maven-enforcer-plugin to ban
some dependencies or require specific versions of dependencies.
The messages look like:
Found Banned Dependency: foo.bar.baz:1.2.3
These are currently filtered by bin/mvn-quiet.sh. This adds
an exception for "Found Banned" so they are not filtered.
Testing:
- Ran on a branch with a known banned dependency and verified
the output
Change-Id: I24abe59ad6bffb28ac63d014aa0ec7388ef5478f
Reviewed-on: http://gerrit.cloudera.org:8080/15820
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: David Knupp <dknupp@cloudera.com>
The maven build downloads a large number of artifacts from
various maven repositories. When starting with an empty .m2
directory (like most upstream Jenkins jobs), downloading
all the artifacts can take up to 30 minutes. This has been
slowing down our precommit builds by 15-20 minutes.
This adds a script to archive the .m2 directory into a
tarball while excluding artifacts from impala.cdp.repo
and impala.cdh.repo. This will later be used to prepopulate
the .m2 directory for Jenkins jobs.
This adds a script to parse the maven log and output how
many maven artifacts are downloaded from each repository.
It also prints how many downloads were attempted for each
repository. This might aid in diagnosing slowness.
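The counting could look roughly like the sketch below. It assumes the log contains lines of the form "Downloading from <repo-id>: <url>" and "Downloaded from <repo-id>: <url>", as recent Maven versions print in batch mode; the function name `maven_download_stats` and the output format are hypothetical, not those of the script added here.

```shell
# Count attempted ("Downloading from") and completed ("Downloaded from")
# transfers per repository id. Reads the named log files, or stdin when
# no file is given.
maven_download_stats() {
  awk '
    match($0, /Download(ing|ed) from [^:]+:/) {
      # entry looks like "Downloaded from central" (trailing colon dropped)
      entry = substr($0, RSTART, RLENGTH - 1)
      split(entry, parts, " ")
      repo = parts[3]
      if (parts[1] == "Downloading") attempted[repo]++
      else downloaded[repo]++
      seen[repo] = 1
    }
    END {
      for (repo in seen)
        printf "%s attempted=%d downloaded=%d\n", repo, attempted[repo], downloaded[repo]
    }
  ' "$@" | sort
}
```

A repository with many attempts but few completed downloads is a candidate for the slowness mentioned above.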
This also changes mvn-quiet.sh to add logging that prints
a timestamp. It also adds the -B flag to mvn, which causes
maven to run in batch mode. This makes the output easier
to parse, because maven omits special console formatting
characters such as ^M (carriage return).
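A rough sketch of timestamped logging around a batch-mode invocation follows. The helper names `log_ts` and `run_mvn_batch` and the timestamp format are assumptions for illustration; only the -B flag itself is taken from the description above.

```shell
# Print a message prefixed with the current wall-clock time.
log_ts() {
  echo "$(date '+%H:%M:%S') $*"
}

# Run Maven in batch mode (-B) and log when it starts and finishes.
# The exit status of mvn is preserved for the caller.
run_mvn_batch() {
  local status=0
  log_ts "Starting: mvn -B $*"
  mvn -B "$@" || status=$?
  log_ts "Finished with exit code ${status}"
  return "${status}"
}
```

With timestamps on both ends, the elapsed time of each Maven step can be read directly from the console log.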
This changes build-all-flag-combinations.sh to print the
maven statistics after each part of the build and call the
script to produce an m2 archive at the end.
Change-Id: I043912f5fbc7cf24ee80b2855354656aa587ca9f
Reviewed-on: http://gerrit.cloudera.org:8080/14562
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
"Could not transfer" warning messages are noisy. However, filtering out
every line that contains "Could not transfer" can also hide genuine error
messages with the same text from stdout, which makes debugging
difficult. This patch updates mvn-quiet.sh to show "Could not transfer"
messages.
Testing:
- Ran FE build
Change-Id: Ide3367fd98abbbe11eec1fa86fbad8b32eeecb8d
Reviewed-on: http://gerrit.cloudera.org:8080/13647
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch bumps the CDP_BUILD_NUMBER to 1013201. This patch also
refactors the bootstrap_toolchain.py to be more generic for dealing with
CDP components, e.g. Ranger and Hive 3.
The patch also fixes some TODOs to replace the rangerPlugin.init() hack
with rangerPlugin.refreshPoliciesAndTags() API available in this Ranger
build.
Testing:
- Ran core tests
- Manually verified that there is no regression when starting Hive 3 with
USE_CDP_HIVE=true
Change-Id: I18c7274085be4f87ecdaf0cd29a601715f594ada
Reviewed-on: http://gerrit.cloudera.org:8080/13002
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Verified the changes to mvn-quiet.sh by triggering an Impala build by
running bootstrap_development.sh, which invokes mvn-quiet in multiple
places. Verified the creation of the mvn log file with the relevant content
in the $IMPALA_HOME/logs/mvn folder.
Change-Id: I475b17a4dccfa624dda61402491b461c53473f8b
Reviewed-on: http://gerrit.cloudera.org:8080/9273
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
With this commit, $IMPALA_MAVEN_OPTIONS is used by bin/mvn-quiet.sh
to pass extra options to Maven. The default is no extra options.
This is handy for giving Maven a settings file with the "-s" flag, to
control, for example, repositories and their mirrors. In fact, I
considered exposing IMPALA_MAVEN_SETTINGS_FILE explicitly, but decided
that the generic option would be as good.
It's useful to customize how Maven works, especially
to provide a settings file with repository mirrors.
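As a sketch of how the variable could be spliced into the command line: the helper `mvn_command_line` below is hypothetical and only prints the command, and the settings path in the usage note is a placeholder.

```shell
# Print the Maven command line that would be run.
# IMPALA_MAVEN_OPTIONS is deliberately left unquoted so that a value such
# as "-s /some/settings.xml" splits into separate arguments.
mvn_command_line() {
  echo mvn ${IMPALA_MAVEN_OPTIONS:-} "$@"
}
```

For example, after `export IMPALA_MAVEN_OPTIONS="-s /path/to/settings.xml"`, calling `mvn_command_line clean install` prints the command with the "-s" flag inserted before the targets. The unquoted expansion means option values containing spaces are not supported in this sketch.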
Change-Id: I2c62185476fd2388c7cda8884276b79a77370127
Reviewed-on: http://gerrit.cloudera.org:8080/8496
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
website, or add it where it is missing.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Maven's INFO log level is very verbose and includes a lot of progress
information that is minimally useful.
Maven doesn't have an option to output only ERROR and WARNING log
messages. As a workaround, use grep to filter out the majority of the
output, keeping only warnings, errors, tests, and success/failure lines.
Also add a header with relevant info about the maven command:
targets and working directory.
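The workaround can be sketched as follows. The function names, the header layout, and the exact keyword list are assumptions based on the description above, not a copy of the script.

```shell
# Print a short header describing the Maven invocation.
print_mvn_header() {
  echo "========================================================================"
  echo "Running mvn $*"
  echo "Directory $(pwd)"
  echo "========================================================================"
}

# Keep only lines that mention warnings, errors, tests, or the build result.
# "|| true" keeps the exit status at zero when nothing matches.
filter_mvn_output() {
  grep -E -e 'WARNING' -e 'ERROR' -e 'SUCCESS' -e 'FAILURE' -e 'Test' || true
}
```

A caller would print the header and then pipe `mvn "$@"` through `filter_mvn_output`. Since the pipeline reports the status of its last command, a real wrapper also needs something like `set -o pipefail` in bash to notice a failed build.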
Change-Id: I828b870edc2fc80a6460e6ed594d507c46e69c82
Reviewed-on: http://gerrit.cloudera.org:8080/1752
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins