11 Commits

Author SHA1 Message Date
Laszlo Gaal
ee069687fc IMPALA-12212: Bump Maven to 3.9.2, pull dependencies in parallel
Maven 3.9.x offers a new dependency resolver, HttpClient, which allows
downloading project dependencies in parallel.

This patch bumps the Maven version installed by bootstrap_system.sh to
v3.9.2, and adds the flags enabling the new resolver to download
dependencies (including POM files) in parallel. Parallelism is set to
10 threads.

The flags are added to a project-specific Maven setting file in the
newly created java/.mvn directory. The settings file is added to the
RAT exclusion list in bin/rat_exclude_files.txt.

The --show-version flag is added for debugging purposes.

The same flags are added to the JAMM subproject as well.
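
A minimal sketch of what such a project-specific settings file could look
like. The commit message does not list the exact flags, so the property
names below are assumptions taken from the resolver configuration reference,
and the scratch directory stands in for java/.mvn:

```shell
# Sketch only: write a per-project Maven config into a scratch directory.
# The flag set is an assumption; the patch itself may use different flags.
project_dir="$(mktemp -d)"
mkdir -p "${project_dir}/.mvn"
cat > "${project_dir}/.mvn/maven.config" <<'EOF'
--show-version
-Daether.dependencyCollector.impl=bf
-Daether.dependencyCollector.bf.threads=10
EOF
# Maven 3.9 reads one argument per line from .mvn/maven.config
cat "${project_dir}/.mvn/maven.config"
```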

The new resolver in Maven 3.9 has also changed the warning message
emitted for missing component checksums, so the new warning string
is added to the filter in bin/mvn-quiet.sh.
Unfortunately Maven 3.9 has also changed the way it responds to missing
checksum files: the resolver now emits a stack trace when checksums
cannot be determined, and missing checksums are not explicitly ignored.

Detailed documentation for the new Maven resolver in Maven 3.9.0+ is
located at:
https://maven.apache.org/guides/mini/guide-resolver-transport.html
Resolver configuration reference:
https://maven.apache.org/resolver/configuration.html

Tests:
- verified in a core-mode test run with Maven 3.9.2 installed
- verified in a local build using an earlier version of Maven
  to verify that the new default setting does not cause regressions
  with the old dependency resolver.

Change-Id: I75d05215effc724f5bd471646fb352f37443e185
Reviewed-on: http://gerrit.cloudera.org:8080/20142
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2023-07-24 18:50:34 +00:00
Joe McDonnell
03f2b559c3 Filter out "Checksum validation failed" messages during the maven build
Some Impala dependencies come from repositories that don't have
checksums available. During the build, this produces a large
number of messages like:
[WARNING] Checksum validation failed, no checksums available from the repository for ...
or:
[WARNING] Checksum validation failed, could not read expected checksum ...
These messages are not very useful, and they make it harder to search
the console output for failed tests. This filters them out of the maven
output. Different versions of maven structure the messages differently,
so this filters all the "Checksum validation failed" messages that happen
at WARNING level.
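
A rough sketch of this kind of filter, assuming the build output is piped
through grep. The function name is hypothetical and the real expression in
bin/mvn-quiet.sh may differ:

```shell
# Drop WARNING-level "Checksum validation failed" lines and pass everything
# else through. "|| true" keeps the exit status at zero when grep selects
# no lines at all.
filter_checksum_warnings() {
  grep -v -E '\[WARNING\] Checksum validation failed' || true
}
```

Matching on the shared prefix rather than the full message covers both
variants quoted above, and leaves ERROR-level checksum messages visible.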

Testing:
 - Ran core tests, verified the messages are gone

Change-Id: I19afbd157533e52ef3157730c7ec5159241749bc
Reviewed-on: http://gerrit.cloudera.org:8080/15775
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Anurag Mantripragada <anurag@cloudera.com>
2020-06-05 18:27:02 +00:00
Joe McDonnell
afe765e3bd Don't filter maven messages about banned dependencies
The frontend build uses the maven-enforcer-plugin to ban
some dependencies or require specific versions of dependencies.
The messages look like:
Found Banned Dependency: foo.bar.baz:1.2.3

These are currently filtered by bin/mvn-quiet.sh. This adds
an exception for "Found Banned" so they are not filtered.
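
As an illustration, assuming the filter is an allow-list of grep patterns
(the variable name and the base pattern list are hypothetical), the
exception amounts to one more alternative:

```shell
# Hypothetical allow-list; the real patterns in bin/mvn-quiet.sh may differ.
KEEP_REGEX='WARNING|ERROR|SUCCESS|FAILURE|Test'
# Enforcer messages carry no log-level keyword, so they need their own entry.
KEEP_REGEX="${KEEP_REGEX}|Found Banned"

keep_interesting() {
  grep -E "${KEEP_REGEX}" || true
}
```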

Testing:
 - Ran on a branch with a known banned dependency and verified
   the output

Change-Id: I24abe59ad6bffb28ac63d014aa0ec7388ef5478f
Reviewed-on: http://gerrit.cloudera.org:8080/15820
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: David Knupp <dknupp@cloudera.com>
2020-04-28 02:21:21 +00:00
David Knupp
850de91cc3 IMPALA-9107: Add timestamp to maven logging options.
We found that using awk to add a timestamp to the maven log can fail
if gawk is not installed. It seems better to configure maven to add
the timestamp itself.

Sample output:

========================================================================
Running mvn -U -Dorg.slf4j.simpleLogger.showDateTime=true -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss -B install -DskipTests
Directory /home/dknupp/Impala/ext-data-source
========================================================================
16:37:16 [INFO] Scanning for projects...
16:37:16 [INFO] ------------------------------------------------------------------------
16:37:16 [INFO] Reactor Build Order:
16:37:16 [INFO]
16:37:16 [INFO] Apache Impala External Data Source                                 [pom]
16:37:16 [INFO] Apache Impala External Data Source API                             [jar]
16:37:16 [INFO] Apache Impala External Data Source Sample                          [jar]
16:37:16 [INFO] Apache Impala External Data Source Test Library                    [jar]
16:37:17 [INFO]
16:37:17 [INFO] ----------------< org.apache.impala:impala-data-source >----------------
16:37:17 [INFO] Building Apache Impala External Data Source 1.0-SNAPSHOT           [1/4]
16:37:17 [INFO] --------------------------------[ pom ]---------------------------------
[etc...]
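
A small sketch of how the logging flags from the sample command line might
be assembled. The variable and function names are assumptions, and the
function only prints the command instead of running it:

```shell
# Flags copied from the sample command line above; Maven's simple logger
# then prefixes every line with HH:mm:ss, so no awk post-processing is needed.
LOGGING_OPTIONS="-Dorg.slf4j.simpleLogger.showDateTime=true \
-Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss"

# Print the mvn command line for the given targets (does not run mvn).
mvn_command_line() {
  echo "mvn ${LOGGING_OPTIONS} -B $*"
}

mvn_command_line install -DskipTests
```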

Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e
Reviewed-on: http://gerrit.cloudera.org:8080/15537
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-03-25 16:39:32 +00:00
Joe McDonnell
603e5147d5 IMPALA-9107 (part 1): Add scripts to produce an m2 archive
The maven build downloads a large number of artifacts from
various maven repositories. When starting with an empty .m2
directory (like most upstream Jenkins jobs), downloading
all the artifacts can take up to 30 minutes. This has been
slowing down our precommit builds by 15-20 minutes.

This adds a script to archive the .m2 directory into a
tarball while excluding artifacts from impala.cdp.repo
and impala.cdh.repo. This will later be used to prepopulate
the .m2 directory for Jenkins jobs.

This adds a script to parse the maven log and output how
many maven artifacts are downloaded from each repository.
It also prints how many downloads were attempted for each
repository. This might aid in diagnosing slowness.
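
The per-repository counting could be sketched roughly as follows. This
assumes the log contains Maven's batch-mode "Downloading from <repo>: <url>"
and "Downloaded from <repo>: <url>" lines; the function name is
hypothetical and this is not the script added by the patch:

```shell
# Count log lines per repository id. Reads the maven log on stdin.
# $1 is "Downloading" (attempted) or "Downloaded" (completed).
count_by_repo() {
  sed -n "s/.*\[INFO\] $1 from \([^:]*\): .*/\1/p" \
    | sort | uniq -c | awk '{print $2, $1}'
}
```

For example, `count_by_repo Downloaded < mvn.log` prints one
"<repo> <count>" pair per line.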

This also changes mvn-quiet.sh to add logging that prints
a timestamp. It also adds the -B flag to mvn, which causes
maven to run in batch mode. This makes the output easier
to parse, because maven omits special console formatting
characters such as ^M (carriage return).

This changes build-all-flag-combinations.sh to print the
maven statistics after each part of the build and call the
script to produce an m2 archive at the end.

Change-Id: I043912f5fbc7cf24ee80b2855354656aa587ca9f
Reviewed-on: http://gerrit.cloudera.org:8080/14562
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-03-22 21:08:06 +00:00
Fredy Wijaya
02a79822eb Remove "Could not transfer" exclusion in mvn-quiet.sh
"Could not transfer" warning messages are noisy. However, excluding
"Could not transfer" words can lead to actual error messages that
contain "Could not transfer" being hidden from stdout, which can
make debugging difficult. This patch updates mvn-quiet.sh to show
"Could not transfer" messages.

Testing:
- Ran FE build

Change-Id: Ide3367fd98abbbe11eec1fa86fbad8b32eeecb8d
Reviewed-on: http://gerrit.cloudera.org:8080/13647
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-14 21:50:18 +00:00
Fredy Wijaya
5fa076e95c IMPALA-8329: Bump CDP_BUILD_NUMBER to 1013201
This patch bumps the CDP_BUILD_NUMBER to 1013201. This patch also
refactors the bootstrap_toolchain.py to be more generic for dealing with
CDP components, e.g. Ranger and Hive 3.

The patch also fixes some TODOs to replace the rangerPlugin.init() hack
with rangerPlugin.refreshPoliciesAndTags() API available in this Ranger
build.

Testing:
- Ran core tests
- Manually verified that there is no regression when starting Hive 3 with
  USE_CDP_HIVE=true

Change-Id: I18c7274085be4f87ecdaf0cd29a601715f594ada
Reviewed-on: http://gerrit.cloudera.org:8080/13002
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-17 05:30:33 +00:00
njanarthanancldr
1b121c9f1b IMPALA-5139: Update mvn-quiet.sh to print execution content to log file
Verified the changes to mvn-quiet.sh by triggering an Impala build via
bootstrap_development.sh, which invokes mvn-quiet in multiple
places. Verified that the mvn log file is created with the relevant
content in the $IMPALA_HOME/logs/mvn folder.
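
One way to keep the full output in a log file while still printing only
the filtered lines is to put tee in front of the filter. A minimal sketch,
with a hypothetical function name and filter pattern:

```shell
# Run a command, append its complete output to a log file, and print only
# the filtered lines on stdout.
# $1 is the log file; the remaining arguments are the command to run.
run_logged() {
  log_file="$1"; shift
  mkdir -p "$(dirname "${log_file}")"
  # Note: without pipefail, the exit status of the command itself is lost.
  "$@" 2>&1 | tee -a "${log_file}" | { grep -E 'WARNING|ERROR' || true; }
}
```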

Change-Id: I475b17a4dccfa624dda61402491b461c53473f8b
Reviewed-on: http://gerrit.cloudera.org:8080/9273
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
2018-02-22 01:36:50 +00:00
Philip Zeyliger
a0be00ad6d Expose $IMPALA_MAVEN_OPTIONS for configuring Maven.
With this commit, bin/mvn-quiet.sh uses $IMPALA_MAVEN_OPTIONS to pass
extra options to Maven. The default is no extra options.

This is handy for giving Maven a settings file with the "-s" flag, to
control, for example, repositories and their mirrors. In fact, I
considered exposing IMPALA_MAVEN_SETTINGS_FILE explicitly, but decided
that the generic option would be as good.

It's useful to customize how Maven works, especially
to provide a settings file with repository mirrors.
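
A sketch of how a wrapper could splice the variable into the Maven
arguments. The function name and the settings path are placeholders, and
the function prints the arguments rather than invoking mvn:

```shell
# Default is no extra options.
: "${IMPALA_MAVEN_OPTIONS:=}"

# Print one argument per line, as mvn would receive them.
# The expansion is deliberately unquoted so "-s file" splits into two words.
maven_args() {
  printf '%s\n' ${IMPALA_MAVEN_OPTIONS} "$@"
}

# Example: point Maven at a settings file that defines repository mirrors.
IMPALA_MAVEN_OPTIONS="-s /path/to/settings.xml"
maven_args install -DskipTests
```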

Change-Id: I2c62185476fd2388c7cda8884276b79a77370127
Reviewed-on: http://gerrit.cloudera.org:8080/8496
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-14 01:29:56 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
   website, or add it where it is missing.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Tim Armstrong
f13dfcbddc Suppress maven info logging
Maven's INFO log level is very verbose and includes a lot of progress
information that is minimally useful.

Maven doesn't have an option to output only ERROR and WARNING log
messages. As a workaround, use grep to filter out the majority of the
output, keeping only warnings, errors, tests, and success/failure.

Also add a header with relevant info about the maven command:
targets and working directory.
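
The header could be produced along these lines. The layout follows the
banner shown in later sample output from this script; the function name is
an assumption:

```shell
# Print a banner describing the maven invocation.
# $1 is the working directory; the remaining arguments are the mvn arguments.
print_header() {
  dir="$1"; shift
  echo "========================================================================"
  echo "Running mvn $*"
  echo "Directory ${dir}"
  echo "========================================================================"
}

print_header "$(pwd)" install -DskipTests
```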

Change-Id: I828b870edc2fc80a6460e6ed594d507c46e69c82
Reviewed-on: http://gerrit.cloudera.org:8080/1752
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-01-15 19:38:46 +00:00