impala/testdata/bin/setup-ranger.sh
Fang-Yu Rao 3a2f5f28c9 IMPALA-12921, IMPALA-12985: Support running Impala with locally built Ranger
The goals and non-goals of this patch are summarized below.
Goals:
 - Add changes to the minicluster configuration that allow a non-default
   version of Ranger (possibly built locally) to run in the context of
   the minicluster, and to be used as the authorization server by
   Impala.
 - Switch to the new constructor when instantiating
   RangerAccessRequestImpl. This resolves IMPALA-12985 and also makes
   Impala compatible with Apache Ranger if RangerAccessRequestImpl from
   Apache Ranger is consumed.
 - Prepare Ranger and Impala patches as supplemental material to verify
   which authorization-related tests pass when Apache Ranger is the
   authorization provider. Merging IMPALA-12921_addendum.diff into the
   Impala repository is out of scope for this patch because the diff
   changes the behavior of Impala, so more discussion is required
   before it can be merged.

Non-goals:
 - Set up any automation for building Ranger from source.
 - Pass all Impala authorization-related tests with a non-default
   version of Ranger.

Instructions on running Impala with locally built Ranger:

Suppose the Ranger source tree is under $RANGER_SRC_DIR. For reference,
Apache Ranger can be built by running the following from
$RANGER_SRC_DIR. By default, the compressed tarball is produced under
$RANGER_SRC_DIR/target.

mvn clean compile -B -nsu -DskipCheck=true -Dcheckstyle.skip=true \
package install -DskipITs -DskipTests -Dmaven.javadoc.skip=true

After building Ranger, we need to build Impala's Java code so that it
can consume the locally produced Ranger classes. The following
environment variables must be exported before building Impala; they
prevent bootstrap_toolchain.py from trying to download the compressed
Ranger tarball.

1. export RANGER_VERSION_OVERRIDE=\
   $(mvn -f $RANGER_SRC_DIR/pom.xml -q help:evaluate \
   -Dexpression=project.version -DforceStdout)

2. export RANGER_HOME_OVERRIDE=$RANGER_SRC_DIR/target/\
   ranger-${RANGER_VERSION_OVERRIDE}-admin
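
As a sanity check before building Impala, the two overrides can be
compared against each other. The following is only a sketch and is not
part of this patch: the function name check_ranger_overrides is
hypothetical, and it assumes nothing beyond the naming scheme shown
above (RANGER_HOME_OVERRIDE ends in ranger-<version>-admin).

```shell
# Hypothetical helper, not part of the patch: verifies that the two override
# variables described above are set and consistent with each other.
check_ranger_overrides() {
  local version="${RANGER_VERSION_OVERRIDE:-}"
  local home="${RANGER_HOME_OVERRIDE:-}"
  if [ -z "${version}" ] || [ -z "${home}" ]; then
    echo "RANGER_VERSION_OVERRIDE and RANGER_HOME_OVERRIDE must both be set" >&2
    return 1
  fi
  # The home directory is derived from the version, so its last path
  # component is expected to be ranger-<version>-admin.
  if [ "$(basename "${home}")" != "ranger-${version}-admin" ]; then
    echo "RANGER_HOME_OVERRIDE does not match RANGER_VERSION_OVERRIDE" >&2
    return 1
  fi
  echo "Ranger overrides look consistent: ${home}"
}
```

Calling check_ranger_overrides right after the two exports reports a
mismatch early instead of during the Impala build.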

It then suffices to execute the following to point
Impala to the locally built Ranger server before starting Impala.

1. source $IMPALA_HOME/bin/impala-config.sh

2. tar zxv -f $RANGER_SRC_DIR/target/\
   ranger-${IMPALA_RANGER_VERSION}-admin.tar.gz \
   -C $RANGER_SRC_DIR/target/

3. $IMPALA_HOME/bin/create-test-configuration.sh

4. $IMPALA_HOME/bin/create-test-configuration.sh \
   -create_ranger_policy_db

5. $IMPALA_HOME/testdata/bin/run-ranger.sh
   (run-all.sh has to be executed instead if other underlying services
   have not been started)

6. $IMPALA_HOME/testdata/bin/setup-ranger.sh
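
Steps 2 to 6 above can be chained in a small wrapper. The following is
a sketch only, not part of this patch: run_step and
point_impala_to_local_ranger are hypothetical names, DRY_RUN is an
assumed convenience variable, and step 1 (sourcing impala-config.sh) is
assumed to have been done in the calling shell so that
IMPALA_RANGER_VERSION is set.

```shell
# Hypothetical wrapper, not part of the patch. With DRY_RUN=1 (the default
# here) each command is only printed; set DRY_RUN=0 to execute the commands.
run_step() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

point_impala_to_local_ranger() {
  local tarball="${RANGER_SRC_DIR}/target/ranger-${IMPALA_RANGER_VERSION}-admin.tar.gz"
  # Step 2: unpack the locally built Ranger Admin tarball.
  run_step tar zxv -f "${tarball}" -C "${RANGER_SRC_DIR}/target/"
  # Steps 3 and 4: regenerate the test configuration and the Ranger policy db.
  run_step "${IMPALA_HOME}/bin/create-test-configuration.sh"
  run_step "${IMPALA_HOME}/bin/create-test-configuration.sh" -create_ranger_policy_db
  # Steps 5 and 6: start Ranger, then create groups, users and policies.
  run_step "${IMPALA_HOME}/testdata/bin/run-ranger.sh"
  run_step "${IMPALA_HOME}/testdata/bin/setup-ranger.sh"
}
```

Running it once with the default DRY_RUN=1 shows the expanded paths
before anything is executed.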

Testing:
 - Manually verified that Impala can be pointed to a locally built
   Apache Ranger on the master branch (with tip being
   https://github.com/apache/ranger/commit/4abb993).
 - Manually verified that with RANGER-4771.diff and
   IMPALA-12921_addendum.diff, only 3 authorization-related tests
   failed. They failed because Apache Ranger does not yet support the
   'storage-type' resource type, so the test cases added in
   IMPALA-10436 fail.
 - Manually verified that the log files of Apache and CDP Ranger's Admin
   server are created under ${RANGER_LOG_DIR} after the Ranger service
   is started.
 - Verified that this patch passed the core tests when CDP Ranger is
   used.

Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
Reviewed-on: http://gerrit.cloudera.org:8080/21160
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-06-15 10:25:13 +00:00

#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
set -euo pipefail
. "${IMPALA_HOME}/bin/report_build_error.sh"
setup_report_build_error
set -x
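function setup-ranger {
  echo "SETTING UP RANGER"
  RANGER_SETUP_DIR="${IMPALA_HOME}/testdata/cluster/ranger/setup"
  # Render the template: each ${VAR} placeholder is replaced by the value of the
  # environment variable VAR if it is defined, and left untouched otherwise.
  perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
    "${RANGER_SETUP_DIR}/impala_group_owner.json.template" > \
    "${RANGER_SETUP_DIR}/impala_group_owner.json"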
  # Create the groups through Ranger's REST API and export the ids it returns so
  # that the template substitutions below can see them.
  GROUP_ID_OWNER=$(wget -qO - --auth-no-challenge --user=admin --password=admin \
    --post-file="${RANGER_SETUP_DIR}/impala_group_owner.json" \
    --header="accept:application/json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/xusers/secure/groups |
    python -c "import sys, json; print(json.load(sys.stdin)['id'])")
  export GROUP_ID_OWNER
  GROUP_ID_NON_OWNER=$(wget -qO - --auth-no-challenge --user=admin \
    --password=admin --post-file="${RANGER_SETUP_DIR}/impala_group_non_owner.json" \
    --header="accept:application/json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/xusers/secure/groups |
    python -c "import sys, json; print(json.load(sys.stdin)['id'])")
  export GROUP_ID_NON_OWNER
  GROUP_ID_NON_OWNER_2=$(wget -qO - --auth-no-challenge --user=admin \
    --password=admin --post-file="${RANGER_SETUP_DIR}/impala_group_non_owner_2.json" \
    --header="accept:application/json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/xusers/secure/groups |
    python -c "import sys, json; print(json.load(sys.stdin)['id'])")
  export GROUP_ID_NON_OWNER_2
  perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
    "${RANGER_SETUP_DIR}/impala_user_owner.json.template" > \
    "${RANGER_SETUP_DIR}/impala_user_owner.json"
  perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
    "${RANGER_SETUP_DIR}/impala_user_non_owner.json.template" > \
    "${RANGER_SETUP_DIR}/impala_user_non_owner.json"
  perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
    "${RANGER_SETUP_DIR}/impala_user_non_owner_2.json.template" > \
    "${RANGER_SETUP_DIR}/impala_user_non_owner_2.json"
  # Fail if any ${VAR} placeholder is left in the rendered files. The character
  # class includes digits so that names such as GROUP_ID_NON_OWNER_2 are matched.
  if grep "\${[A-Z0-9_]*}" "${RANGER_SETUP_DIR}/impala_user_owner.json"; then
    echo "Found undefined variables in ${RANGER_SETUP_DIR}/impala_user_owner.json."
    exit 1
  fi
  if grep "\${[A-Z0-9_]*}" "${RANGER_SETUP_DIR}/impala_user_non_owner.json"; then
    echo "Found undefined variables in ${RANGER_SETUP_DIR}/impala_user_non_owner.json."
    exit 1
  fi
  if grep "\${[A-Z0-9_]*}" "${RANGER_SETUP_DIR}/impala_user_non_owner_2.json"; then
    echo "Found undefined variables in ${RANGER_SETUP_DIR}/impala_user_non_owner_2.json."
    exit 1
  fi
  # Create the three users, then register the Impala service in Ranger.
  wget -O /dev/null --auth-no-challenge --user=admin --password=admin \
    --post-file="${RANGER_SETUP_DIR}/impala_user_owner.json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/xusers/secure/users
  wget -O /dev/null --auth-no-challenge --user=admin --password=admin \
    --post-file="${RANGER_SETUP_DIR}/impala_user_non_owner.json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/xusers/secure/users
  wget -O /dev/null --auth-no-challenge --user=admin --password=admin \
    --post-file="${RANGER_SETUP_DIR}/impala_user_non_owner_2.json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/xusers/secure/users
  wget -O /dev/null --auth-no-challenge --user=admin --password=admin \
    --post-file="${RANGER_SETUP_DIR}/impala_service.json" \
    --header="Content-Type:application/json" \
    http://localhost:6080/service/public/v2/api/service
  # The policy id corresponding to all the databases and tables is 4 in Apache Ranger,
  # whereas it is 5 in CDP Ranger. Getting the policy id via the following API call
  # makes this script more resilient to changes in the policy id.
  ALL_DATABASE_POLICY_ID=$(curl -u admin:admin -X GET \
    http://localhost:6080/service/public/v2/api/service/test_impala/policy/\
all%20-%20database \
    -H 'accept: application/json' | \
    python -c "import sys, json; print(json.load(sys.stdin)['id'])")
  export ALL_DATABASE_POLICY_ID
  perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
    "${RANGER_SETUP_DIR}/all_database_policy_revised.json.template" > \
    "${RANGER_SETUP_DIR}/all_database_policy_revised.json"
  if grep "\${[A-Z0-9_]*}" "${RANGER_SETUP_DIR}/all_database_policy_revised.json"; then
    echo "Found undefined variables in \
${RANGER_SETUP_DIR}/all_database_policy_revised.json."
    exit 1
  fi
  curl -f -u admin:admin -H "Accept: application/json" \
    -H "Content-Type: application/json" \
    -X PUT "http://localhost:6080/service/public/v2/api/policy/${ALL_DATABASE_POLICY_ID}" \
    -d @"${RANGER_SETUP_DIR}/all_database_policy_revised.json"
}
setup-ranger
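
The script repeats the same check for leftover placeholders after each
template is rendered. As an illustration only, that check could be
factored into one helper. The name check_no_placeholders is
hypothetical and is not part of setup-ranger.sh; unlike the script, this
sketch returns a non-zero status instead of exiting and writes its
message to stderr.

```shell
# Hypothetical helper, not part of setup-ranger.sh: fails if any ${NAME}
# placeholder survived template rendering in the given file.
check_no_placeholders() {
  local file="$1"
  # The character class covers letters, digits and underscores, so names
  # such as GROUP_ID_NON_OWNER_2 are detected as well.
  if grep -n '\${[A-Za-z0-9_]*}' "${file}"; then
    echo "Found undefined variables in ${file}." >&2
    return 1
  fi
}
```

With set -e in effect, a call such as
check_no_placeholders "${RANGER_SETUP_DIR}/impala_user_owner.json"
stops the script on the first file that still contains a placeholder.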