impala/docker/setup_build_context.py
Tim Armstrong ea826ca0d9 IMPALA-7948: part 1: initial docker container build
This builds an impala_base container that has all of the build artifacts
required to run the impala processes, then builds impalad, catalogd and
statestored containers based on that with the right ports exposed.
The images are based on the Ubuntu 16.04 image to align with the
most common development environment.

The container build process is integrated with CMake so that the
container build depends on the artifacts that will go into the
container. You can build the images with the following command,
which creates images called "impala_base", "impalad", "catalogd"
and "statestored":

  ninja -j $IMPALA_BUILD_THREADS docker_images

The images need some refinement to be truly useful.  The following
will be done in future patches:
* IMPALA-7947 - integrate with start-impala-cluster.py to
  automatically create docker network with containers running on it
* Mechanism to pass in command-line flags
* Mechanisms to update the various config files to point to the
  docker host rather than "localhost", which doesn't point to
  the right thing inside the container.
* Mechanisms to set mem_limit, JVM heap sizes, etc, automatically.

Testing:
Manually started up the containers connected to a user-defined bridge
network, tweaked the configurations to point to the HMS/HDFS/etc
running on my host. I then used "docker ps" to figure out the
port mappings for beeswax and debug webserver.

Confirmed that I could run a query and access debug pages:

  $ impala-shell.sh -i localhost:32860 -q "select coordinator()"
  Starting Impala Shell without Kerberos authentication
  Opened TCP connection to localhost:32860
  Connected to localhost:32860
  Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build
  d7870fe03645490f95bd5ffd4a2177f90eb2f3c0)
  Query: select coordinator()
  Query submitted at: 2018-12-11 15:51:04 (Coordinator:
  http://8063e77ce999:25000)
  Query progress can be monitored at:
  http://8063e77ce999:25000/query_plan?query_id=1b4d03f0f0f1fcfb:b0b37e5000000000
  +---------------+
  | coordinator() |
  +---------------+
  | 8063e77ce999  |
  +---------------+
  Fetched 1 row(s) in 0.11s

Change-Id: Ifea707aa3cc23e4facda8ac374160c6de23ffc4e
Reviewed-on: http://gerrit.cloudera.org:8080/12074
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2018-12-18 04:45:32 +00:00


#!/usr/bin/env impala-python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# Assembles the artifacts required to build docker containers into a single directory.
# Most artifacts are symlinked so need to be dereferenced (e.g. with tar -h) before
# being used as a build context.

import glob
import os
import shutil

IMPALA_HOME = os.environ["IMPALA_HOME"]
OUTPUT_DIR = os.path.join(IMPALA_HOME, "docker/build_context")
DOCKERFILE = os.path.join(IMPALA_HOME, "docker/impala_base/Dockerfile")

IMPALA_TOOLCHAIN = os.environ["IMPALA_TOOLCHAIN"]
IMPALA_GCC_VERSION = os.environ["IMPALA_GCC_VERSION"]
GCC_HOME = os.path.join(IMPALA_TOOLCHAIN, "gcc-{0}".format(IMPALA_GCC_VERSION))

# Ensure the output directory exists and is empty.
if os.path.exists(OUTPUT_DIR):
  shutil.rmtree(OUTPUT_DIR)
os.makedirs(OUTPUT_DIR)

os.symlink(os.path.relpath(DOCKERFILE, OUTPUT_DIR),
           os.path.join(OUTPUT_DIR, "Dockerfile"))

BIN_DIR = os.path.join(OUTPUT_DIR, "bin")
LIB_DIR = os.path.join(OUTPUT_DIR, "lib")
os.mkdir(BIN_DIR)
os.mkdir(LIB_DIR)


def symlink_file_into_dir(src_file, dst_dir):
  """Helper to symlink 'src_file' into 'dst_dir'."""
  os.symlink(src_file, os.path.join(dst_dir, os.path.basename(src_file)))


# Impala binaries and native dependencies.
for bin in ["impalad", "statestored", "catalogd", "libfesupport.so"]:
  symlink_file_into_dir(os.path.join(IMPALA_HOME, "be/build/latest/service", bin),
                        BIN_DIR)
for lib in ["libstdc++", "libgcc"]:
  for so in glob.glob(os.path.join(GCC_HOME, "lib64/{0}*.so*".format(lib))):
    symlink_file_into_dir(so, LIB_DIR)
os.symlink(os.environ["IMPALA_KUDU_HOME"], os.path.join(OUTPUT_DIR, "kudu"))

# Impala jars and dependencies.
for glob_pattern in [os.path.join(IMPALA_HOME, "fe/target/dependency/*.jar"),
                     os.path.join(IMPALA_HOME, "fe/target/impala-frontend-*.jar")]:
  for jar in glob.glob(glob_pattern):
    symlink_file_into_dir(jar, LIB_DIR)
# Templates for debug web pages.
os.symlink(os.path.join(IMPALA_HOME, "www"), os.path.join(OUTPUT_DIR, "www"))
# Scripts
symlink_file_into_dir(os.path.join(IMPALA_HOME, "docker/daemon_entrypoint.sh"), BIN_DIR)

# Minicluster configs
os.symlink(os.path.join(IMPALA_HOME, "fe/src/test/resources"),
           os.path.join(OUTPUT_DIR, "conf"))
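
The header comment of this script notes that the symlinked artifacts must be dereferenced (e.g. with tar -h) before the directory can serve as a build context. A minimal sketch of that step in Python follows, assuming Python 3 and the standard tarfile module; the helper name build_context_tarball is hypothetical and is not part of this patch, which may perform the dereferencing differently.

```python
import os
import tarfile
import tempfile


def build_context_tarball(context_dir, tar_path):
  """Hypothetical helper: archive 'context_dir' into 'tar_path', following
  symlinks in the same way that 'tar -h' does."""
  # dereference=True makes tarfile store the target of each symlink
  # (file or directory) instead of the link itself.
  with tarfile.open(tar_path, "w", dereference=True) as tar:
    # arcname="." keeps member paths relative to the context root.
    tar.add(context_dir, arcname=".")
  return tar_path
```

One possible use, again only as an assumption, would be to call build_context_tarball(OUTPUT_DIR, some_tar_path) after the script has populated OUTPUT_DIR, keeping the tarball outside OUTPUT_DIR so that the archive does not include itself.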