impala/docker/quickstart_client/Dockerfile
Tim Armstrong eb85c6eeca IMPALA-9793: Impala quickstart cluster with docker-compose
What works:
* A single node cluster can be started up with docker-compose
* HMS data is stored in a Derby database in a docker volume
* Filesystem data is stored in a shared docker volume, using the
  localfs support in the Hadoop client.
* A Kudu cluster with a single master can be optionally added on
  to the Impala cluster.
* TPC-DS data can be loaded automatically by a data loading container.

We need to set up a docker network called quickstart-network
because docker-compose insists on generating network names with
underscores. The network name becomes part of each container's
FQDN, and Java's URL parsing rejects these technically invalid
domain names.
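
The underscore problem can be illustrated without Java. The sketch below is not part of the Impala build: `is_valid_dns_label` is a hypothetical helper that applies the RFC 1123 hostname label rule (letters, digits, and hyphens only), and `quickstart_default` is an assumed example of the kind of name docker-compose would generate by default. Creating the network is shown as a comment because it needs a running Docker daemon.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a valid RFC 1123 hostname label
# (1-63 characters, letters/digits/hyphens, no leading or trailing hyphen).
is_valid_dns_label() {
  case "$1" in
    ""|-*|*-|*[!A-Za-z0-9-]*) return 1 ;;
  esac
  [ "${#1}" -le 63 ]
}

# Compare the hand-picked network name with an underscore-style name.
for name in quickstart-network quickstart_default; do
  if is_valid_dns_label "$name"; then
    echo "$name: ok"
  else
    echo "$name: rejected"
  fi
done

# Creating the user-defined network (requires a running Docker daemon):
#   docker network create -d bridge quickstart-network
```

Only the hyphenated name passes the check, which is why the network is created by hand under a fixed name rather than left to docker-compose.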

How to run:

Instructions for running the quickstart cluster are in
docker/README.md.

How to build containers:

  ./buildall.sh -release -noclean -notests -ninja
  ninja quickstart_hms_image quickstart_client_image docker_images

How to upload containers to dockerhub:

  IMPALA_QUICKSTART_IMAGE_PREFIX=timgarmstrong/
  for i in impalad_coord_exec impalad_coordinator statestored \
           impalad_executor catalogd impala_quickstart_client \
           impala_quickstart_hms
  do
    docker tag "$i" "${IMPALA_QUICKSTART_IMAGE_PREFIX}$i"
    docker push "${IMPALA_QUICKSTART_IMAGE_PREFIX}$i"
  done

I pushed containers built from commit f260cce22, which
was branched from 6cb7cecacf on master.

Misc other stuff:
* Added more metadata to all images.

TODO:
* Test and instructions to run against Kudu quickstart
* Upload latest version of containers before merging.

Change-Id: Ifc0b862af40a368381ada7ec2a355fe4b0aa778c
Reviewed-on: http://gerrit.cloudera.org:8080/15966
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-26 11:22:08 +00:00


# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# Build an image that runs a script to load data into the quickstart warehouse.
# The data load script is os-independent, so only build for a fixed OS.
ARG BASE_IMAGE=ubuntu:18.04
FROM ${BASE_IMAGE}
# Common label arguments.
ARG MAINTAINER
ARG URL
ARG VCS_REF
ARG VCS_TYPE
ARG VCS_URL
ARG VERSION
# Install useful utilities. Set to non-interactive to avoid issues when installing tzdata.
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install -y \
        sudo netcat-openbsd less curl iproute2 vim iputils-ping \
        libsasl2-dev libsasl2-2 libsasl2-modules libsasl2-modules-gssapi-mit \
        tzdata krb5-user python-pip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
# Install impala-shell from pip.
# TODO: consider if it would be better to use the latest impala-shell from the build
# environment.
RUN pip install impala-shell
# Use a non-privileged impala user to run the data load script in the container.
# That user should own the /opt/impala directory.
RUN groupadd -r impala -g 1000 && useradd --no-log-init -r -u 1000 -g 1000 impala && \
    mkdir -p /opt/impala && chown impala /opt/impala
USER impala
# Copy the client entrypoint and dataload files.
WORKDIR /opt/impala
COPY --chown=impala data-load-entrypoint.sh /data-load-entrypoint.sh
COPY --chown=impala *.sql /opt/impala/sql/
# Add the entrypoint. The impala user set above remains in effect.
ENTRYPOINT ["/data-load-entrypoint.sh"]
LABEL name="Apache Impala Quickstart Client" \
      description="Client tools for Impala quickstart, including impala-shell and data loading utilities." \
      # Common labels.
      org.label-schema.maintainer=$MAINTAINER \
      org.label-schema.url=$URL \
      org.label-schema.vcs-ref=$VCS_REF \
      org.label-schema.vcs-type=$VCS_TYPE \
      org.label-schema.vcs-url=$VCS_URL \
      org.label-schema.version=$VERSION
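
The label `ARG` values above are only populated when the build passes them in. In the Impala tree that is handled by the build system (the `quickstart_client_image` target); the sketch below is a stand-alone approximation, not that build logic. `print_build_cmd` is a hypothetical helper, the image tag `impala_quickstart_client` is taken from the upload loop in the commit message, and the ref and version values are placeholders. It only prints the command, so it can be inspected without a Docker daemon.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a docker build command that
# supplies some of the label ARGs declared in the Dockerfile.
#   $1 = build context directory, $2 = VCS ref, $3 = version string
print_build_cmd() {
  context_dir="$1"
  vcs_ref="$2"
  version="$3"
  printf 'docker build'
  printf ' --build-arg VCS_TYPE=%s' git
  printf ' --build-arg VCS_REF=%s' "$vcs_ref"
  printf ' --build-arg VERSION=%s' "$version"
  printf ' -t impala_quickstart_client %s\n' "$context_dir"
}

# Placeholder values for illustration only.
print_build_cmd . abc1234 0.0.0-example
```

Any `ARG` that is not passed (here `MAINTAINER`, `URL`, and `VCS_URL`) expands to an empty string, so the corresponding label ends up blank.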