This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Find the nicer way to initialize off-the-shelf system in microservice architecture #236

Open
tloubrieu-jpl opened this issue Jan 19, 2022 · 7 comments

@tloubrieu-jpl
Contributor

tloubrieu-jpl commented Jan 19, 2022

💪 Motivation

...so we can have a strategy for future developments

📖 Additional Details

As a use case, we want a consistent way to initialize rabbitmq and elasticsearch for the registry application.

⚙️ Engineering Details

The options (to be investigated and extended) are:

  • have init-docker containers
  • extend the existing docker images (e.g. extend the rabbitmq image and add the code that creates the needed queues to the docker image)
  • have the initialization code called by the microservice component (e.g. the harvest service creates the RabbitMQ queues it needs if they don't exist yet)

A good example of initialization we want to manage is the creation of a database schema and users.
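As a sketch of the first option (init containers), an initialization step can run as a short-lived service that dependents wait on. This is a hypothetical compose fragment, not our actual configuration: the image names are made up, and the `depends_on` conditions require a Compose implementation that supports the Compose Specification (Docker Compose v2).

```yaml
# Hypothetical docker-compose.yaml fragment: "init-elasticsearch" runs the
# initialization once and exits; "harvest" starts only after it succeeded.
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.5.3
    healthcheck:
      test: ["CMD-SHELL", "curl --silent --fail http://localhost:9200 || exit 1"]
      interval: 5s
      retries: 30

  init-elasticsearch:
    image: registry-loader          # hypothetical image containing the init script
    depends_on:
      elasticsearch:
        condition: service_healthy  # wait until the healthcheck passes
    entrypoint: ["bash", "/usr/local/bin/init-elasticsearch.sh"]

  harvest:
    image: harvest                  # hypothetical service that uses the indices
    depends_on:
      init-elasticsearch:
        condition: service_completed_successfully
```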

@tloubrieu-jpl
Contributor Author

@nutjob4life @tdddblog @ramesh-maddegoda can you write your inputs on this topic as comments in the ticket?

@nutjob4life
Member

Multi-stage builds can be helpful here, letting you make an image that has not just the service but also the pre-configured data (database schema, expected records, etc.).

Here's an example Dockerfile with a multi-stage build that makes a new image for a hypothetical database called Persistence™, pre-loaded with a schema as well as database rows:

# Stage 1
# =======
#
# We start with the database image, in this case a hypothetical database
# called "persistence"

FROM persistence:1.2.3 AS initialization


# Set up some defaults for this database

ENV PERSISTENCE_USERNAME="db"
ENV PERSISTENCE_PASSWORD="p455w0rd"


# This database expects initial schema to be in /var/persistence/init.d and
# initial data in /var/persistence/load.d

COPY etc/product-schema.sql etc/label-schema.sql /var/persistence/init.d/
COPY data/*.sql /var/persistence/load.d/
COPY data/blobs/ /var/persistence/load.d/lobs/


# This database has a special command-line option that tells it not to start
# up as a daemon process. Other databases may require you to modify the
# Docker entrypoint script with /usr/bin/sed or to provide an alternative
# startup script (this turns out to be fairly common):

RUN : &&\
    /usr/local/bin/persistence-entrypoint.sh --load-only /tmp/dump &&\
    :


# For example, if this were Postgres, we'd do
#
# RUN : &&\
#     sed --in-place --expression='s/exec \"$@\"//' /usr/local/bin/docker-entrypoint.sh &&\
#     /usr/local/bin/docker-entrypoint.sh postgres &&\
#     :
#
# Solr has a special command-line option like `solr-create` which does
# something similar.


# Stage 2
# =======
#
# We go back to "persistence" again

FROM persistence:1.2.3


# But this time we can copy over the service's database files

COPY --from=initialization /tmp/dump /var/persistence/db

# There's no need for `RUN rm -rf /tmp/dump`. It doesn't exist in this layer!
# It's only in the `initialization` layer.

There's no special syntax needed to build this image; a single build command does it:

docker image build --tag pds-persistence .

@ramesh-maddegoda

These are the two approaches that I currently use to initialize Elasticsearch and RabbitMQ in docker compose.

To initialize Elasticsearch,
I created an init-elasticsearch service in docker compose as follows.

# Initializes Elasticsearch by creating registry and data dictionary indices by utilizing the Registry Loader
  init-elasticsearch:
    profiles: ["elastic", "big-data", "big-data-integration-test"]
    image: ${REG_LOADER_IMAGE}
    environment:
      - ES_URL=${ES_URL}
    volumes:
      - ./scripts/init-elasticsearch.sh:/usr/local/bin/init-elasticsearch.sh
    networks:
      - pds
    entrypoint: ["bash", "/usr/local/bin/init-elasticsearch.sh"]

After that, in init-elasticsearch.sh (which is called by the entrypoint above), I added the following code to wait for Elasticsearch.

# Check if the ES_URL environment variable is set
if [ -z "$ES_URL" ]; then
    echo "Error: 'ES_URL' (Elasticsearch URL) environment variable is not set. Use docker's -e option." 1>&2
    exit 1
fi

echo "Waiting for Elasticsearch to launch..."  1>&2
while ! curl --output /dev/null --silent --head --fail "$ES_URL"; do
  sleep 1
done

echo "Creating registry and data dictionary indices..." 1>&2
registry-manager create-registry -es "$ES_URL"

With the above approach, we can wait for any service that exposes a port and execute a script once the service is available.
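The same wait loop can be generalized to any host/port pair. Here is a minimal sketch (the function name and timeout default are our own choices) that uses bash's `/dev/tcp` redirection so it works even in images without `curl`, and that gives up after a timeout instead of blocking forever:

```shell
#!/usr/bin/env bash
# wait_for_port HOST PORT [TIMEOUT_SECONDS] -- poll once per second until a
# TCP connection succeeds; give up after the timeout (default 60s) and
# return 1 so the caller can fail fast.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-60} elapsed=0
  until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for $host:$port" 1>&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "$host:$port is up" 1>&2
}
```

A script could then call, say, `wait_for_port elasticsearch 9200 120` before running `registry-manager`.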

To initialize RabbitMQ,
I used a RabbitMQ definitions file and initialized RabbitMQ in docker compose as follows.

# Starts RabbitMQ
  rabbit-mq:
    profiles: ["rabbitmq", "big-data", "big-data-integration-test"]
    image: rabbitmq:3.9-management
    ports:
      - "15672:15672"
      - "5672:5672"
    volumes:
      - ./default-config/rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf:ro
      - ./default-config/rabbitmq-definitions.json:/etc/rabbitmq/definitions.json:ro
    networks:
      - pds

I think similar approaches can be used to initialize any microservice, as long as there is a port we can wait on to become available.

@tloubrieu-jpl
Contributor Author

Thanks @nutjob4life @ramesh-maddegoda,
I also like the 3rd approach (have the initialization code called by the microservice component, e.g. the harvest service creates the RabbitMQ queues it needs if they don't exist yet), because the component that knows what is needed is the one that creates it, instead of creating tables or queues in one repository and using them in another.

What @nutjob4life proposes makes us create and maintain specific images... that might be what we need, I don't know.

@tloubrieu-jpl
Contributor Author

We could also think of using a message broker to make the components aware of their status.

But as a general approach, we decided to use specialized docker images as proposed by Sean (option 2).
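For the RabbitMQ case, option 2 could look like this minimal sketch (the file paths and tag are assumptions, reusing the config files from the compose example above): extend the stock image and bake the definitions in, so no init container or in-service initialization code is needed.

```dockerfile
# Hypothetical Dockerfile extending the stock RabbitMQ image with the
# pre-created queues baked in (option 2).
FROM rabbitmq:3.9-management

# rabbitmq.conf is expected to contain:
#   load_definitions = /etc/rabbitmq/definitions.json
COPY default-config/rabbitmq.conf /etc/rabbitmq/rabbitmq.conf
COPY default-config/rabbitmq-definitions.json /etc/rabbitmq/definitions.json
```

As with Sean's example, a plain `docker image build --tag pds-rabbitmq .` builds it.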

@tloubrieu-jpl
Contributor Author

@nutjob4life wrote on slack:
just want to temper what I said in breakout. My idea of intermediate builds really only applies if we are distributing re-usable images intended for general consumption. If what we are doing is just saying "we need service x", "we need service y", and these are listed in compose, then the extra overhead is maybe not worth it.

@tloubrieu-jpl tloubrieu-jpl reopened this Feb 18, 2022
@tloubrieu-jpl
Copy link
Contributor Author

That will not be fixed for this build; there is no conclusion yet from our team of experts.
