The goal of this project is to provide plug-and-play, containerised deployment and serving of trained Machine Learning models.
It does so by using (for example; any alternative would work) the ONNX format for models, FastAPI for requests, Redis for keeping current data, and Docker for keeping it all nice and tight. For now the toy model is a CNN MNIST classifier, but the setup can easily be extended to other models, regressors, or other task types.
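As an illustration of the ONNX part, a model trained in, say, PyTorch can be exported once and then served from any runtime. The sketch below is a minimal, hypothetical example; the `MnistNet` architecture and file names are placeholders, not this repo's actual code:

```python
# Minimal export sketch; `MnistNet` and the file names are placeholders.
import torch
import torch.nn as nn

class MnistNet(nn.Module):
    """Tiny stand-in CNN; the real classifier lives in the training code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(8 * 28 * 28, 10),
        )

    def forward(self, x):
        return self.net(x)

model = MnistNet().eval()
dummy = torch.randn(1, 1, 28, 28)  # one grayscale 28x28 MNIST image
torch.onnx.export(
    model, dummy, "mnist.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
)
```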
The project contains three independent Docker containers:
- `web_api` — the front-end container (CLI or Streamlit UI) of the project, exposed to the outside world for users' requests/input. Its main role is to pass a request to the proper, selected model waiting within `model_api` and return the results. Take a look into the `README.md` in `web_api` for more details.
- `model_api` — responsible for model initialization (using an ONNX Runtime session); it communicates only with `web_api` for the inputs and results, and with `db_api` to store the current results in the Redis DB and persist them. Take a look into the `README.md` in `model_api` for more details.
- `db_api` — the database of the project, responsible for keeping the current (or all) model results. If the same input is requested again for the same model, `db_api` returns its previously stored value instead of running inference on the input again. It is basically a wrapper around the Redis image, configured for the current setup within the network bridge. A sketch of this cache-then-infer flow follows the list.
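To make the division of labour concrete, the sketch below shows the rough cache-then-infer flow, assuming `onnxruntime` and `redis-py`. The hostnames, key scheme, and function names are illustrative assumptions, not the repo's actual code:

```python
# Illustrative cache-then-infer flow; hostnames, key scheme, and names are assumptions.
import hashlib

import numpy as np
import onnxruntime as ort
import redis

cache = redis.Redis(host="db_api", port=6379)  # db_api container on the stack_api bridge
session = ort.InferenceSession("model.onnx")   # model_api's ONNX Runtime session

def predict(app_type: str, image_name: str, image: np.ndarray) -> bytes:
    # Cache key: hex digest of the APP_TYPE and the image name, as described above.
    key = hashlib.sha256(f"{app_type}:{image_name}".encode()).hexdigest()
    cached = cache.get(key)
    if cached is not None:  # cache hit: return the stored result, skip inference
        return cached
    (logits,) = session.run(None, {"input": image.astype(np.float32)})
    result = str(int(logits.argmax())).encode()  # e.g. the predicted class index
    cache.set(key, result)  # persist so repeated requests are served from Redis
    return result
```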
These containers can communicate only within the specified Docker network bridge `stack_api`, with only one port exposed on `web_api` for communication with the outside world.
- Install Docker, docker-compose etc. if necessary.
- Create the network bridge for the containers:
```
docker network create stack_api
```
- Run:
```
chmod a+x build_all.sh start_all.sh check_health.sh stop_all.sh
```
to make the `sh` files executable.
- Build the images by:
```
./build_all.sh
```
- Start the containers by:
```
./start_all.sh APP_TYPE MODEL_NAME
```
where the `APP_TYPE` and `MODEL_NAME` variables are defined below. Basically it starts:
  - `model_api` — a container with `MODEL_NAME` ready for inference in a GPU (or CPU) ONNX session.
  - `db_api` — a Redis container which saves inference results under the hex digest of the `APP_TYPE` and the selected image name, so that if we run inference twice on the same image, only one prediction is computed and the other is taken from Redis, saving, for example, time.
  - `web_api` — a FastAPI (if `APP_TYPE=cli`) or Streamlit (if `APP_TYPE=ui`) container for posting requests for inference and storing/getting them to/from the DB. It runs on port `5000` locally. If `APP_TYPE=ui` is selected, you can go to `localhost:5000` to use the UI; if `APP_TYPE=cli`, post requests via the CLI (for example: `python web_api/tests/test_{MODEL_NAME}_cli_request.py {IMAGE_PATH}.png`; a sketch of such a script follows this list).
- You can check whether the containers are running properly by:
```
./check_health.sh
```
or peek into them by:
```
docker logs CONTAINER_NAME
```
- To test containers separately, just run:
```
python tests/{TEST_NAME}.py
```
within the containers' dirs.
- To stop the containers, just run:
```
./stop_all.sh
```
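For orientation, a CLI request script along the lines of those in `web_api/tests` might look roughly like the following; the `/predict` route and the response format are assumptions, so check the actual test scripts:

```python
# Rough sketch of a CLI request to web_api; the "/predict" route is an assumption.
import sys

import requests

image_path = sys.argv[1]  # e.g. some_digit.png
with open(image_path, "rb") as f:
    files = {"file": (image_path, f, "image/png")}
    # web_api is the only container exposed to the outside world, on localhost:5000.
    response = requests.post("http://localhost:5000/predict", files=files)

response.raise_for_status()
print(response.json())  # prediction passed back from model_api (or the Redis cache)
```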
`MODEL_NAME` is currently available in two options: `mnist` and `leukemia`; if a different name is specified, the container won't start. `APP_TYPE` is currently available in two options: `ui` (Streamlit) and `cli` (plain command line); if a different name is specified, the container won't start.
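In other words, the startup guard boils down to a membership check like the one sketched below (the actual check lives in the shell scripts; this Python version is only illustrative):

```python
# Illustrative version of the startup guard; the real check is done in the shell scripts.
ALLOWED_APP_TYPES = {"ui", "cli"}
ALLOWED_MODEL_NAMES = {"mnist", "leukemia"}

def validate_args(app_type: str, model_name: str) -> None:
    if app_type not in ALLOWED_APP_TYPES:
        raise SystemExit(f"Unsupported APP_TYPE: {app_type!r} (use 'ui' or 'cli')")
    if model_name not in ALLOWED_MODEL_NAMES:
        raise SystemExit(f"Unsupported MODEL_NAME: {model_name!r} (use 'mnist' or 'leukemia')")
```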
If `model_api` fails while starting with an error like `[...]CUDA failure 999: unknown error[...]`, reload the NVIDIA kernel modules:
```
sudo rmmod nvidia_uvm
sudo rmmod nvidia
sudo modprobe nvidia
sudo modprobe nvidia_uvm
```
and try starting the containers again.