
Merge pull request #208 from mlrun/0.9.x-dev
updating demos content
aviaIguazio authored Dec 27, 2021
2 parents 3840eaa + 000987b commit d018cde
Showing 118 changed files with 7 additions and 48,443 deletions.
174 changes: 2 additions & 172 deletions README.md
@@ -7,15 +7,9 @@ The mlrun/demos repository provides demos that implement full end-to-end ML use-
- [Overview](#overview)
- [General ML Workflow](#general-ml-workflow)
- [Prerequisites](#prerequisites)
- [Mask Detection Demo](#mask-detection-demo)
- [scikit-learn Demo: Full AutoML Pipeline](#demo-scikit-learn)
- [Image Classification with Distributed Training Demo](#demo-image-classification)
- [Faces Demo: Real-Time Image Recognition with Deep Learning](#demo-face-recognition)
- [Churn Demo: Real-Time Customer-Churn Prediction](#demo-churn)
- [NetOps Demo: Predictive Network Operations/Telemetry](#demo-netops)
- [Stock-Analysis Demo](#demo-stocks)
- [Model Deployment Pipeline: Real-Time Operational Pipeline](#demo-model-deployment)
- [How-To: Converting Existing ML Code to an MLRun Project](#howto-convert-to-mlrun)


<a id="overview"></a>
## Overview
@@ -67,174 +61,10 @@ In this demo you will learn how to:
* Test and deploy the serving graph.
* Write and run an automatic pipeline workflow.
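
As a taste of what the serving-graph testing step can look like, here is a minimal, hypothetical sketch that exercises an MLRun serving function in-process with a mock server before deploying it; the file, function, and model names are placeholders rather than the demo's actual code:

```python
# Hypothetical sketch: test an MLRun serving function locally before deployment.
import mlrun

# Wrap existing serving code as an MLRun serving function (names are placeholders).
serving_fn = mlrun.code_to_function(
    "mask-serving", filename="serving.py", kind="serving", image="mlrun/mlrun"
)
serving_fn.add_model("mask-model", model_path="./models/mask_model/")

# Run the graph in-process and send a test request; no cluster deployment needed.
server = serving_fn.to_mock_server()
print(server.test("/v2/models/mask-model/infer", body={"inputs": [[0.1, 0.2]]}))
```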

<a id="demo-scikit-learn"></a>
## scikit-learn Demo: Full AutoML Pipeline

The [**scikit-learn-pipeline**](./scikit-learn-pipeline/README.md) demo demonstrates how to build a full end-to-end automated-ML (AutoML) pipeline using [scikit-learn](https://scikit-learn.org) and the UCI [Iris data set](http://archive.ics.uci.edu/ml/datasets/iris).

The combined CI/data/ML pipeline includes the following steps:

- Create an Iris data-set generator (ingestion) function.
- Ingest the Iris data set.
- Analyze the data-set features.
- Train and test the model using multiple algorithms (AutoML).
- Deploy the model as a real-time serverless function.
- Test the serverless function's REST API with a test data set.

To run the demo, download the [**sklearn-project.ipynb**](./scikit-learn-pipeline/sklearn-project.ipynb) notebook into an empty directory and run the code cells according to the instructions in the notebook.
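
The AutoML step above boils down to fitting several candidate estimators and keeping the best one. A minimal, framework-independent sketch of that idea on the Iris data set (illustrative only, not the demo's code):

```python
# Minimal sketch of the "try several algorithms, keep the best" AutoML step.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
candidates = {
    "logistic-regression": LogisticRegression(max_iter=1000),
    "random-forest": RandomForestClassifier(n_estimators=100),
    "svc": SVC(),
}
# Score each candidate with 5-fold cross-validation and keep the winner.
scores = {name: cross_val_score(est, X, y, cv=5).mean()
          for name, est in candidates.items()}
best = max(scores, key=scores.get)
print(f"best model: {best} (mean CV accuracy {scores[best]:.3f})")
```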

<p><img src="./docs/trees.png" alt="scikit-learn tress image" width="500"/></p>

<a id="demo-scikit-learn-pipeline-output"></a>
**Pipeline Output**

The output plots can be viewed as static HTML files in the [scikit-learn-pipeline/plots](scikit-learn-pipeline/plots) directory.

<p><img src="./docs/skpipe.png" alt="scikit-learn pipeline output" width="500"/></p>

<a id="demo-image-classification"></a>
## Image Classification with Distributed Training Demo

The [**image-classification-with-distributed-training**](image-classification-with-distributed-training/README.md) demo demonstrates an end-to-end image-classification solution using [TensorFlow](https://www.tensorflow.org/) (versions 1 or 2), [Keras](https://keras.io/), [Horovod](https://eng.uber.com/horovod/), and [Nuclio](https://nuclio.io/).

The demo consists of four MLRun and Nuclio functions and a Kubeflow Pipelines orchestration:

1. **Download** &mdash; import an image archive from AWS S3 to your cluster's data store.
2. **Label** &mdash; tag the images based on their name structure.
3. **Training** &mdash; perform distributed training using TensorFlow, Keras, and Horovod.
4. **Inference** &mdash; automate deployment of a Nuclio model-serving function.

> **Note:** The demo supports both TensorFlow versions 1 and 2.
> There's one shared notebook and two code files &mdash; one for each TensorFlow version.
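
The distributed-training step follows the standard Horovod pattern: initialize, scale the learning rate by the number of workers, wrap the optimizer so gradients are all-reduced, and broadcast the initial weights from rank 0. A minimal sketch with synthetic data (assumed shapes and hyperparameters, not the demo's code):

```python
# Minimal Horovod + Keras training sketch (synthetic data, illustrative only).
import numpy as np
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # one process per worker/GPU

x = np.random.rand(256, 32).astype("float32")
y = np.random.randint(0, 4, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(4, activation="softmax"),
])
# Scale the learning rate by worker count; Horovod all-reduces the gradients.
opt = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.01 * hvd.size()))
model.compile(optimizer=opt, loss="sparse_categorical_crossentropy")

callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]  # sync initial weights
model.fit(x, y, batch_size=32, epochs=2, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```

Launched with multiple processes (for example, `horovodrun -np 4 python train.py`), each worker trains on its shard while Horovod keeps the model replicas in sync.
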
<a id="demo-image-classification-demo-flow"></a>
**Demo Workflow**

<p><img src="./docs/hvd-flow.png" alt="Image-classification demo workflow" width="600"/></p>

<a id="demo-image-classification-pipeline-output"></a>
**Pipeline Output**

<p><img src="./docs/hvd-pipe.png" alt="Image-classification pipeline output" width="500"/></p>

<a id="demo-face-recognition"></a>
## Faces Demo: Real-Time Image Recognition with Deep Learning

The [**faces**](realtime-face-recognition/README.md) demo demonstrates real-time capture, recognition, and classification of face images over a video stream, as well as location tracking of identities.

This comprehensive demonstration includes multiple components:

- A live image-capture utility.
- Image identification and tracking using [OpenCV](https://opencv.org/).
- A labeling application for tagging unidentified faces using [Streamlit](https://www.streamlit.io/).
- Model training using [PyTorch](https://pytorch.org).
- Automated model deployment using [Nuclio](https://nuclio.io/).
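
For the identification-and-tracking component, the core OpenCV detection loop can be as small as the following sketch (a classic Haar-cascade detector on a single frame; the demo's actual pipeline is richer):

```python
# Minimal OpenCV face-detection sketch on one frame (illustrative only).
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
frame = cv2.imread("frame.jpg")  # stand-in for a frame captured from the video stream
assert frame is not None, "expected a test image at ./frame.jpg"

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:  # draw a box around each detected face
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("frame-annotated.jpg", frame)
```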

<a id="demo-face-recognition-demo-flow"></a>
**Demo Workflow**

<p><img src="./realtime-face-recognition/workflow.png" alt="Face-recognition pipeline output" width="500"/></p>

<a id="demo-churn"></a>
## Churn Demo: Real-Time Customer-Churn Prediction

The [**churn**](./customer-churn-prediction/README.md) demo demonstrates analysis of customer-churn data using the Kaggle [Telco Customer Churn data set](https://www.kaggle.com/blastchar/telco-customer-churn), model training and validation using [XGBoost](https://xgboost.readthedocs.io), and model serving using real-time Nuclio serverless functions.

The demo consists of a few MLRun and Nuclio functions and a Kubeflow Pipelines orchestration:

1. Write custom data encoders for processing raw data and categorizing or "binarizing" various features.
2. Summarize the data, examining parameters such as class balance and variable distributions.
3. Define parameters and hyperparameters for a generic XGBoost training function.
4. Train and test several models using XGBoost.
5. Identify the best model for your needs, and deploy it into production as a real-time Nuclio serverless function.
6. Test the model server.
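
Steps 3 and 4 use XGBoost's scikit-learn-style API, which keeps the training function generic. A minimal, illustrative sketch with stand-in data (not the demo's actual code):

```python
# Minimal XGBoost train/test sketch for a binary churn label (stand-in data).
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Stand-in for the encoded Telco features; the demo builds these from the raw data.
X = np.random.rand(500, 12)
y = np.random.randint(0, 2, size=500)  # 1 = churned, 0 = retained

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.3f}")
```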

<a id="demo-churn-pipeline-output"></a>
**Pipeline Output**

<p><img src="./customer-churn-prediction/assets/pipeline-3.png" alt="Cutomer-churn pipeline output" width="500"/></p>

<a id="demo-netops"></a>
## NetOps Demo: Predictive Network Operations/Telemetry

The [**NetOps**](network-operations/README.md) demo demonstrates how to build an automated ML pipeline for predicting network outages based on network-device telemetry, also known as Network Operations (NetOps).
The demo implements both model training and inference, including model monitoring and concept-drift detection.
The demo simulates telemetry network data for running the pipeline.

The demo demonstrates how to:

- Manage MLRun projects.
- Use GitHub as a source for functions to use in pipeline workflows.
- Use MLRun logging to track results and artifacts.
- Use MLRun to run a [Kubeflow Pipelines](https://www.kubeflow.org/docs/pipelines/) pipeline.
- Deploy a live-endpoints production pipeline.
- Deploy a concept-drift pipeline.
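
In MLRun terms, that combination (project management, functions pulled from a remote source, and a Kubeflow Pipelines workflow) can look like the following hypothetical sketch; the function names and paths are placeholders:

```python
# Hypothetical sketch: an MLRun project that mixes marketplace and local functions
# and runs a Kubeflow Pipelines workflow (names and paths are placeholders).
import mlrun

project = mlrun.get_or_create_project("netops", context="./")

project.set_function("hub://describe", "describe")  # function from the marketplace
project.set_function("./src/train.py", "train", kind="job", image="mlrun/mlrun")
project.set_workflow("main", "./src/workflow.py")   # the KFP pipeline definition

# Run the workflow; results and artifacts are tracked in the MLRun DB.
run = project.run("main", arguments={"aggregation_window": "1h"}, watch=True)
```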

The demo implements three pipelines:

- **Training pipeline** &mdash; ingestion of telemetry data, exploratory data analysis, data preparation (aggregation), feature selection, and model training and testing.
- **Production-deployment pipeline** &mdash; automated model deployment.
- **Concept-drift pipeline** &mdash; streaming of concept-drift detectors using a live Nuclio endpoint, and application of multiple drift-magnitude metrics to assess the drift between a base data set and the latest data.
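
A drift-magnitude metric compares a feature's distribution in the base data set with its distribution in the latest window. One common choice is total variation distance over binned histograms; a minimal sketch of that idea (the demo may apply different metrics):

```python
# Minimal sketch of one drift-magnitude metric: total variation distance
# between binned feature histograms (illustrative; the demo may use others).
import numpy as np

def total_variation_distance(base, latest, bins=20):
    lo = min(base.min(), latest.min())
    hi = max(base.max(), latest.max())
    p, _ = np.histogram(base, bins=bins, range=(lo, hi))
    q, _ = np.histogram(latest, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()  # 0 = identical distributions, 1 = disjoint

base = np.random.normal(0.0, 1.0, 10_000)    # telemetry feature, training window
latest = np.random.normal(0.5, 1.2, 10_000)  # same feature, live window (drifted)
print(f"TVD: {total_variation_distance(base, latest):.3f}")
```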

<a id="demo-netops-pipeline-output"></a>
**Pipeline Output**

<p><img src="./docs/netops-pipe.png" alt="NetOps pipeline output" width="500"/></p>

<a id="demo-stocks"></a>
## Stock-Analysis Demo

This demo tackles a common requirement of running a data-engineering pipeline as part of ML model serving by reading data from external data sources and generating insights using ML models.
The demo reads stock data from an external source, analyzes the related market news, and visualizes the analyzed data in a Grafana dashboard.

The demo demonstrates how to:

- Train a sentiment-analysis model using the Bidirectional Encoder Representations from Transformers ([BERT](https://github.com/google-research/bert)) natural language processing (NLP) technique, and deploy the model.
- Deploy Python code to a scalable function using [Nuclio](https://nuclio.io/).
- Integrate with the real-time multi-model data layer of the Iguazio Data Science Platform ("the platform") &mdash; time-series databases (TSDB) and NoSQL (key-value) storage.
- Leverage machine learning to generate insights.
- Process streaming data and visualize it on a user-friendly dashboard.
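
Deploying plain Python code as a scalable Nuclio function takes only a few lines through MLRun; a hypothetical sketch (the file and handler names are placeholders):

```python
# Hypothetical sketch: deploy Python code as an auto-scaling Nuclio function.
import mlrun

fn = mlrun.code_to_function(
    "stock-ingest",         # placeholder function name
    filename="ingest.py",   # placeholder file with a `handler(context, event)`
    handler="handler",
    kind="nuclio",
    image="mlrun/mlrun",
)
fn.deploy()  # builds the image and deploys the function to the cluster
```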

The demo includes the following steps:

1. **Training and validating the model** (BERT Model); can be skipped by downloading a pre-trained model (default).
2. **Deploying a sentiment-analysis model server**.
3. **Ingesting stock data**.
4. **Scraping stock news and analyzing sentiments** to generate sentiment predictions.
5. **Deploying a stream viewer** that reads data from the stock-news and sentiments stream and can serve as a Grafana data source.
6. **Visualizing the data on a Grafana dashboard**, using the stream viewer as a data source.
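
For a feel of the sentiment-prediction step, a pre-trained BERT-family model can score headlines in a few lines; a minimal sketch using the Hugging Face `transformers` pipeline (an assumption for illustration; the demo deploys its own BERT-based model server):

```python
# Minimal sentiment-inference sketch with Hugging Face transformers
# (illustrative; the demo serves its own BERT-based model).
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # downloads a default fine-tuned model
headlines = [
    "Shares surge after record quarterly earnings",
    "Regulators open probe into accounting practices",
]
for headline, result in zip(headlines, sentiment(headlines)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {headline}")
```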

<a id="demo-stocks-pipeline-output"></a>
**Pipeline Output**

<p><img src="./stock-analysis/assets/images/stocks-demo-pipeline.png" alt="Stock-analysis pipeline output" width="500"/></p>

<a id="demo-model-deployment"></a>

## Model Deployment Pipeline: Real-Time Operational Pipeline

This demo demonstrates how to deploy a model as part of a real-time operational pipeline that processes streaming data.

The demo comprises several steps:

<p><img src="./model-deployment-pipeline/assets/model-deployment-pipeline.png" alt="Model deployment Pipeline Real-time operational Pipeline" width="500"/></p>


While this demo covers the use case of 1<sup>st</sup>-day churn, you can easily replace the data, the related features, and the training model, and reuse the same workflow for different business cases.

These steps are covered by the following pipeline:

- **1. Data generator** &mdash; generates events for training and serving, and creates an enrichment table (lookup values).
- **2. Event handler** &mdash; receives data from the input. This is a common input stream for all the data, so you can easily replace the event-source data (in this case, a data generator) without affecting the rest of the flow. It also stores all incoming data to Parquet files.
- **3. Stream to features** &mdash; enriches the stream using the enrichment table, and updates the aggregation features based on the incoming events.
- **4. Optional model-training steps**
  - **4.1 Get data snapshot** &mdash; takes a snapshot of the feature table for training.
  - **4.2 Describe the data set** &mdash; runs common analyses on the data set and produces plots such as histograms, feature importance, correlation, and more.
  - **4.3 Training** &mdash; runs training with multiple classification models.
  - **4.4 Testing** &mdash; tests the best-performing model.
- **5. Serving** &mdash; serves the model, processing the data from the enriched stream and the aggregation features.
- **6. Inference logger** &mdash; uses the same event-handler function as above, but only its capability to store incoming data to Parquet files.
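
The event handler's persistence job (steps 2 and 6 above) amounts to buffering incoming events and flushing them to Parquet in batches. A minimal, hypothetical sketch of that piece using pandas (the batch size and paths are placeholders):

```python
# Hypothetical sketch of the event handler's Parquet-persistence job.
import json
from datetime import datetime, timezone
from pathlib import Path

import pandas as pd

BATCH: list[dict] = []
BATCH_SIZE = 100            # placeholder flush threshold
OUT_DIR = Path("./events")  # placeholder output directory

def handle_event(raw: str) -> None:
    """Buffer an incoming JSON event; flush one Parquet file per batch."""
    event = json.loads(raw)
    event["received_at"] = datetime.now(timezone.utc).isoformat()
    BATCH.append(event)
    if len(BATCH) >= BATCH_SIZE:
        OUT_DIR.mkdir(exist_ok=True)
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
        pd.DataFrame(BATCH).to_parquet(OUT_DIR / f"events-{stamp}.parquet")
        BATCH.clear()

handle_event(json.dumps({"user_id": "u1", "action": "login"}))
```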

<a id="howto-convert-to-mlrun"></a>
## How-To: Converting Existing ML Code to an MLRun Project

1 change: 0 additions & 1 deletion customer-churn-prediction/.test

This file was deleted.
