From 3178b77723dc264f527b11605e3e5667ced437ce Mon Sep 17 00:00:00 2001 From: Eyal Danieli Date: Wed, 25 Sep 2024 12:36:48 +0300 Subject: [PATCH] [Infra] Align master with dev functions (#828) * [onnx utils] update onnx utils packages * [Noise-reduction] Add new function to hub (#765) * [Noise-reduction] Add new function to hub * fix test * added multiprocessing and silence removal to function * delete `load_dask` (#822) * [feature selection] update function yaml * [feature selection] update function yaml * Revert "[onnx utils] update onnx utils packages" This reverts commit 88727986ffa91662593958023be8ac3ccef2cab0. * [feature selection] update function yaml * [feature selection] update function yaml * Delete unsupported functions from the hub (#824) * delete EOS functions * bring back validate_great_expectations * bring back load_dataset * Update feature_selection/test_feature_selection.py Co-authored-by: Eyal Danieli * Update item.yaml * Align to master branch (#826) * [Category] Fix and add categories to functions (#808) * [Category] Fix and add categories to functions * bump version in structured * test is not valid in huggingface_serving * Fix duplicated footer * Fix duplicated footer * revert python version change as it will be done in another PR * comments * comments * Bump python:3.6 to python:3.9 (#810) * [Describe] Align describe to new pandas version (#812) * [Describe] Align describe to new pandas version * minor test fix * update mlrun version * add dask to requirements * remove dask * update numpy version * debug * debug * debug * remove dask tests * remove debug code * [get_offline_features] Updated to mlrun 1.6.3 (#813) * [Feature-selection] Replace matplotlib with plotly (#815) * Iguazio-cicd user token updated Iguazio-cicd user token updated in repo secrets: https://github.com/mlrun/functions/settings/secrets/actions MARKETPLACE_ACCESS_TOKEN_V3 new token gh...Zmf was set around April * forcing iguazio-cicd auth forcing iguazio-cicd to deal with 
Author identity unknown * checkout@v3 to v4 and echo * [Mlflow_utils] - mlflow model server (#811) * mlflow server * small fix to test * small fixes to ms and nb * small fixes to mlrun version * update requirements lightgbm * added req * Added xgboost to req --------- Co-authored-by: Avi Asulin <34214569+aviaIguazio@users.noreply.github.com> * [Mlflow] Remove mlflow tag (#825) * remove mlflow tag * remove mlflow tag --------- Co-authored-by: Avi Asulin <34214569+aviaIguazio@users.noreply.github.com> * align feature_selection yaml --------- Co-authored-by: Avi Asulin <34214569+aviaIguazio@users.noreply.github.com> Co-authored-by: Yonatan Shelach <92271540+yonishelach@users.noreply.github.com> Co-authored-by: rokatyy Co-authored-by: Katerina Molchanova <35141662+rokatyy@users.noreply.github.com> Co-authored-by: nashpaz123 <44337075+nashpaz123@users.noreply.github.com> Co-authored-by: ZeevRispler <73653682+ZeevRispler@users.noreply.github.com> --------- Co-authored-by: Avi Asulin Co-authored-by: Yonatan Shelach <92271540+yonishelach@users.noreply.github.com> Co-authored-by: Avi Asulin <34214569+aviaIguazio@users.noreply.github.com> Co-authored-by: rokatyy Co-authored-by: Katerina Molchanova <35141662+rokatyy@users.noreply.github.com> Co-authored-by: nashpaz123 <44337075+nashpaz123@users.noreply.github.com> Co-authored-by: ZeevRispler <73653682+ZeevRispler@users.noreply.github.com> --- README.md | 1 - bert_embeddings/bert_embeddings.ipynb | 503 ---- bert_embeddings/bert_embeddings.py | 41 - bert_embeddings/function.yaml | 42 - bert_embeddings/item.yaml | 28 - bert_embeddings/requirements.txt | 1 - bert_embeddings/test_bert_embeddings.py | 32 - catalog.json | 2 +- catalog.yaml | 9 - concept_drift/README.md | 132 - concept_drift/concept_drift.ipynb | 793 ------ concept_drift/concept_drift.py | 147 - concept_drift/function.yaml | 112 - concept_drift/item.yaml | 27 - concept_drift/requirements.txt | 1 - .../concept_drift_streaming.ipynb | 480 ---- 
.../concept_drift_streaming.py | 157 - concept_drift_streaming/function.yaml | 48 - concept_drift_streaming/item.yaml | 29 - concept_drift_streaming/requirements.txt | 1 - feature_perms/README.ipynb | 788 ----- feature_perms/feature_perms.ipynb | 1106 ------- feature_perms/feature_perms.py | 174 -- feature_perms/function.yaml | 63 - feature_perms/item.yaml | 25 - feature_perms/requirements.txt | 5 - feature_perms/test_feature_perms.py | 134 - feature_selection/function.yaml | 56 +- feature_selection/item.yaml | 4 +- feature_selection/requirements.txt | 2 +- feature_selection/test_feature_selection.py | 1 + get_offline_features/function.yaml | 127 - .../get_offline_features.ipynb | 1536 ---------- get_offline_features/get_offline_features.py | 143 - get_offline_features/item.yaml | 26 - .../test_get_offline_features.py | 239 -- hugging_face_classifier_trainer/function.yaml | 370 --- .../hugging_face_classifier_trainer.ipynb | 2533 ----------------- .../hugging_face_classifier_trainer.py | 832 ------ hugging_face_classifier_trainer/item.yaml | 33 - .../requirements.txt | 6 - .../test_hugging_face_classifier_trainer.py | 145 - huggingface_auto_trainer/function.yaml | 327 --- .../huggingface_auto_trainer.ipynb | 195 -- .../huggingface_auto_trainer.py | 855 ------ huggingface_auto_trainer/item.yaml | 27 - huggingface_auto_trainer/requirements.txt | 5 - .../test_huggingface_auto_trainer.py | 42 - ingest/function.yaml | 87 - ingest/ingest.ipynb | 762 ----- ingest/ingest.py | 84 - ingest/item.yaml | 27 - ingest/test_ingest.py | 171 -- load_dask/function.yaml | 75 - load_dask/item.yaml | 25 - load_dask/load_dask.ipynb | 309 -- load_dask/load_dask.py | 68 - model_monitoring_stream/function.yaml | 267 -- model_monitoring_stream/item.yaml | 23 - .../model_monitoring_stream.ipynb | 178 -- .../model_monitoring_stream.py | 768 ----- model_monitoring_stream/requirements.txt | 3 - noise_reduction/data/test_data.mp3 | Bin 0 -> 27972 bytes noise_reduction/data/test_data.wav | Bin 0 
-> 179672 bytes noise_reduction/function.yaml | 194 ++ noise_reduction/item.yaml | 29 + noise_reduction/noise_reduction.ipynb | 942 ++++++ noise_reduction/noise_reduction.py | 625 ++++ noise_reduction/requirements.txt | 5 + noise_reduction/test_noise_reduction.py | 75 + pandas_profiling_report/README.md | 26 - pandas_profiling_report/function.yaml | 40 - pandas_profiling_report/item.yaml | 25 - .../pandas_profiling_report.ipynb | 794 ------ .../pandas_profiling_report.py | 41 - project_runner/function.yaml | 53 - project_runner/project_runner.ipynb | 340 --- rnn_serving/function.yaml | 46 - rnn_serving/item.yaml | 25 - rnn_serving/requirements.txt | 2 - rnn_serving/rnn_serving.ipynb | 285 -- rnn_serving/rnn_serving.py | 35 - rnn_serving/test_rnn_serving.py | 74 - slack_notify/README.md | 1 - slack_notify/function.yaml | 48 - slack_notify/item.yaml | 25 - slack_notify/slack_notify.ipynb | 293 -- slack_notify/slack_notify.py | 48 - snowflake_dask/README.md | 38 - snowflake_dask/config-template.yaml | 5 - snowflake_dask/function.yaml | 81 - .../img/iguazio-project-secrets.png | Bin 105122 -> 0 bytes snowflake_dask/img/snowflake-dask.png | Bin 58722 -> 0 bytes snowflake_dask/item.yaml | 25 - snowflake_dask/requirements.txt | 2 - snowflake_dask/snowflake-dask-mlrun.ipynb | 437 --- snowflake_dask/snowflake_dask.py | 125 - snowflake_dask/test_snowflake_dask.py | 24 - sql_to_file/function.yaml | 47 - sql_to_file/item.yaml | 24 - sql_to_file/requirements.txt | 2 - sql_to_file/sql_to_file.ipynb | 1567 ---------- sql_to_file/sql_to_file.py | 45 - sql_to_file/test_sql_to_file.py | 31 - stream_to_parquet/function.yaml | 45 - stream_to_parquet/item.yaml | 28 - stream_to_parquet/stream_to_parquet.ipynb | 698 ----- stream_to_parquet/stream_to_parquet.py | 96 - tf1_serving/function.yaml | 48 - tf1_serving/item.yaml | 28 - tf1_serving/requirements.txt | 2 - tf1_serving/tf1_serving.ipynb | 567 ---- tf1_serving/tf1_serving.py | 87 - tf2_serving_v2/function.yaml | 45 - 
tf2_serving_v2/item.yaml | 28 - tf2_serving_v2/requirements.txt | 2 - tf2_serving_v2/tf2_serving_v2.ipynb | 545 ---- tf2_serving_v2/tf2_serving_v2.py | 71 - virtual_drift/README.md | 56 - virtual_drift/function.yaml | 129 - virtual_drift/item.yaml | 28 - virtual_drift/virtual_drift.ipynb | 935 ------ virtual_drift/virtual_drift.py | 206 -- xgb_custom/function.yaml | 241 -- xgb_custom/item.yaml | 26 - xgb_custom/requirements.txt | 7 - xgb_custom/test_xgb_custom.py | 50 - xgb_custom/xgb_custom.ipynb | 922 ------ xgb_custom/xgb_custom.py | 239 -- xgb_serving/function.yaml | 40 - xgb_serving/item.yaml | 29 - xgb_serving/requirements.txt | 7 - xgb_serving/test_xgb_serving.py | 67 - xgb_serving/xgb_serving.ipynb | 421 --- xgb_serving/xgb_serving.py | 33 - 135 files changed, 1897 insertions(+), 25585 deletions(-) delete mode 100644 bert_embeddings/bert_embeddings.ipynb delete mode 100644 bert_embeddings/bert_embeddings.py delete mode 100644 bert_embeddings/function.yaml delete mode 100644 bert_embeddings/item.yaml delete mode 100644 bert_embeddings/requirements.txt delete mode 100644 bert_embeddings/test_bert_embeddings.py delete mode 100644 concept_drift/README.md delete mode 100644 concept_drift/concept_drift.ipynb delete mode 100644 concept_drift/concept_drift.py delete mode 100644 concept_drift/function.yaml delete mode 100644 concept_drift/item.yaml delete mode 100644 concept_drift/requirements.txt delete mode 100644 concept_drift_streaming/concept_drift_streaming.ipynb delete mode 100644 concept_drift_streaming/concept_drift_streaming.py delete mode 100644 concept_drift_streaming/function.yaml delete mode 100644 concept_drift_streaming/item.yaml delete mode 100644 concept_drift_streaming/requirements.txt delete mode 100644 feature_perms/README.ipynb delete mode 100644 feature_perms/feature_perms.ipynb delete mode 100644 feature_perms/feature_perms.py delete mode 100644 feature_perms/function.yaml delete mode 100644 feature_perms/item.yaml delete mode 100644 
feature_perms/requirements.txt delete mode 100644 feature_perms/test_feature_perms.py delete mode 100644 get_offline_features/function.yaml delete mode 100644 get_offline_features/get_offline_features.ipynb delete mode 100644 get_offline_features/get_offline_features.py delete mode 100644 get_offline_features/item.yaml delete mode 100644 get_offline_features/test_get_offline_features.py delete mode 100644 hugging_face_classifier_trainer/function.yaml delete mode 100644 hugging_face_classifier_trainer/hugging_face_classifier_trainer.ipynb delete mode 100755 hugging_face_classifier_trainer/hugging_face_classifier_trainer.py delete mode 100755 hugging_face_classifier_trainer/item.yaml delete mode 100644 hugging_face_classifier_trainer/requirements.txt delete mode 100644 hugging_face_classifier_trainer/test_hugging_face_classifier_trainer.py delete mode 100644 huggingface_auto_trainer/function.yaml delete mode 100644 huggingface_auto_trainer/huggingface_auto_trainer.ipynb delete mode 100644 huggingface_auto_trainer/huggingface_auto_trainer.py delete mode 100644 huggingface_auto_trainer/item.yaml delete mode 100644 huggingface_auto_trainer/requirements.txt delete mode 100644 huggingface_auto_trainer/test_huggingface_auto_trainer.py delete mode 100644 ingest/function.yaml delete mode 100644 ingest/ingest.ipynb delete mode 100644 ingest/ingest.py delete mode 100644 ingest/item.yaml delete mode 100644 ingest/test_ingest.py delete mode 100644 load_dask/function.yaml delete mode 100644 load_dask/item.yaml delete mode 100644 load_dask/load_dask.ipynb delete mode 100644 load_dask/load_dask.py delete mode 100644 model_monitoring_stream/function.yaml delete mode 100644 model_monitoring_stream/item.yaml delete mode 100644 model_monitoring_stream/model_monitoring_stream.ipynb delete mode 100644 model_monitoring_stream/model_monitoring_stream.py delete mode 100644 model_monitoring_stream/requirements.txt create mode 100644 noise_reduction/data/test_data.mp3 create mode 100644 
noise_reduction/data/test_data.wav create mode 100644 noise_reduction/function.yaml create mode 100644 noise_reduction/item.yaml create mode 100644 noise_reduction/noise_reduction.ipynb create mode 100644 noise_reduction/noise_reduction.py create mode 100644 noise_reduction/requirements.txt create mode 100644 noise_reduction/test_noise_reduction.py delete mode 100644 pandas_profiling_report/README.md delete mode 100644 pandas_profiling_report/function.yaml delete mode 100644 pandas_profiling_report/item.yaml delete mode 100644 pandas_profiling_report/pandas_profiling_report.ipynb delete mode 100644 pandas_profiling_report/pandas_profiling_report.py delete mode 100644 project_runner/function.yaml delete mode 100644 project_runner/project_runner.ipynb delete mode 100644 rnn_serving/function.yaml delete mode 100644 rnn_serving/item.yaml delete mode 100644 rnn_serving/requirements.txt delete mode 100644 rnn_serving/rnn_serving.ipynb delete mode 100644 rnn_serving/rnn_serving.py delete mode 100644 rnn_serving/test_rnn_serving.py delete mode 100644 slack_notify/README.md delete mode 100644 slack_notify/function.yaml delete mode 100644 slack_notify/item.yaml delete mode 100644 slack_notify/slack_notify.ipynb delete mode 100644 slack_notify/slack_notify.py delete mode 100644 snowflake_dask/README.md delete mode 100644 snowflake_dask/config-template.yaml delete mode 100644 snowflake_dask/function.yaml delete mode 100644 snowflake_dask/img/iguazio-project-secrets.png delete mode 100644 snowflake_dask/img/snowflake-dask.png delete mode 100644 snowflake_dask/item.yaml delete mode 100644 snowflake_dask/requirements.txt delete mode 100644 snowflake_dask/snowflake-dask-mlrun.ipynb delete mode 100644 snowflake_dask/snowflake_dask.py delete mode 100644 snowflake_dask/test_snowflake_dask.py delete mode 100644 sql_to_file/function.yaml delete mode 100644 sql_to_file/item.yaml delete mode 100644 sql_to_file/requirements.txt delete mode 100644 sql_to_file/sql_to_file.ipynb delete mode 
100644 sql_to_file/sql_to_file.py delete mode 100644 sql_to_file/test_sql_to_file.py delete mode 100644 stream_to_parquet/function.yaml delete mode 100644 stream_to_parquet/item.yaml delete mode 100644 stream_to_parquet/stream_to_parquet.ipynb delete mode 100644 stream_to_parquet/stream_to_parquet.py delete mode 100644 tf1_serving/function.yaml delete mode 100644 tf1_serving/item.yaml delete mode 100644 tf1_serving/requirements.txt delete mode 100644 tf1_serving/tf1_serving.ipynb delete mode 100644 tf1_serving/tf1_serving.py delete mode 100644 tf2_serving_v2/function.yaml delete mode 100644 tf2_serving_v2/item.yaml delete mode 100644 tf2_serving_v2/requirements.txt delete mode 100644 tf2_serving_v2/tf2_serving_v2.ipynb delete mode 100644 tf2_serving_v2/tf2_serving_v2.py delete mode 100644 virtual_drift/README.md delete mode 100644 virtual_drift/function.yaml delete mode 100644 virtual_drift/item.yaml delete mode 100644 virtual_drift/virtual_drift.ipynb delete mode 100644 virtual_drift/virtual_drift.py delete mode 100644 xgb_custom/function.yaml delete mode 100644 xgb_custom/item.yaml delete mode 100644 xgb_custom/requirements.txt delete mode 100644 xgb_custom/test_xgb_custom.py delete mode 100644 xgb_custom/xgb_custom.ipynb delete mode 100644 xgb_custom/xgb_custom.py delete mode 100644 xgb_serving/function.yaml delete mode 100644 xgb_serving/item.yaml delete mode 100644 xgb_serving/requirements.txt delete mode 100644 xgb_serving/test_xgb_serving.py delete mode 100644 xgb_serving/xgb_serving.ipynb delete mode 100644 xgb_serving/xgb_serving.py diff --git a/README.md b/README.md index 9a1e74821..1136c963d 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,6 @@ it is expected that contributors follow certain guidelines/protocols (please chi | [feature-selection](feature_selection/feature_selection.ipynb) | job | Select features through multiple Statistical and Model filters | data-prep, ml | | [gen-class-data](gen_class_data/gen_class_data.ipynb) | job | Create a 
binary classification sample dataset and save. | data-prep | | [github-utils](github_utils/github_utils.ipynb) | job | add comments to github pull request | notifications, utils | -| [load-dask](load_dask/load_dask.ipynb) | dask | load dask cluster with data | data-movement, utils | | [load-dataset](load_dataset/load_dataset.ipynb) | job | load a toy dataset from scikit-learn | data-source, ml | | [model-monitoring-batch](model_monitoring_batch/model_monitoring_batch.ipynb) | job | | | | [model-monitoring-stream](model_monitoring_stream/model_monitoring_stream.ipynb) | nuclio | | | diff --git a/bert_embeddings/bert_embeddings.ipynb b/bert_embeddings/bert_embeddings.ipynb deleted file mode 100644 index cb6d55841..000000000 --- a/bert_embeddings/bert_embeddings.ipynb +++ /dev/null @@ -1,503 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## BERT Embeddings Serverless Function\n", - "This notebook presents deployment of pretrained BERT model that outputs embeddings for given textual sequences as a serverless function. Embeddings are meaningful, contextual representations of text in the form of ndarrays that are used frequently as input to various learning tasks in the field of NLP." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Embeddings without bert" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "[One-Hot Encoding](https://en.wikipedia.org/wiki/One-hot) is a general method that can vectorize any categorical features. It is simple and fast to create and update the vectorization.
\n", - "in case of text embeddings, each row is a sentence and each column is a word/char/[n-gram](https://en.wikipedia.org/wiki/N-gram)." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# some sentences to do examine\n", - "sentences = ['the quick brown fox jumps over the lazy dog',\n", - " 'Hello I am Jacob',\n", - " 'Daniel visited Tel-Aviv last month']" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "lets see the difference between bert embeddings and one-hot encoding" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog', 'Hello', 'I', 'am', 'Jacob', 'Daniel', 'visited', 'Tel-Aviv', 'last', 'month']\n" - ] - } - ], - "source": [ - "# constructing a list of all the words (will be our columns) - make sure no duplicate words are set\n", - "tokens = []\n", - "for sentence in sentences:\n", - " for word in sentence.split():\n", - " tokens.append(word) if word not in tokens else \"\"\n", - "print(tokens)" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "# constructing the one hot vector\n", - "import pandas as pd\n", - "import numpy as np\n", - "\n", - "one_hot = pd.DataFrame(columns = range(len(tokens)))\n", - "# filling our empty dataframe with each sentence encoding\n", - "for sentence in sentences:\n", - " vector = np.zeros(len(tokens))\n", - " for word in sentence.split():\n", - " vector[tokens.index(word)]=1\n", - " one_hot = one_hot.append(pd.Series(vector),ignore_index=True)\n", - "one_hot.columns = tokens" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
thequickbrownfoxjumpsoverlazydogHelloIamJacobDanielvisitedTel-Avivlastmonth
01.01.01.01.01.01.01.01.00.00.00.00.00.00.00.00.00.0
10.00.00.00.00.00.00.00.01.01.01.01.00.00.00.00.00.0
20.00.00.00.00.00.00.00.00.00.00.00.01.01.01.01.01.0
\n", - "
" - ], - "text/plain": [ - " the quick brown fox jumps over lazy dog Hello I am Jacob \\\n", - "0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 \n", - "1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 \n", - "2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", - "\n", - " Daniel visited Tel-Aviv last month \n", - "0 0.0 0.0 0.0 0.0 0.0 \n", - "1 0.0 0.0 0.0 0.0 0.0 \n", - "2 1.0 1.0 1.0 1.0 1.0 " - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "one_hot" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The table above represents the one-hot encoding of our sentences, each row is a sentence and each column is a word.\n", - "this representation is very slim and will be a very weak learning dataset." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Introducing Bert embeddings" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import import_function, auto_mount" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "# importing the function from the hub\n", - "fn = import_function(\"hub://bert_embeddings\").apply(auto_mount())" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2023-02-02 09:29:59,002 [info] Starting remote function deploy\n", - "2023-02-02 09:29:59 (info) Deploying function\n", - "2023-02-02 09:29:59 (info) Building\n", - "2023-02-02 09:29:59 (info) Staging files and preparing base images\n", - "2023-02-02 09:29:59 (info) Building processor image\n", - "2023-02-02 09:32:09 (info) Build complete\n", - "2023-02-02 09:32:35 (info) Function deploy complete\n", - "> 2023-02-02 09:32:36,059 [info] successfully deployed function: {'internal_invocation_urls': 
['nuclio-default-bert-embeddings.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['default-bert-embeddings-default.default-tenant.app.cto-office.iguazio-cd1.com/']}\n" - ] - } - ], - "source": [ - "# deploying the function\n", - "addr = fn.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "import requests\n", - "import json\n", - "# sending a request to the function endpoint to get the sentences' embeddings\n", - "resp = requests.post(addr, json=json.dumps(sentences))" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "import pickle\n", - "output_embeddings = pickle.loads(resp.content)" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "embeddings per token shape: (3, 11, 768), pooled embeddings shape: (3, 768)\n" - ] - } - ], - "source": [ - "print(f'embeddings per token shape: {output_embeddings[0].shape}, pooled embeddings shape: {output_embeddings[1].shape}')" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
0123456789...758759760761762763764765766767
0-0.733322-0.2235400.3424620.383463-0.1647960.0405220.8028450.1528420.331639-0.999779...0.2065640.2314150.1964330.7979080.4351750.7493700.2460980.427603-0.5773840.842063
1-0.953005-0.535132-0.7438220.8939340.646276-0.2793880.9435130.275504-0.555109-0.999992...0.582386-0.0046140.9760790.931517-0.3914420.5303840.675933-0.682721-0.7463390.957809
2-0.843678-0.453405-0.8260110.6508050.494036-0.1541170.8216420.349507-0.650629-0.999978...0.618286-0.3367000.9362620.857577-0.7874890.2461370.676243-0.612532-0.7087860.840879
\n", - "

3 rows × 768 columns

\n", - "
" - ], - "text/plain": [ - " 0 1 2 3 4 5 6 \\\n", - "0 -0.733322 -0.223540 0.342462 0.383463 -0.164796 0.040522 0.802845 \n", - "1 -0.953005 -0.535132 -0.743822 0.893934 0.646276 -0.279388 0.943513 \n", - "2 -0.843678 -0.453405 -0.826011 0.650805 0.494036 -0.154117 0.821642 \n", - "\n", - " 7 8 9 ... 758 759 760 761 \\\n", - "0 0.152842 0.331639 -0.999779 ... 0.206564 0.231415 0.196433 0.797908 \n", - "1 0.275504 -0.555109 -0.999992 ... 0.582386 -0.004614 0.976079 0.931517 \n", - "2 0.349507 -0.650629 -0.999978 ... 0.618286 -0.336700 0.936262 0.857577 \n", - "\n", - " 762 763 764 765 766 767 \n", - "0 0.435175 0.749370 0.246098 0.427603 -0.577384 0.842063 \n", - "1 -0.391442 0.530384 0.675933 -0.682721 -0.746339 0.957809 \n", - "2 -0.787489 0.246137 0.676243 -0.612532 -0.708786 0.840879 \n", - "\n", - "[3 rows x 768 columns]" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "pd.DataFrame(output_embeddings[1])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "we can see that the size of the first dimension of the outputs is three since we passed in three sequences. Also the intermediate dimension of the first output is the maximal number of tokens across all input sequences. Sequences with less tokens are padded with zero values.
\n", - "Note that the first input has an intermediate dimension of size 11 that corresponds to the number of max tokens in the input sequence after addition of two special tokens marking beginning and end of a sequence by the tokenizer.
\n", - "The last dimension for both is of size 768 which is the embedding dimension for this default configuration of bert.
\n", - "Now you tell me, which encoding are you gonna use in your project ??" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/bert_embeddings/bert_embeddings.py b/bert_embeddings/bert_embeddings.py deleted file mode 100644 index 109081b1b..000000000 --- a/bert_embeddings/bert_embeddings.py +++ /dev/null @@ -1,41 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import json -import pickle - -import torch -from transformers import BertModel, BertTokenizer - - -def init_context(context): - tokenizer = BertTokenizer.from_pretrained("bert-base-uncased") - model = BertModel.from_pretrained("bert-base-uncased") - model.eval() - - setattr(context.user_data, "tokenizer", tokenizer) - setattr(context.user_data, "model", model) - - -def handler(context, event): - docs = json.loads(event.body) - docs = [doc.lower() for doc in docs] - docs = context.user_data.tokenizer.batch_encode_plus( - docs, pad_to_max_length=True, return_tensors="pt" - ) - - with torch.no_grad(): - embeddings = context.user_data.model(**docs) - embeddings = [embeddings[0].numpy(), embeddings[1].numpy()] - return pickle.dumps(embeddings) diff --git a/bert_embeddings/function.yaml b/bert_embeddings/function.yaml deleted file mode 100644 index 15319c160..000000000 --- a/bert_embeddings/function.yaml +++ /dev/null @@ -1,42 +0,0 @@ -kind: remote -metadata: - name: bert-embeddings - tag: '' - hash: ecf6647fe4716e0df54ce50278b735034536a568 - project: '' - labels: - framework: pytorch - categories: - - huggingface - - machine-learning - - data-preparation - - pytorch -spec: - command: '' - args: [] - image: mlrun/mlrun - build: - functionSourceCode: 
IyBDb3B5cmlnaHQgMjAxOSBJZ3VhemlvCiMKIyBMaWNlbnNlZCB1bmRlciB0aGUgQXBhY2hlIExpY2Vuc2UsIFZlcnNpb24gMi4wICh0aGUgIkxpY2Vuc2UiKTsKIyB5b3UgbWF5IG5vdCB1c2UgdGhpcyBmaWxlIGV4Y2VwdCBpbiBjb21wbGlhbmNlIHdpdGggdGhlIExpY2Vuc2UuCiMgWW91IG1heSBvYnRhaW4gYSBjb3B5IG9mIHRoZSBMaWNlbnNlIGF0CiMKIyAgICAgaHR0cDovL3d3dy5hcGFjaGUub3JnL2xpY2Vuc2VzL0xJQ0VOU0UtMi4wCiMKIyBVbmxlc3MgcmVxdWlyZWQgYnkgYXBwbGljYWJsZSBsYXcgb3IgYWdyZWVkIHRvIGluIHdyaXRpbmcsIHNvZnR3YXJlCiMgZGlzdHJpYnV0ZWQgdW5kZXIgdGhlIExpY2Vuc2UgaXMgZGlzdHJpYnV0ZWQgb24gYW4gIkFTIElTIiBCQVNJUywKIyBXSVRIT1VUIFdBUlJBTlRJRVMgT1IgQ09ORElUSU9OUyBPRiBBTlkgS0lORCwgZWl0aGVyIGV4cHJlc3Mgb3IgaW1wbGllZC4KIyBTZWUgdGhlIExpY2Vuc2UgZm9yIHRoZSBzcGVjaWZpYyBsYW5ndWFnZSBnb3Zlcm5pbmcgcGVybWlzc2lvbnMgYW5kCiMgbGltaXRhdGlvbnMgdW5kZXIgdGhlIExpY2Vuc2UuCiMKaW1wb3J0IGpzb24KaW1wb3J0IHBpY2tsZQoKaW1wb3J0IHRvcmNoCmZyb20gdHJhbnNmb3JtZXJzIGltcG9ydCBCZXJ0TW9kZWwsIEJlcnRUb2tlbml6ZXIKCgpkZWYgaW5pdF9jb250ZXh0KGNvbnRleHQpOgogICAgdG9rZW5pemVyID0gQmVydFRva2VuaXplci5mcm9tX3ByZXRyYWluZWQoImJlcnQtYmFzZS11bmNhc2VkIikKICAgIG1vZGVsID0gQmVydE1vZGVsLmZyb21fcHJldHJhaW5lZCgiYmVydC1iYXNlLXVuY2FzZWQiKQogICAgbW9kZWwuZXZhbCgpCgogICAgc2V0YXR0cihjb250ZXh0LnVzZXJfZGF0YSwgInRva2VuaXplciIsIHRva2VuaXplcikKICAgIHNldGF0dHIoY29udGV4dC51c2VyX2RhdGEsICJtb2RlbCIsIG1vZGVsKQoKCmRlZiBoYW5kbGVyKGNvbnRleHQsIGV2ZW50KToKICAgIGRvY3MgPSBqc29uLmxvYWRzKGV2ZW50LmJvZHkpCiAgICBkb2NzID0gW2RvYy5sb3dlcigpIGZvciBkb2MgaW4gZG9jc10KICAgIGRvY3MgPSBjb250ZXh0LnVzZXJfZGF0YS50b2tlbml6ZXIuYmF0Y2hfZW5jb2RlX3BsdXMoCiAgICAgICAgZG9jcywgcGFkX3RvX21heF9sZW5ndGg9VHJ1ZSwgcmV0dXJuX3RlbnNvcnM9InB0IgogICAgKQoKICAgIHdpdGggdG9yY2gubm9fZ3JhZCgpOgogICAgICAgIGVtYmVkZGluZ3MgPSBjb250ZXh0LnVzZXJfZGF0YS5tb2RlbCgqKmRvY3MpCiAgICBlbWJlZGRpbmdzID0gW2VtYmVkZGluZ3NbMF0ubnVtcHkoKSwgZW1iZWRkaW5nc1sxXS5udW1weSgpXQogICAgcmV0dXJuIHBpY2tsZS5kdW1wcyhlbWJlZGRpbmdzKQo= - commands: [] - code_origin: '' - origin_filename: '' - requirements: - - torch - description: Get BERT based embeddings for given text - default_handler: '' - disable_auto_mount: false - 
clone_target_dir: '' - env: - - name: MLRUN_HTTPDB__NUCLIO__EXPLICIT_ACK - value: enabled - priority_class_name: '' - preemption_mode: prevent - min_replicas: 1 - max_replicas: 4 - source: '' - function_handler: bert_embeddings:handler - base_image_pull: false - affinity: null - tolerations: null - security_context: {} -verbose: false diff --git a/bert_embeddings/item.yaml b/bert_embeddings/item.yaml deleted file mode 100644 index f96e54eae..000000000 --- a/bert_embeddings/item.yaml +++ /dev/null @@ -1,28 +0,0 @@ -apiVersion: v1 -categories: -- huggingface -- machine-learning -- data-preparation -- pytorch -description: Get BERT based embeddings for given text -doc: '' -example: bert_embeddings.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - framework: pytorch -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.4.1 -name: bert-embeddings -platformVersion: 3.5.3 -spec: - filename: bert_embeddings.py - handler: handler - image: mlrun/mlrun - kind: nuclio - requirements: - - torch -url: '' -version: 1.3.0 diff --git a/bert_embeddings/requirements.txt b/bert_embeddings/requirements.txt deleted file mode 100644 index 747b7aa97..000000000 --- a/bert_embeddings/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -transformers \ No newline at end of file diff --git a/bert_embeddings/test_bert_embeddings.py b/bert_embeddings/test_bert_embeddings.py deleted file mode 100644 index 7ad9101cc..000000000 --- a/bert_embeddings/test_bert_embeddings.py +++ /dev/null @@ -1,32 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -# -from bert_embeddings import init_context,handler -import nuclio -import json -import pickle -import numpy as np - -ARCHIVE = "https://archive.ics.uci.edu/ml/machine-learning-databases/00280/HIGGS.csv.gz" -ARTIFACTS_PATH = 'artifacts' - - -def test_bert_embeddings(): - event = nuclio.Event(body=json.dumps(['John loves Mary'])) - ctx = nuclio.Context() - init_context(ctx) - outputs = pickle.loads(handler(ctx, event)) - assert (True if abs(np.mean(outputs[0]) - -0.011996539) <= 0.0001 else False) is True - assert (True if abs(np.mean(outputs[0]) - -0.011996539) > 0 else False) is True - diff --git a/catalog.json b/catalog.json index 6fff9830c..4bcc4022d 100644 --- a/catalog.json +++ b/catalog.json @@ -1 +1 @@ -{"aggregate": {"description": "Rolling aggregation over Metrics and Lables according to specifications", "categories": ["data-prep"], "kind": "job", "docfile": "aggregate/aggregate.ipynb", "versions": {"latest": "aggregate/function.yaml"}}, "arc-to-parquet": {"description": "retrieve remote archive, open and save as parquet", "categories": ["data-movement", "utils"], "kind": "job", "docfile": "arc_to_parquet/arc_to_parquet.ipynb", "versions": {"latest": "arc_to_parquet/function.yaml"}}, "bert-embeddings": {"description": "Get BERT based embeddings for given text", "categories": ["NLP", "BERT", "embeddings"], "kind": "remote", "docfile": "bert_embeddings/bert_embeddings.ipynb", "versions": {"latest": "bert_embeddings/function.yaml"}}, "churn-server": {"description": "churn classification and predictor", "categories": ["serving", "ml"], "kind": "serving", "docfile": "churn_server/churn_server.ipynb", "versions": {"latest": "churn_server/function.yaml"}}, "concept-drift": {"description": "Deploy a streaming Concept Drift detector on a labeled stream", "categories": ["ml", "serve"], "kind": "job", "docfile": "concept_drift/concept_drift.ipynb", "versions": 
{"latest": "concept_drift/function.yaml"}}, "concept-drift-streaming": {"description": "Deploy a streaming Concept Drift detector on a labeled stream. the nuclio part of the concept_drift function", "categories": ["ml", "serve"], "kind": "remote", "docfile": "concept_drift_streaming/concept_drift_streaming.ipynb", "versions": {"latest": "concept_drift_streaming/function.yaml"}}, "coxph-test": {"description": "Test cox proportional hazards model", "categories": ["ml", "test"], "kind": "job", "docfile": "coxph_test/coxph_test.ipynb", "versions": {"latest": "coxph_test/function.yaml"}}, "coxph-trainer": {"description": "cox proportional hazards, kaplan meier plots", "categories": ["training", "ml"], "kind": "job", "docfile": "coxph_trainer/coxph_trainer.ipynb", "versions": {"latest": "coxph_trainer/function.yaml"}}, "describe": {"description": "describe and visualizes dataset stats", "categories": ["analysis"], "kind": "job", "docfile": "describe/describe.ipynb", "versions": {"latest": "describe/function.yaml"}}, "describe-dask": {"description": "describe and visualizes dataset stats", "categories": ["analysis"], "kind": "job", "docfile": "describe_dask/describe_dask.ipynb", "versions": {"latest": "describe_dask/function.yaml"}}, "describe-spark": {"description": "", "categories": [], "kind": "job", "docfile": "describe_spark/describe_spark.ipynb", "versions": {"latest": "describe_spark/function.yaml"}}, "feature-perms": {"description": "estimate feature importances using permutations", "categories": ["analysis"], "kind": "job", "docfile": "feature_perms/feature_perms.ipynb", "versions": {"latest": "feature_perms/function.yaml"}}, "feature-selection": {"description": "Select features through multiple Statistical and Model filters", "categories": ["data-prep", "ml"], "kind": "job", "docfile": "feature_selection/feature_selection.ipynb", "versions": {"latest": "feature_selection/function.yaml"}}, "gen-class-data": {"description": "Create a binary classification sample 
dataset and save.", "categories": ["data-prep"], "kind": "job", "docfile": "gen_class_data/gen_class_data.ipynb", "versions": {"latest": "gen_class_data/function.yaml"}}, "github-utils": {"description": "add comments to github pull request", "categories": ["notifications", "utils"], "kind": "job", "docfile": "github_utils/github_utils.ipynb", "versions": {"latest": "github_utils/function.yaml"}}, "load-dask": {"description": "load dask cluster with data", "categories": ["data-movement", "utils"], "kind": "dask", "docfile": "load_dask/load_dask.ipynb", "versions": {"latest": "load_dask/function.yaml"}}, "load-dataset": {"description": "load a toy dataset from scikit-learn", "categories": ["data-source", "ml"], "kind": "job", "docfile": "load_dataset/load_dataset.ipynb", "versions": {"latest": "load_dataset/function.yaml"}}, "model-monitoring-batch": {"description": "", "categories": [], "kind": "job", "docfile": "model_monitoring_batch/model_monitoring_batch.ipynb", "versions": {"latest": "model_monitoring_batch/function.yaml"}}, "model-monitoring-stream": {"description": "", "categories": [], "kind": "remote", "docfile": "model_monitoring_stream/model_monitoring_stream.ipynb", "versions": {"latest": "model_monitoring_stream/function.yaml"}}, "model-server": {"description": "generic sklearn model server", "categories": ["serving", "ml"], "kind": "remote", "docfile": "model_server/model_server.ipynb", "versions": {"latest": "model_server/function.yaml"}}, "model-server-tester": {"description": "test model servers", "categories": ["ml", "test"], "kind": "job", "docfile": "model_server_tester/model_server_tester.ipynb", "versions": {"latest": "model_server_tester/function.yaml"}}, "open-archive": {"description": "Open a file/object archive into a target directory", "categories": ["data-movement", "utils"], "kind": "job", "docfile": "open_archive/open_archive.ipynb", "versions": {"latest": "open_archive/function.yaml"}}, "pandas-profiling-report": {"description": 
"Create Pandas Profiling Report from Dataset", "categories": ["analysis"], "kind": "job", "docfile": "pandas_profiling_report/pandas_profiling_report.ipynb", "versions": {"latest": "pandas_profiling_report/function.yaml"}}, "project-runner": {"description": "Nuclio based - Cron scheduler for running your MLRun projects", "categories": ["utils"], "kind": "remote", "docfile": "project_runner/project_runner.ipynb", "versions": {"latest": "project_runner/function.yaml"}}, "rnn-serving": {"description": "deploy an rnn based stock analysis model server.", "categories": ["model-serving"], "kind": "serving", "docfile": "rnn_serving/rnn_serving.ipynb", "versions": {"latest": "rnn_serving/function.yaml"}}, "send-email": {"description": "Send Email messages through SMTP server", "categories": ["notifications"], "kind": "job", "docfile": "send_email/send_email.ipynb", "versions": {"latest": "send_email/function.yaml"}}, "sentiment-analysis-serving": {"description": "BERT based sentiment classification model", "categories": ["serving", "NLP", "BERT", "sentiment analysis"], "kind": "serving", "docfile": "sentiment_analysis_serving/sentiment_analysis_serving.ipynb", "versions": {"latest": "sentiment_analysis_serving/function.yaml"}}, "sklearn-classifier": {"description": "train any classifier using scikit-learn's API", "categories": ["ml", "training"], "kind": "job", "docfile": "sklearn_classifier/sklearn_classifier.ipynb", "versions": {"latest": "sklearn_classifier/function.yaml"}}, "sklearn-classifier-dask": {"description": "train any classifier using scikit-learn's API over Dask", "categories": ["ml", "training", "dask"], "kind": "job", "docfile": "sklearn_classifier_dask/sklearn_classifier_dask.ipynb", "versions": {"latest": "sklearn_classifier_dask/function.yaml"}}, "slack-notify": {"description": "Send Slack notification", "categories": ["ops"], "kind": "job", "docfile": "slack_notify/slack_notify.ipynb", "versions": {"latest": "slack_notify/function.yaml"}}, 
"spark-submit": {"description": "", "categories": [], "kind": "job", "docfile": "spark_submit/spark_submit.ipynb", "versions": {"latest": "spark_submit/function.yaml"}}, "sql-to-file": {"description": "SQL To File - Ingest data using SQL query", "categories": ["data-prep"], "kind": "job", "docfile": "sql_to_file/sql_to_file.ipynb", "versions": {"latest": "sql_to_file/function.yaml"}}, "stream-to-parquet": {"description": "Saves a stream to Parquet and can lunch drift detection task on it", "categories": ["ml", "serve"], "kind": "remote", "docfile": "stream_to_parquet/stream_to_parquet.ipynb", "versions": {"latest": "stream_to_parquet/function.yaml"}}, "test-classifier": {"description": "test a classifier using held-out or new data", "categories": ["ml", "test"], "kind": "job", "docfile": "test_classifier/test_classifier.ipynb", "versions": {"latest": "test_classifier/function.yaml"}}, "tf1-serving": {"description": "tf1 image classification server", "categories": ["serving", "dl"], "kind": "remote", "docfile": "tf1_serving/tf1_serving.ipynb", "versions": {"latest": "tf1_serving/function.yaml"}}, "tf2-serving": {"description": "tf2 image classification server", "categories": ["serving", "dl"], "kind": "remote", "docfile": "tf2_serving/tf2_serving.ipynb", "versions": {"latest": "tf2_serving/function.yaml"}}, "tf2-serving-v2": {"description": "tf2 image classification server v2", "categories": ["serving", "dl"], "kind": "serving", "docfile": "tf2_serving_v2/tf2_serving_v2.ipynb", "versions": {"latest": "tf2_serving_v2/function.yaml"}}, "v2-model-server": {"description": "generic sklearn model server", "categories": ["serving", "ml"], "kind": "serving", "docfile": "v2_model_server/v2_model_server.ipynb", "versions": {"latest": "v2_model_server/function.yaml"}}, "v2-model-tester": {"description": "test v2 model servers", "categories": ["ml", "test"], "kind": "job", "docfile": "v2_model_tester/v2_model_tester.ipynb", "versions": {"latest": 
"v2_model_tester/function.yaml"}}, "virtual-drift": {"description": "Compute drift magnitude between Time-Samples T and U", "categories": ["ml", "serve", "concept-drift"], "kind": "job", "docfile": "virtual_drift/virtual_drift.ipynb", "versions": {"latest": "virtual_drift/function.yaml"}}, "xgb-custom": {"description": "simulate data with outliers.", "categories": ["model-testing"], "kind": "job", "docfile": "xgb_custom/xgb_custom.ipynb", "versions": {"latest": "xgb_custom/function.yaml"}}, "xgb-serving": {"description": "deploy an XGBoost model server.", "categories": ["model-serving"], "kind": "remote", "docfile": "xgb_serving/xgb_serving.ipynb", "versions": {"latest": "xgb_serving/function.yaml"}}, "xgb-test": {"description": "Test one or more classifier models against held-out dataset.", "categories": ["model-test"], "kind": "job", "docfile": "xgb_test/xgb_test.ipynb", "versions": {"latest": "xgb_test/function.yaml"}}, "xgb-trainer": {"description": "train multiple model types using xgboost.", "categories": ["model-prep"], "kind": "job", "docfile": "xgb_trainer/xgb_trainer.ipynb", "versions": {"latest": "xgb_trainer/function.yaml"}}} \ No newline at end of file +{"aggregate": {"description": "Rolling aggregation over Metrics and Lables according to specifications", "categories": ["data-prep"], "kind": "job", "docfile": "aggregate/aggregate.ipynb", "versions": {"latest": "aggregate/function.yaml"}}, "arc-to-parquet": {"description": "retrieve remote archive, open and save as parquet", "categories": ["data-movement", "utils"], "kind": "job", "docfile": "arc_to_parquet/arc_to_parquet.ipynb", "versions": {"latest": "arc_to_parquet/function.yaml"}}, "bert-embeddings": {"description": "Get BERT based embeddings for given text", "categories": ["NLP", "BERT", "embeddings"], "kind": "remote", "docfile": "bert_embeddings/bert_embeddings.ipynb", "versions": {"latest": "bert_embeddings/function.yaml"}}, "churn-server": {"description": "churn classification and predictor", 
"categories": ["serving", "ml"], "kind": "serving", "docfile": "churn_server/churn_server.ipynb", "versions": {"latest": "churn_server/function.yaml"}}, "concept-drift": {"description": "Deploy a streaming Concept Drift detector on a labeled stream", "categories": ["ml", "serve"], "kind": "job", "docfile": "concept_drift/concept_drift.ipynb", "versions": {"latest": "concept_drift/function.yaml"}}, "concept-drift-streaming": {"description": "Deploy a streaming Concept Drift detector on a labeled stream. the nuclio part of the concept_drift function", "categories": ["ml", "serve"], "kind": "remote", "docfile": "concept_drift_streaming/concept_drift_streaming.ipynb", "versions": {"latest": "concept_drift_streaming/function.yaml"}}, "coxph-test": {"description": "Test cox proportional hazards model", "categories": ["ml", "test"], "kind": "job", "docfile": "coxph_test/coxph_test.ipynb", "versions": {"latest": "coxph_test/function.yaml"}}, "coxph-trainer": {"description": "cox proportional hazards, kaplan meier plots", "categories": ["training", "ml"], "kind": "job", "docfile": "coxph_trainer/coxph_trainer.ipynb", "versions": {"latest": "coxph_trainer/function.yaml"}}, "describe": {"description": "describe and visualizes dataset stats", "categories": ["analysis"], "kind": "job", "docfile": "describe/describe.ipynb", "versions": {"latest": "describe/function.yaml"}}, "describe-dask": {"description": "describe and visualizes dataset stats", "categories": ["analysis"], "kind": "job", "docfile": "describe_dask/describe_dask.ipynb", "versions": {"latest": "describe_dask/function.yaml"}}, "describe-spark": {"description": "", "categories": [], "kind": "job", "docfile": "describe_spark/describe_spark.ipynb", "versions": {"latest": "describe_spark/function.yaml"}}, "feature-perms": {"description": "estimate feature importances using permutations", "categories": ["analysis"], "kind": "job", "docfile": "feature_perms/feature_perms.ipynb", "versions": {"latest": 
"feature_perms/function.yaml"}}, "feature-selection": {"description": "Select features through multiple Statistical and Model filters", "categories": ["data-prep", "ml"], "kind": "job", "docfile": "feature_selection/feature_selection.ipynb", "versions": {"latest": "feature_selection/function.yaml"}}, "gen-class-data": {"description": "Create a binary classification sample dataset and save.", "categories": ["data-prep"], "kind": "job", "docfile": "gen_class_data/gen_class_data.ipynb", "versions": {"latest": "gen_class_data/function.yaml"}}, "github-utils": {"description": "add comments to github pull request", "categories": ["notifications", "utils"], "kind": "job", "docfile": "github_utils/github_utils.ipynb", "versions": {"latest": "github_utils/function.yaml"}}, "load-dataset": {"description": "load a toy dataset from scikit-learn", "categories": ["data-source", "ml"], "kind": "job", "docfile": "load_dataset/load_dataset.ipynb", "versions": {"latest": "load_dataset/function.yaml"}}, "model-monitoring-batch": {"description": "", "categories": [], "kind": "job", "docfile": "model_monitoring_batch/model_monitoring_batch.ipynb", "versions": {"latest": "model_monitoring_batch/function.yaml"}}, "model-monitoring-stream": {"description": "", "categories": [], "kind": "remote", "docfile": "model_monitoring_stream/model_monitoring_stream.ipynb", "versions": {"latest": "model_monitoring_stream/function.yaml"}}, "model-server": {"description": "generic sklearn model server", "categories": ["serving", "ml"], "kind": "remote", "docfile": "model_server/model_server.ipynb", "versions": {"latest": "model_server/function.yaml"}}, "model-server-tester": {"description": "test model servers", "categories": ["ml", "test"], "kind": "job", "docfile": "model_server_tester/model_server_tester.ipynb", "versions": {"latest": "model_server_tester/function.yaml"}}, "open-archive": {"description": "Open a file/object archive into a target directory", "categories": ["data-movement", "utils"], 
"kind": "job", "docfile": "open_archive/open_archive.ipynb", "versions": {"latest": "open_archive/function.yaml"}}, "pandas-profiling-report": {"description": "Create Pandas Profiling Report from Dataset", "categories": ["analysis"], "kind": "job", "docfile": "pandas_profiling_report/pandas_profiling_report.ipynb", "versions": {"latest": "pandas_profiling_report/function.yaml"}}, "project-runner": {"description": "Nuclio based - Cron scheduler for running your MLRun projects", "categories": ["utils"], "kind": "remote", "docfile": "project_runner/project_runner.ipynb", "versions": {"latest": "project_runner/function.yaml"}}, "rnn-serving": {"description": "deploy an rnn based stock analysis model server.", "categories": ["model-serving"], "kind": "serving", "docfile": "rnn_serving/rnn_serving.ipynb", "versions": {"latest": "rnn_serving/function.yaml"}}, "send-email": {"description": "Send Email messages through SMTP server", "categories": ["notifications"], "kind": "job", "docfile": "send_email/send_email.ipynb", "versions": {"latest": "send_email/function.yaml"}}, "sentiment-analysis-serving": {"description": "BERT based sentiment classification model", "categories": ["serving", "NLP", "BERT", "sentiment analysis"], "kind": "serving", "docfile": "sentiment_analysis_serving/sentiment_analysis_serving.ipynb", "versions": {"latest": "sentiment_analysis_serving/function.yaml"}}, "sklearn-classifier": {"description": "train any classifier using scikit-learn's API", "categories": ["ml", "training"], "kind": "job", "docfile": "sklearn_classifier/sklearn_classifier.ipynb", "versions": {"latest": "sklearn_classifier/function.yaml"}}, "sklearn-classifier-dask": {"description": "train any classifier using scikit-learn's API over Dask", "categories": ["ml", "training", "dask"], "kind": "job", "docfile": "sklearn_classifier_dask/sklearn_classifier_dask.ipynb", "versions": {"latest": "sklearn_classifier_dask/function.yaml"}}, "slack-notify": {"description": "Send Slack 
notification", "categories": ["ops"], "kind": "job", "docfile": "slack_notify/slack_notify.ipynb", "versions": {"latest": "slack_notify/function.yaml"}}, "spark-submit": {"description": "", "categories": [], "kind": "job", "docfile": "spark_submit/spark_submit.ipynb", "versions": {"latest": "spark_submit/function.yaml"}}, "sql-to-file": {"description": "SQL To File - Ingest data using SQL query", "categories": ["data-prep"], "kind": "job", "docfile": "sql_to_file/sql_to_file.ipynb", "versions": {"latest": "sql_to_file/function.yaml"}}, "stream-to-parquet": {"description": "Saves a stream to Parquet and can lunch drift detection task on it", "categories": ["ml", "serve"], "kind": "remote", "docfile": "stream_to_parquet/stream_to_parquet.ipynb", "versions": {"latest": "stream_to_parquet/function.yaml"}}, "test-classifier": {"description": "test a classifier using held-out or new data", "categories": ["ml", "test"], "kind": "job", "docfile": "test_classifier/test_classifier.ipynb", "versions": {"latest": "test_classifier/function.yaml"}}, "tf1-serving": {"description": "tf1 image classification server", "categories": ["serving", "dl"], "kind": "remote", "docfile": "tf1_serving/tf1_serving.ipynb", "versions": {"latest": "tf1_serving/function.yaml"}}, "tf2-serving": {"description": "tf2 image classification server", "categories": ["serving", "dl"], "kind": "remote", "docfile": "tf2_serving/tf2_serving.ipynb", "versions": {"latest": "tf2_serving/function.yaml"}}, "tf2-serving-v2": {"description": "tf2 image classification server v2", "categories": ["serving", "dl"], "kind": "serving", "docfile": "tf2_serving_v2/tf2_serving_v2.ipynb", "versions": {"latest": "tf2_serving_v2/function.yaml"}}, "v2-model-server": {"description": "generic sklearn model server", "categories": ["serving", "ml"], "kind": "serving", "docfile": "v2_model_server/v2_model_server.ipynb", "versions": {"latest": "v2_model_server/function.yaml"}}, "v2-model-tester": {"description": "test v2 model 
servers", "categories": ["ml", "test"], "kind": "job", "docfile": "v2_model_tester/v2_model_tester.ipynb", "versions": {"latest": "v2_model_tester/function.yaml"}}, "virtual-drift": {"description": "Compute drift magnitude between Time-Samples T and U", "categories": ["ml", "serve", "concept-drift"], "kind": "job", "docfile": "virtual_drift/virtual_drift.ipynb", "versions": {"latest": "virtual_drift/function.yaml"}}, "xgb-custom": {"description": "simulate data with outliers.", "categories": ["model-testing"], "kind": "job", "docfile": "xgb_custom/xgb_custom.ipynb", "versions": {"latest": "xgb_custom/function.yaml"}}, "xgb-serving": {"description": "deploy an XGBoost model server.", "categories": ["model-serving"], "kind": "remote", "docfile": "xgb_serving/xgb_serving.ipynb", "versions": {"latest": "xgb_serving/function.yaml"}}, "xgb-test": {"description": "Test one or more classifier models against held-out dataset.", "categories": ["model-test"], "kind": "job", "docfile": "xgb_test/xgb_test.ipynb", "versions": {"latest": "xgb_test/function.yaml"}}, "xgb-trainer": {"description": "train multiple model types using xgboost.", "categories": ["model-prep"], "kind": "job", "docfile": "xgb_trainer/xgb_trainer.ipynb", "versions": {"latest": "xgb_trainer/function.yaml"}}} \ No newline at end of file diff --git a/catalog.yaml b/catalog.yaml index 2b9d7c6b0..c3364fefa 100644 --- a/catalog.yaml +++ b/catalog.yaml @@ -128,15 +128,6 @@ github-utils: kind: job versions: latest: github_utils/function.yaml -load-dask: - categories: - - data-movement - - utils - description: load dask cluster with data - docfile: load_dask/load_dask.ipynb - kind: dask - versions: - latest: load_dask/function.yaml load-dataset: categories: - data-source diff --git a/concept_drift/README.md b/concept_drift/README.md deleted file mode 100644 index 92e6d893e..000000000 --- a/concept_drift/README.md +++ /dev/null @@ -1,132 +0,0 @@ -# Concept Drift - -**Concept drift** is a change in the statistical 
properties of the **target variable** over time. - -When deploying our models to production, we must ensure our models perform as we expect them to - reaching the same level of performance we have seen on our test sets or at least performing at the same quality as when they were deployed. - -However, often this is not the case. There are many factors that can affect our model's performance, like seasonality or any unknown root causes that will change the laws underlying our data and invalidate some assumptions made by the model. - -We offer this function to help combat Concept Drift with implementations of streaming DDM, EDDM and PH concept drift detectors. - -## How to integrate - -This function is made of two parts: - -1. Kubernetes job to instantiate the selected models with a provided base dataset (the test dataset could be used) -2. [Nuclio serverless function](../concept_drift_streaming/concept_drift_streaming.ipynb) listening on a _labeled stream_, which will be deployed from this function after the models' initialization, run the models per event and provide the necessary alerts. - -There are two steps to integrate successfully with your workflow: - -1. Provide a stream where each event contains the joined **label** and **prediction** for that specific event. -2. Add this function to the workflow with the following params: - -```markdown -:param context: MLRun context -:param base_dataset: Dataset containing label_col and prediction_col to initialize the detectors -:param input_stream: labeled stream to track. - Should contain label_col and prediction_col -:param output_stream: Output stream to push the detector's alerts -:param output_tsdb: Output TSDB table to allow analysis and display -:param tsdb_batch_size: Batch size of alerts to buffer before pushing to the TSDB -:param callbacks: Additional rest endpoints to send the alert data to -:param models: List of the detectors to deploy - Defaults to ['ddm', 'eddm', 'pagehinkley']. 
-:param models_dest: Location for saving the detectors - Defaults to 'models' (in relation to artifact_path). -:param pagehinkley_threshold: Drift level threshold for PH detector Defaults to 10. -:param ddm_warning_level: Warning level alert for DDM detector Defaults to 2. -:param ddm_out_control_level: Drift level alert for DDM detector Defaults to 3. -:param label_col: Label column to be used on base_dataset and input_stream - Defaults to 'label'. -:param prediction_col: Prediction column to be used on base_dataset and input_stream - Defaults to 'prediction'. -:param hub_url: hub_url in case the default is not used, concept_drift_streaming will be loaded - by this url - Defaults to mlconf.hub_url. -:param fn_tag: hub tag to use - Defaults to 'master' -``` - -## Algorithms - -We offer to deploy up to 3 concept drift streaming detectors - -### DDM - Drift Detection Method - -Models the **Number of errors** as a **binomial** variable. This enables us to confine the expected number of errors in a prediction stream window to within some standard deviation. - -- Good for **abrupt** drift changes - -
 - -![$\mu=np_t$](https://latex.codecogs.com/svg.latex?\mu=np_t) - -![$\sigma=\sqrt{\frac{p_t(1-p_t)}{n}}$]() - -
- -**Alert** when: - -
- -![$p_t+\sigma_t\ge{p_{min}+3\sigma_{min}}$](https://latex.codecogs.com/svg.latex?p_t+\sigma_t\ge{p_{min}+3\sigma_{min}}) - -
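The DDM rule above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code shipped with this function: the class name `SimpleDDM` and the `min_samples` warm-up are hypothetical, while the two level defaults mirror `ddm_warning_level` (2) and `ddm_out_control_level` (3) from the parameter list.

```python
import math


class SimpleDDM:
    """Illustrative DDM sketch (hypothetical helper, not the hub implementation)."""

    def __init__(self, warning_level=2.0, out_control_level=3.0, min_samples=30):
        self.warning_level = warning_level          # assumed to match ddm_warning_level
        self.out_control_level = out_control_level  # assumed to match ddm_out_control_level
        self.min_samples = min_samples              # assumption: warm-up before testing
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, is_error):
        """Consume one event (True if label != prediction) and return a status."""
        self.n += 1
        self.errors += int(is_error)
        # Wait until there are enough samples and at least one error.
        if self.n < self.min_samples or self.errors == 0:
            return "stable"
        p = self.errors / self.n                 # running error rate p_t
        s = math.sqrt(p * (1.0 - p) / self.n)    # its standard deviation sigma_t
        if p + s <= self.p_min + self.s_min:
            # New best operating point: remember it, nothing to report.
            self.p_min, self.s_min = p, s
            return "stable"
        if p + s >= self.p_min + self.out_control_level * self.s_min:
            return "drift"
        if p + s >= self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"
```

Calling `update(label != prediction)` once per event yields a status per event; a real detector would typically reset its statistics after reporting a drift, which this sketch leaves out.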
 - -### EDDM - Early Drift Detection Method - -Uses the distance between two consecutive errors. - -- Works better for **gradual** drift changes. -- More sensitive than DDM to noise -- Requires a minimal number of errors to initialize the statistics. - -**Warning**: - -
- -![$\frac{p_t+2\sigma_t}{p_{max}+2\sigma_{max}}<0.95$](https://latex.codecogs.com/svg.latex?\frac{p_t+2\sigma_t}{p_{max}+2\sigma_{max}}<0.95) - -
- -**Alert**: - -
- -![$\frac{p_t+2\sigma_t}{p_{max}+2\sigma_{max}}<0.90$](https://latex.codecogs.com/svg.latex?\frac{p_t+2\sigma_t}{p_{max}+2\sigma_{max}}<0.90) - -
 - -### PageHinkley Test: - -The PageHinkley test is a sequential analysis technique typically used for change detection. (The test was designed to detect a change in the average of a Gaussian signal.) In this test we use: -x*1*, ..., x*n* - labeled dataset -δ - magnitude threshold -λ - detection threshold - -
 - -![$\hat{x_T}=\frac{1}{T}\sum_{t=1}^{T}{x_t}$](https://latex.codecogs.com/svg.latex?\hat{x_T}=\frac{1}{T}\sum_{t=1}^{T}{x_t}) - -![$U_T=\sum_{t=1}^T{x_t-\hat{x_T}-\delta}$](https://latex.codecogs.com/svg.latex?U_T=\sum_{t=1}^T{x_t-\hat{x_T}-\delta}) - -![$m_T=min(U_t,t=1..T)$]() - -
- -**Alert**: - -
- -![$U_T-m_T>\lambda$](https://latex.codecogs.com/svg.latex?U_T-m_T>\lambda) - -
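The PageHinkley quantities above can be computed incrementally. The following is a minimal sketch, not the detector used by this function: the class name `SimplePageHinkley` and the `delta` default are assumptions, the `threshold` default follows `pagehinkley_threshold` (10), and the running mean at each step stands in for the final mean in the formula, as incremental implementations commonly do.

```python
import math


class SimplePageHinkley:
    """Illustrative Page-Hinkley sketch (hypothetical helper)."""

    def __init__(self, delta=0.005, threshold=10.0):
        self.delta = delta            # magnitude threshold (assumed default)
        self.threshold = threshold    # detection threshold lambda
        self.n = 0
        self.mean = 0.0               # running mean of x
        self.cum = 0.0                # U_T
        self.cum_min = math.inf       # m_T = min over U_t

    def update(self, x):
        """Consume one value and return True when U_T - m_T exceeds lambda."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold
```

Here `x` would be the per-event error indicator (1.0 when label and prediction differ, 0.0 otherwise), so a sustained rise in the error rate pushes `U_T` away from its minimum until the threshold is crossed.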
 - -## Additional resources -[A Study on Change Detection Methods](https://pdfs.semanticscholar.org/bb6e/8a44c0efcd725aae1c0b1817561f6e278c2c.pdf), Raquel Sebastião1,2 and João Gama1,3, 1 LIAAD-INESC Porto L.A., University of Porto -Rua de Ceuta, 118 - 6, 4050-190 Porto, Portugal -2 Faculty of Science, University of Porto -3 Faculty of Economics, University of Porto -{raquel,jgama}@liaad.up.pt - -[MLOps Live #4 - How to Detect & Remediate Drift in Production with MLOps Automation](https://www.youtube.com/watch?v=66_Q7mJZOSc&t=1296s) diff --git a/concept_drift/concept_drift.ipynb b/concept_drift/concept_drift.ipynb deleted file mode 100644 index e9c063b66..000000000 --- a/concept_drift/concept_drift.ipynb +++ /dev/null @@ -1,793 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Concept Drift - Deployer\n", - "Deploy a streaming Concept Drift detector on a labeled stream. \n", - "It will initialize the selected drift detectors with the base_dataset's statistics and deploy the [concept_drift_streaming](https://github.com/mlrun/functions/blob/master/concept_drift_streaming/concept_drift_streaming.ipynb) function from the hub.
\n", - "adding [V3IOStreamTrigger](https://nuclio.io/docs/latest/reference/triggers/v3iostream/) in order to listen to the input_stream." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Steps**\n", - "\n", - "1. [Data exploration](#Data-exploration)\n", - "2. [Creating the input stream](#Creating-the-input-stream)\n", - "3. [Importing the function](#Importing-the-function)\n", - "4. [Running the function remotely](#Running-the-function-remotely)\n", - "5. [Testing the function](#Testing-the-function)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Data exploration**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In order to know about the performance of a drift detector by measuring the different detection metrics, we need to know beforehand where a real drift occurs.
\n", - "This is only possible with synthetic datasets.
The scikit-multiflow framework allows generating several kinds of synthetic data to simulate the occurrence of drifts.
\n", - "[Harvard Dataverse](https://dataverse.harvard.edu) provides further explanations on the [used dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/5OWRGB) along with different kinds of drifted datasets.
\n", - "mixed_0101_abrupto has 4 concepts and 3 drifts at time steps 10000, 20000, and 30000.
\n", - "Our dataset is split into train and test parts: the train part (first 5000 examples) is used to train the model (which is easily generated using [sklearn_classifier](https://github.com/mlrun/functions/blob/master/sklearn_classifier/sklearn_classifier.ipynb)).
\n", - "The test part (which is already predicted by the model) will be pushed to the input stream in order to detect drifts." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
[table: first five rows of the original dataset; columns X1, X2, X3, X4, class]
" - ], - "text/plain": [ - " X1 X2 X3 X4 class\n", - "0 0.0 1.0 0.460101 0.592744 1.0\n", - "1 1.0 1.0 0.588788 0.574984 0.0\n", - "2 0.0 0.0 0.401641 0.679325 1.0\n", - "3 1.0 1.0 0.306076 0.182108 0.0\n", - "4 0.0 0.0 0.962847 0.579245 1.0" - ] - }, - "execution_count": 1, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import pandas as pd\n", - "data_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/concept_drift/mixed_0101_abrupto.csv'\n", - "predicted_train_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/concept_drift/predicted_abrupto_train.csv'\n", - "predicted_test_data_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/concept_drift/predicted_abrupto_test.csv'\n", - "# You can find the model used here\n", - "models_path = 'https://s3.wasabisys.com/iguazio/models/function-marketplace-models/concept_drift/concept_drift_random_forest.pkl'\n", - "original_data = pd.read_csv(data_path)\n", - "original_data.head()" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
[table: last five rows of the predicted test set; columns X1, X2, X3, X4, class, predicted_col]
" - ], - "text/plain": [ - " X1 X2 X3 X4 class predicted_col\n", - "34995 0.0 0.0 0.010106 0.647269 0.0 1.0\n", - "34996 1.0 1.0 0.293651 0.737291 1.0 0.0\n", - "34997 0.0 0.0 0.848546 0.552337 0.0 1.0\n", - "34998 1.0 1.0 0.614754 0.859896 1.0 0.0\n", - "34999 1.0 0.0 0.265306 0.843716 0.0 1.0" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "predicted_test = pd.read_csv(predicted_test_data_path)\n", - "predicted_test.tail()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Creating the input stream**" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "import os \n", - "\n", - "container = os.path.join('/',os.environ['V3IO_HOME'].split('/')[0])\n", - "user = os.environ[\"V3IO_USERNAME\"]\n", - "rel_path = os.getcwd()[6:] + '/artifacts'\n", - "\n", - "base_input_stream = os.path.join(user,rel_path) + \"/inputs_stream\"\n", - "base_output_stream = os.path.join(user,rel_path) + \"/output_stream\"\n", - "input_stream = os.path.join(container,base_input_stream)\n", - "output_stream = os.path.join(container,user,rel_path) + \"/output_stream\"\n", - "tsdb_path = os.path.join(container,user,rel_path) + \"/output_tsdb\"\n", - "\n", - "stream_consumer_group = 'cg45'" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "import v3io.dataplane\n", - "\n", - "client = v3io.dataplane.Client()\n", - "response = client.stream.create(container = container,\n", - " stream_path=base_input_stream,\n", - " shard_count=1,\n", - " raise_for_status = v3io.dataplane.RaiseForStatus.never)\n", - "response.raise_for_status([409, 204])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Importing the function**" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", 
- "text": [ - "> 2021-10-25 10:27:04,105 [info] created and saved project function-marketplace\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Importing the function\n", - "import mlrun\n", - "mlrun.set_environment(project='function-marketplace')\n", - "\n", - "fn = mlrun.import_function(\"hub://concept_drift:development\")\n", - "fn.apply(mlrun.auto_mount())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Running the function remotely**" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-25 10:27:04,567 [info] starting run concept_drift uid=fa07c222e77d4eac86d2ce9317aaded1 DB=http://mlrun-api:8080\n", - "> 2021-10-25 10:27:04,709 [info] Job is running in the background, pod: concept-drift-ggxgb\n", - "> 2021-10-25 10:27:11,199 [info] Loading base dataset\n", - "> 2021-10-25 10:27:13,227 [info] Creating models\n", - "> 2021-10-25 10:27:13,227 [info] Streaming data to models\n", - "> 2021-10-25 10:27:13,347 [info] Logging ready models\n", - "> 2021-10-25 10:27:13,487 [info] Deploying Concept Drift Streaming function\n", - "> 2021-10-25 10:27:13,490 [info] Starting remote function deploy\n", - "2021-10-25 10:27:13 (info) Deploying function\n", - "2021-10-25 10:27:13 (info) Building\n", - "2021-10-25 10:27:13 (info) Staging files and preparing base images\n", - "2021-10-25 10:27:13 (info) Building processor image\n", - "2021-10-25 10:27:15 (info) Build complete\n", - "2021-10-25 10:27:21 (info) Function deploy complete\n", - "> 2021-10-25 10:27:21,797 [info] successfully deployed function: {'internal_invocation_urls': ['nuclio-function-marketplace-concept-drift-streaming.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['default-tenant.app.dev39.lab.iguazeng.com:31143']}\n", - "> 
2021-10-25 10:27:21,868 [info] run executed, status=completed\n", - "final state: completed\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project uid iter start state name labels inputs parameters results artifacts
function-marketplace0Oct 25 10:27:10completedconcept_drift
v3io_user=dani
kind=job
owner=dani
host=concept-drift-ggxgb
base_dataset
input_stream=/users/dani/test/functions/concept_drift/artifacts/inputs_stream
consumer_group=cg45
output_stream=/users/dani/test/functions/concept_drift/artifacts/output_stream
output_tsdb=/users/dani/test/functions/concept_drift/artifacts/output_tsdb
tsdb_batch_size=1
models=['ddm', 'eddm', 'pagehinkley']
label_col=class
prediction_col=predicted_col
fn_tag=development
eddm_concept_drift
pagehinkley_concept_drift
ddm_concept_drift
\n", - "
\n", - "
\n", - "
\n", - " Title\n", - " ×\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-25 10:27:23,031 [info] run executed, status=completed\n" - ] - } - ], - "source": [ - "drift_run = fn.run(name='concept_drift',\n", - " params={'input_stream' : input_stream,\n", - " 'consumer_group' : stream_consumer_group,\n", - " 'output_stream' : output_stream,\n", - " 'output_tsdb' : tsdb_path,\n", - " 'tsdb_batch_size' : 1,\n", - " 'models' : ['ddm', 'eddm', 'pagehinkley'], # defaults\n", - " 'label_col' : 'class',\n", - " 'prediction_col' : 'predicted_col',\n", - " 'fn_tag' : 'development'},\n", - " inputs={'base_dataset' : predicted_train_path},\n", - " artifact_path = os.path.join(os.getcwd(), 'artifacts'))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Testing the function**\n", - "> Mark that we are testing the deployed function - concept_drift_streaming" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'data': '{\"class\": 1.0, \"request\": {\"instances\": [{\"X1\": 0.0, \"X2\": 0.0, \"X3\": 0.0634475073, \"X4\": 0.4136568818}]}, \"resp\": [1], \"when\": \"2021-10-25 10:27:23.152584\", \"model\": \"sklearn.ensemble.RandomForestClassifier\"}'}" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import json\n", - "import datetime\n", - "\n", - "# Reshaping the data to V3IOStream format.\n", - "def restructure_stream_event(context, event):\n", - " instances = [dict()]\n", - " for key in predicted_test.keys():\n", - " if key not 
in ['when', 'class', 'model', 'worker', 'hostname', 'predicted_col']:\n", - " instances[0].update({key: event.pop(key)})\n", - " event['request'] = {'instances': instances}\n", - " event['resp'] = [int(event.pop('predicted_col'))]\n", - " event['when'] = datetime.datetime.strftime(datetime.datetime.now(), format=\"%Y-%m-%d %H:%M:%S.%f\")\n", - " event['model'] = 'sklearn.ensemble.RandomForestClassifier'\n", - " return event\n", - " \n", - " \n", - "records = json.loads(predicted_test.to_json(orient='records'))\n", - "records = [{'data': json.dumps(restructure_stream_event(context, record))} for record in records]\n", - "\n", - "# showing first record\n", - "records[0]" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "# Creating v3io client\n", - "v3io_client = v3io.dataplane.Client()\n", - "\n", - "# Pushing some undrifted data to the input stream\n", - "response = v3io_client.stream.put_records(container=container,\n", - " stream_path=base_input_stream, \n", - " records=records[4900:5100])" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'SequenceNumber': 200,\n", - " 'Data': 'eyJjbGFzcyI6IDAuMCwgInJlcXVlc3QiOiB7Imluc3RhbmNlcyI6IFt7IlgxIjogMC4wLCAiWDIiOiAwLjAsICJYMyI6IDAuMzMzMTYzNjk4OSwgIlg0IjogMC40MjE2NzY1Njg3fV19LCAicmVzcCI6IFsxXSwgIndoZW4iOiAiMjAyMS0xMC0yNSAxMDoyNzoyMy4yOTM3OTgiLCAibW9kZWwiOiAic2tsZWFybi5lbnNlbWJsZS5SYW5kb21Gb3Jlc3RDbGFzc2lmaWVyIn0=',\n", - " 'ArrivalTimeSec': 1635157644,\n", - " 'ArrivalTimeNSec': 395309631}" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Getting earliest location in the shard\n", - "location = json.loads(v3io_client.stream.seek(container=container,\n", - " stream_path=base_input_stream,\n", - " shard_id=0,\n", - " seek_type='EARLIEST').body)['Location']\n", - "# Getting records from input stream\n", - "response = 
v3io_client.stream.get_records(container=container,\n", - " stream_path=base_input_stream,\n", - " shard_id=0, location=location)\n", - "# Showing the last sequence that is written to the input stream\n", - "json.loads(response.body)['Records'][-1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Make sure some time has passed - the function needs to be triggered by the input stream, then it'll write to the output stream" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "# Getting earliest location in the shard\n", - "location = json.loads(v3io_client.stream.seek(container=container,\n", - " stream_path=base_output_stream,\n", - " shard_id=0,\n", - " seek_type='EARLIEST').body)['Location']\n", - "# Getting records from output stream\n", - "response = v3io_client.stream.get_records(container=container,\n", - " stream_path=base_output_stream,\n", - " shard_id=0, location=location)" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sequence number : 106, data : {'class': 0.0, 'request': {'instances': [{'X1': 0.0, 'X2': 0.0, 'X3': 0.9628473804, 'X4': 0.5792453402}]}, 'resp': [1], 'when': '2021-10-25 10:27:23.291145', 'model': 'sklearn.ensemble.RandomForestClassifier', 'ddm_warning_zone': 0, 'ddm_drift': 1, 'eddm_warning_zone': 0, 'eddm_drift': 0}\n", - "sequence number : 122, data : {'class': 0.0, 'request': {'instances': [{'X1': 0.0, 'X2': 0.0, 'X3': 0.4969765505, 'X4': 0.9784738351}]}, 'resp': [1], 'when': '2021-10-25 10:27:23.291558', 'model': 'sklearn.ensemble.RandomForestClassifier', 'ddm_warning_zone': 0, 'ddm_drift': 0, 'eddm_warning_zone': 0, 'eddm_drift': 1}\n" - ] - } - ], - "source": [ - "# Showing changed detected\n", - "import base64\n", - "for instance in json.loads(response.body)['Records']:\n", - " seq = instance[\"SequenceNumber\"]\n", - " data = 
json.loads(base64.b64decode(instance['Data']))\n", - " if(data['ddm_drift']==1 or data['eddm_drift']==1):\n", - " print(f'sequence number : {seq}, data : {data}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can see that the system detected a change in the 106 instance, which is 10006 instance in the real dataset -
\n", - "5000 first instances are for train, we started pushing data from the 4900 instance of the test dataset (9900 from the real dataset), and we pushed only 200 instances.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "[Back to the top](#Concept-Drift---Deployer)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python [conda env:root] *", - "language": "python", - "name": "conda-root-py" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/concept_drift/concept_drift.py b/concept_drift/concept_drift.py deleted file mode 100644 index 03355d3b5..000000000 --- a/concept_drift/concept_drift.py +++ /dev/null @@ -1,147 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -# Generated by nuclio.export.NuclioExporter - -import skmultiflow.drift_detection # We will grab our PH, DDM, EDDM algorithms from here -import numpy as np -import pandas as pd -import os -from cloudpickle import dumps, load, dump - -from nuclio.triggers import V3IOStreamTrigger -from mlrun import DataItem, import_function, mlconf, MLClientCtx, mount_v3io - -import random - - -def concept_drift_deployer( - context: MLClientCtx, - base_dataset: DataItem, - input_stream: str, - consumer_group: str, - output_stream: str, - output_tsdb: str, - tsdb_batch_size: int, - callbacks: list, - models: list = ["ddm", "eddm", "pagehinkley"], - models_dest="models", - pagehinkley_threshold: float = 10, - ddm_warning_level: float = 2, - ddm_out_control_level: float = 3, - label_col="label", - prediction_col="prediction", - hub_url: str = mlconf.hub_url, - fn_tag: str = "master", -): - """Deploy a streaming Concept Drift detector on a labeled stream - This function is the Deployment step for the Streaming Concept Drift Detector. - It will load the selected drift detectors and initialize them with the - base_dataset's statistics. Then it will deploy the concept_drift_streaming - function and pass the models to it for streaming concept-drift detection on top - of a labeled stream. - - :param context: MLRun context - :param base_dataset: Dataset containing label_col and prediction_col to initialize the detectors - :param input_stream: labeled stream to track. - Should contain label_col and prediction_col - :param output_stream: Output stream to push the detector's alerts - :param output_tsdb: Output TSDB table to allow analysis and display - :param tsdb_batch_size: Batch size of alerts to buffer before pushing to the TSDB - :param callbacks: Additional rest endpoints to send the alert data to - :param models: List of the detectors to deploy - Defaults to ['ddm', 'eddm', 'pagehinkley']. 
- :param models_dest: Location for saving the detectors - Defaults to 'models' (in relation to artifact_path). - :param pagehinkley_threshold: Drift level threshold for PH detector Defaults to 10. - :param ddm_warning_level: Warning level alert for DDM detector Defaults to 2. - :param ddm_out_control_level: Drift level alert for DDM detector Defaults to 3. - :param label_col: Label column to be used on base_dataset and input_stream - Defaults to 'label'. - :param prediction_col: Prediction column to be used on base_dataset and input_stream - Defaults to 'prediction'. - :param hub_url: hub_url in case the default is not used, concept_drift_streaming will be loaded - by this url - Defaults to mlconf.hub_url. - :param fn_tag: hub tag to use - Defaults to 'master' - """ - - mlconf.dbpath = mlconf.dbpath or "http://mlrun-api:8080" - mlconf.hub_url = hub_url - fn = import_function(url=f"hub://concept_drift_streaming:{fn_tag}") - - context.logger.info("Loading base dataset") - base_df = base_dataset.as_df() - error_stream = np.where( - base_df[prediction_col].values == base_df[label_col].values, 0, 1 - ) - - context.logger.info("Creating models") - models = [ - model.strip() - for model in os.getenv("models", "pagehinkley, ddm, eddm").split(",") - ] - models = { - "eddm": skmultiflow.drift_detection.EDDM(), - "pagehinkley": skmultiflow.drift_detection.PageHinkley( - min_instances=len(error_stream), threshold=pagehinkley_threshold - ), - "ddm": skmultiflow.drift_detection.DDM( - min_num_instances=len(error_stream), - warning_level=ddm_warning_level, - out_control_level=ddm_out_control_level, - ), - } - - context.logger.info("Streaming data to models") - for i in range(len(error_stream)): - for model_name, model in models.items(): - model.add_element(error_stream[i]) - - context.logger.info("Logging ready models") - for name, model in models.items(): - data = dumps(model) - model_file = f"{name}.pkl" - context.log_model( - f"{name}_concept_drift", - body=data, - 
labels={"framework": "skmultiflow", "workflow": "concept-drift"}, - model_file=model_file, - model_dir=models_dest, - tag="latest", - ) - fn.set_envs( - { - f"{name}_model_path": os.path.join( - context.artifact_path, models_dest, model_file - ) - } - ) - - context.logger.info("Deploying Concept Drift Streaming function") - fn.set_envs( - { - "label_col": label_col, - "prediction_col": prediction_col, - "drift_stream": output_stream, - "tsdb_table": output_tsdb, - "pagehinkley_threshold": pagehinkley_threshold, - "ddm_warning_level": ddm_warning_level, - "ddm_out_control": ddm_out_control_level, - } - ) - fn.add_v3io_stream_trigger(stream_path = input_stream, name = 'stream', group = consumer_group) - fn.apply(mount_v3io()) - fn.deploy(project=context.project) diff --git a/concept_drift/function.yaml b/concept_drift/function.yaml deleted file mode 100644 index 071111c78..000000000 --- a/concept_drift/function.yaml +++ /dev/null @@ -1,112 +0,0 @@ -kind: job -metadata: - name: concept-drift - tag: '' - hash: 935da41196802875e19948974f32b6f00c29feb2 - project: '' - labels: - author: orz - framework: sklearn - categories: - - machine-learning - - model-serving -spec: - command: '' - args: [] - image: mlrun/ml-models - env: [] - default_handler: concept_drift_deployer - entry_points: - concept_drift_deployer: - name: concept_drift_deployer - doc: "Deploy a streaming Concept Drift detector on a labeled stream\n This\ - \ function is the Deployment step for the Streaming Concept Drift Detector.\n\ - \ It will load the selected drift detectors and initialize them with the\n\ - \ base_dataset's statistics. Then it will deploy the concept_drift_streaming\n\ - \ function and pass the models to it for streaming concept-drift detection\ - \ on top\n of a labeled stream." 
- parameters: - - name: context - type: MLClientCtx - doc: MLRun context - default: '' - - name: base_dataset - type: DataItem - doc: Dataset containing label_col and prediction_col to initialize the detectors - default: '' - - name: input_stream - type: str - doc: labeled stream to track. Should contain label_col and prediction_col - default: '' - - name: consumer_group - type: str - default: '' - - name: output_stream - type: str - doc: Output stream to push the detector's alerts - default: '' - - name: output_tsdb - type: str - doc: Output TSDB table to allow analysis and display - default: '' - - name: tsdb_batch_size - type: int - doc: Batch size of alerts to buffer before pushing to the TSDB - default: '' - - name: callbacks - type: list - doc: Additional rest endpoints to send the alert data to - default: '' - - name: models - type: list - doc: List of the detectors to deploy Defaults to ['ddm', 'eddm', 'pagehinkley']. - default: - - ddm - - eddm - - pagehinkley - - name: models_dest - doc: Location for saving the detectors Defaults to 'models' (in relation to - artifact_path). - default: models - - name: pagehinkley_threshold - type: float - doc: Drift level threshold for PH detector Defaults to 10. - default: 10 - - name: ddm_warning_level - type: float - doc: Warning level alert for DDM detector Defaults to 2. - default: 2 - - name: ddm_out_control_level - type: float - doc: Drift level alert for DDM detector Defaults to 3. - default: 3 - - name: label_col - doc: Label column to be used on base_dataset and input_stream Defaults to - 'label'. - default: label - - name: prediction_col - doc: Prediction column to be used on base_dataset and input_stream Defaults - to 'prediction'. - default: prediction - - name: hub_url - type: str - doc: hub_url in case the default is not used, concept_drift_streaming will - be loaded by this url Defaults to mlconf.hub_url. 
- default: <_ast.Name object at 0x7f48eda946d0> - - name: fn_tag - type: str - doc: hub tag to use Defaults to 'master' - default: master - outputs: - - default: '' - lineno: 15 - description: Deploy a streaming Concept Drift detector on a labeled stream - build: - functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHNrbXVsdGlmbG93LmRyaWZ0X2RldGVjdGlvbiAgIyBXZSB3aWxsIGdyYWIgb3VyIFBILCBERE0sIEVERE0gYWxnb3JpdGhtcyBmcm9tIGhlcmUKaW1wb3J0IG51bXB5IGFzIG5wCmltcG9ydCBwYW5kYXMgYXMgcGQKaW1wb3J0IG9zCmZyb20gY2xvdWRwaWNrbGUgaW1wb3J0IGR1bXBzLCBsb2FkLCBkdW1wCgpmcm9tIG51Y2xpby50cmlnZ2VycyBpbXBvcnQgVjNJT1N0cmVhbVRyaWdnZXIKZnJvbSBtbHJ1biBpbXBvcnQgRGF0YUl0ZW0sIGltcG9ydF9mdW5jdGlvbiwgbWxjb25mLCBNTENsaWVudEN0eCwgbW91bnRfdjNpbwoKaW1wb3J0IHJhbmRvbQoKCmRlZiBjb25jZXB0X2RyaWZ0X2RlcGxveWVyKAogICAgY29udGV4dDogTUxDbGllbnRDdHgsCiAgICBiYXNlX2RhdGFzZXQ6IERhdGFJdGVtLAogICAgaW5wdXRfc3RyZWFtOiBzdHIsCiAgICBjb25zdW1lcl9ncm91cDogc3RyLAogICAgb3V0cHV0X3N0cmVhbTogc3RyLAogICAgb3V0cHV0X3RzZGI6IHN0ciwKICAgIHRzZGJfYmF0Y2hfc2l6ZTogaW50LAogICAgY2FsbGJhY2tzOiBsaXN0LAogICAgbW9kZWxzOiBsaXN0ID0gWyJkZG0iLCAiZWRkbSIsICJwYWdlaGlua2xleSJdLAogICAgbW9kZWxzX2Rlc3Q9Im1vZGVscyIsCiAgICBwYWdlaGlua2xleV90aHJlc2hvbGQ6IGZsb2F0ID0gMTAsCiAgICBkZG1fd2FybmluZ19sZXZlbDogZmxvYXQgPSAyLAogICAgZGRtX291dF9jb250cm9sX2xldmVsOiBmbG9hdCA9IDMsCiAgICBsYWJlbF9jb2w9ImxhYmVsIiwKICAgIHByZWRpY3Rpb25fY29sPSJwcmVkaWN0aW9uIiwKICAgIGh1Yl91cmw6IHN0ciA9IG1sY29uZi5odWJfdXJsLAogICAgZm5fdGFnOiBzdHIgPSAibWFzdGVyIiwKKToKICAgICIiIkRlcGxveSBhIHN0cmVhbWluZyBDb25jZXB0IERyaWZ0IGRldGVjdG9yIG9uIGEgbGFiZWxlZCBzdHJlYW0KICAgICAgIFRoaXMgZnVuY3Rpb24gaXMgdGhlIERlcGxveW1lbnQgc3RlcCBmb3IgdGhlIFN0cmVhbWluZyBDb25jZXB0IERyaWZ0IERldGVjdG9yLgogICAgICAgSXQgd2lsbCBsb2FkIHRoZSBzZWxlY3RlZCBkcmlmdCBkZXRlY3RvcnMgYW5kIGluaXRpYWxpemUgdGhlbSB3aXRoIHRoZQogICAgICAgYmFzZV9kYXRhc2V0J3Mgc3RhdGlzdGljcy4gIFRoZW4gaXQgd2lsbCBkZXBsb3kgdGhlIGNvbmNlcHRfZHJpZnRfc3RyZWFtaW5nCiAgICAgICBmdW5jdGlvbiBhbmQgcGFzcyB0aGUgbW9kZWxzIHRvIGl0IGZvciBzdHJlYW1pbmcgY29uY2VwdC1kcmlmdC
BkZXRlY3Rpb24gb24gdG9wCiAgICAgICBvZiBhIGxhYmVsZWQgc3RyZWFtLgoKICAgIDpwYXJhbSBjb250ZXh0OiAgICAgICAgIE1MUnVuIGNvbnRleHQKICAgIDpwYXJhbSBiYXNlX2RhdGFzZXQ6ICAgIERhdGFzZXQgY29udGFpbmluZyBsYWJlbF9jb2wgYW5kIHByZWRpY3Rpb25fY29sIHRvIGluaXRpYWxpemUgdGhlIGRldGVjdG9ycwogICAgOnBhcmFtIGlucHV0X3N0cmVhbTogICAgbGFiZWxlZCBzdHJlYW0gdG8gdHJhY2suCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBTaG91bGQgY29udGFpbiBsYWJlbF9jb2wgYW5kIHByZWRpY3Rpb25fY29sCiAgICA6cGFyYW0gb3V0cHV0X3N0cmVhbTogICBPdXRwdXQgc3RyZWFtIHRvIHB1c2ggdGhlIGRldGVjdG9yJ3MgYWxlcnRzCiAgICA6cGFyYW0gb3V0cHV0X3RzZGI6ICAgICBPdXRwdXQgVFNEQiB0YWJsZSB0byBhbGxvdyBhbmFseXNpcyBhbmQgZGlzcGxheQogICAgOnBhcmFtIHRzZGJfYmF0Y2hfc2l6ZTogQmF0Y2ggc2l6ZSBvZiBhbGVydHMgdG8gYnVmZmVyIGJlZm9yZSBwdXNoaW5nIHRvIHRoZSBUU0RCCiAgICA6cGFyYW0gY2FsbGJhY2tzOiAgICAgICBBZGRpdGlvbmFsIHJlc3QgZW5kcG9pbnRzIHRvIHNlbmQgdGhlIGFsZXJ0IGRhdGEgdG8KICAgIDpwYXJhbSBtb2RlbHM6ICAgICAgICAgIExpc3Qgb2YgdGhlIGRldGVjdG9ycyB0byBkZXBsb3kKICAgICAgICAgICAgICAgICAgICAgICAgICAgIERlZmF1bHRzIHRvIFsnZGRtJywgJ2VkZG0nLCAncGFnZWhpbmtsZXknXS4KICAgIDpwYXJhbSBtb2RlbHNfZGVzdDogICAgIExvY2F0aW9uIGZvciBzYXZpbmcgdGhlIGRldGVjdG9ycwogICAgICAgICAgICAgICAgICAgICAgICAgICAgRGVmYXVsdHMgdG8gJ21vZGVscycgKGluIHJlbGF0aW9uIHRvIGFydGlmYWN0X3BhdGgpLgogICAgOnBhcmFtIHBhZ2VoaW5rbGV5X3RocmVzaG9sZDogIERyaWZ0IGxldmVsIHRocmVzaG9sZCBmb3IgUEggZGV0ZWN0b3IgRGVmYXVsdHMgdG8gMTAuCiAgICA6cGFyYW0gZGRtX3dhcm5pbmdfbGV2ZWw6ICAgICAgV2FybmluZyBsZXZlbCBhbGVydCBmb3IgRERNIGRldGVjdG9yIERlZmF1bHRzIHRvIDIuCiAgICA6cGFyYW0gZGRtX291dF9jb250cm9sX2xldmVsOiAgRHJpZnQgbGV2ZWwgYWxlcnQgZm9yIERETSBkZXRlY3RvciBEZWZhdWx0cyB0byAzLgogICAgOnBhcmFtIGxhYmVsX2NvbDogICAgICAgTGFiZWwgY29sdW1uIHRvIGJlIHVzZWQgb24gYmFzZV9kYXRhc2V0IGFuZCBpbnB1dF9zdHJlYW0KICAgICAgICAgICAgICAgICAgICAgICAgICAgIERlZmF1bHRzIHRvICdsYWJlbCcuCiAgICA6cGFyYW0gcHJlZGljdGlvbl9jb2w6ICBQcmVkaWN0aW9uIGNvbHVtbiB0byBiZSB1c2VkIG9uIGJhc2VfZGF0YXNldCBhbmQgaW5wdXRfc3RyZWFtCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBEZWZhdWx0cyB0byAncHJlZGljdGlvbicuCiAgICA6cGFyYW0gaHViX3VybDogICAgICAgICBodWJfdXJsIGluIGNhc2UgdGhlIGRlZmF1bH
QgaXMgbm90IHVzZWQsIGNvbmNlcHRfZHJpZnRfc3RyZWFtaW5nIHdpbGwgYmUgbG9hZGVkCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBieSB0aGlzIHVybAogICAgICAgICAgICAgICAgICAgICAgICAgICAgRGVmYXVsdHMgdG8gbWxjb25mLmh1Yl91cmwuCiAgICA6cGFyYW0gZm5fdGFnOiAgICAgICAgICBodWIgdGFnIHRvIHVzZQogICAgICAgICAgICAgICAgICAgICAgICAgICAgRGVmYXVsdHMgdG8gJ21hc3RlcicKICAgICIiIgoKICAgIG1sY29uZi5kYnBhdGggPSBtbGNvbmYuZGJwYXRoIG9yICJodHRwOi8vbWxydW4tYXBpOjgwODAiCiAgICBtbGNvbmYuaHViX3VybCA9IGh1Yl91cmwKICAgIGZuID0gaW1wb3J0X2Z1bmN0aW9uKHVybD1mImh1YjovL2NvbmNlcHRfZHJpZnRfc3RyZWFtaW5nOntmbl90YWd9IikKCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKCJMb2FkaW5nIGJhc2UgZGF0YXNldCIpCiAgICBiYXNlX2RmID0gYmFzZV9kYXRhc2V0LmFzX2RmKCkKICAgIGVycm9yX3N0cmVhbSA9IG5wLndoZXJlKAogICAgICAgIGJhc2VfZGZbcHJlZGljdGlvbl9jb2xdLnZhbHVlcyA9PSBiYXNlX2RmW2xhYmVsX2NvbF0udmFsdWVzLCAwLCAxCiAgICApCgogICAgY29udGV4dC5sb2dnZXIuaW5mbygiQ3JlYXRpbmcgbW9kZWxzIikKICAgIG1vZGVscyA9IFsKICAgICAgICBtb2RlbC5zdHJpcCgpCiAgICAgICAgZm9yIG1vZGVsIGluIG9zLmdldGVudigibW9kZWxzIiwgInBhZ2VoaW5rbGV5LCBkZG0sIGVkZG0iKS5zcGxpdCgiLCIpCiAgICBdCiAgICBtb2RlbHMgPSB7CiAgICAgICAgImVkZG0iOiBza211bHRpZmxvdy5kcmlmdF9kZXRlY3Rpb24uRURETSgpLAogICAgICAgICJwYWdlaGlua2xleSI6IHNrbXVsdGlmbG93LmRyaWZ0X2RldGVjdGlvbi5QYWdlSGlua2xleSgKICAgICAgICAgICAgbWluX2luc3RhbmNlcz1sZW4oZXJyb3Jfc3RyZWFtKSwgdGhyZXNob2xkPXBhZ2VoaW5rbGV5X3RocmVzaG9sZAogICAgICAgICksCiAgICAgICAgImRkbSI6IHNrbXVsdGlmbG93LmRyaWZ0X2RldGVjdGlvbi5ERE0oCiAgICAgICAgICAgIG1pbl9udW1faW5zdGFuY2VzPWxlbihlcnJvcl9zdHJlYW0pLAogICAgICAgICAgICB3YXJuaW5nX2xldmVsPWRkbV93YXJuaW5nX2xldmVsLAogICAgICAgICAgICBvdXRfY29udHJvbF9sZXZlbD1kZG1fb3V0X2NvbnRyb2xfbGV2ZWwsCiAgICAgICAgKSwKICAgIH0KCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKCJTdHJlYW1pbmcgZGF0YSB0byBtb2RlbHMiKQogICAgZm9yIGkgaW4gcmFuZ2UobGVuKGVycm9yX3N0cmVhbSkpOgogICAgICAgIGZvciBtb2RlbF9uYW1lLCBtb2RlbCBpbiBtb2RlbHMuaXRlbXMoKToKICAgICAgICAgICAgbW9kZWwuYWRkX2VsZW1lbnQoZXJyb3Jfc3RyZWFtW2ldKQoKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oIkxvZ2dpbmcgcmVhZHkgbW9kZWxzIikKICAgIGZvciBuYW1lLCBtb2RlbCBpbiBtb2RlbHMuaXRlbXMoKToKICAgICAgICBkYXRhID0gZHVtcHMobW
9kZWwpCiAgICAgICAgbW9kZWxfZmlsZSA9IGYie25hbWV9LnBrbCIKICAgICAgICBjb250ZXh0LmxvZ19tb2RlbCgKICAgICAgICAgICAgZiJ7bmFtZX1fY29uY2VwdF9kcmlmdCIsCiAgICAgICAgICAgIGJvZHk9ZGF0YSwKICAgICAgICAgICAgbGFiZWxzPXsiZnJhbWV3b3JrIjogInNrbXVsdGlmbG93IiwgIndvcmtmbG93IjogImNvbmNlcHQtZHJpZnQifSwKICAgICAgICAgICAgbW9kZWxfZmlsZT1tb2RlbF9maWxlLAogICAgICAgICAgICBtb2RlbF9kaXI9bW9kZWxzX2Rlc3QsCiAgICAgICAgICAgIHRhZz0ibGF0ZXN0IiwKICAgICAgICApCiAgICAgICAgZm4uc2V0X2VudnMoCiAgICAgICAgICAgIHsKICAgICAgICAgICAgICAgIGYie25hbWV9X21vZGVsX3BhdGgiOiBvcy5wYXRoLmpvaW4oCiAgICAgICAgICAgICAgICAgICAgY29udGV4dC5hcnRpZmFjdF9wYXRoLCBtb2RlbHNfZGVzdCwgbW9kZWxfZmlsZQogICAgICAgICAgICAgICAgKQogICAgICAgICAgICB9CiAgICAgICAgKQoKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oIkRlcGxveWluZyBDb25jZXB0IERyaWZ0IFN0cmVhbWluZyBmdW5jdGlvbiIpCiAgICBmbi5zZXRfZW52cygKICAgICAgICB7CiAgICAgICAgICAgICJsYWJlbF9jb2wiOiBsYWJlbF9jb2wsCiAgICAgICAgICAgICJwcmVkaWN0aW9uX2NvbCI6IHByZWRpY3Rpb25fY29sLAogICAgICAgICAgICAiZHJpZnRfc3RyZWFtIjogb3V0cHV0X3N0cmVhbSwKICAgICAgICAgICAgInRzZGJfdGFibGUiOiBvdXRwdXRfdHNkYiwKICAgICAgICAgICAgInBhZ2VoaW5rbGV5X3RocmVzaG9sZCI6IHBhZ2VoaW5rbGV5X3RocmVzaG9sZCwKICAgICAgICAgICAgImRkbV93YXJuaW5nX2xldmVsIjogZGRtX3dhcm5pbmdfbGV2ZWwsCiAgICAgICAgICAgICJkZG1fb3V0X2NvbnRyb2wiOiBkZG1fb3V0X2NvbnRyb2xfbGV2ZWwsCiAgICAgICAgfQogICAgKQogICAgZm4uYWRkX3YzaW9fc3RyZWFtX3RyaWdnZXIoc3RyZWFtX3BhdGggPSBpbnB1dF9zdHJlYW0sIG5hbWUgPSAnc3RyZWFtJywgZ3JvdXAgPSBjb25zdW1lcl9ncm91cCkKICAgIGZuLmFwcGx5KG1vdW50X3YzaW8oKSkKICAgIGZuLmRlcGxveShwcm9qZWN0PWNvbnRleHQucHJvamVjdCkK - commands: - - python -m pip install scikit-multiflow - code_origin: https://github.com/daniels290813/functions.git#82bbfde4afa2eae77059e05c70bbebacf530fd0d:/User/test/functions/concept_drift/concept_drift.py - origin_filename: /User/test/functions/concept_drift/concept_drift.py - disable_auto_mount: false - affinity: null -verbose: false diff --git a/concept_drift/item.yaml b/concept_drift/item.yaml deleted file mode 100644 index 2ee37e386..000000000 --- a/concept_drift/item.yaml +++ /dev/null @@ 
-1,27 +0,0 @@ -apiVersion: v1 -categories: -- machine-learning -- model-serving -description: Deploy a streaming Concept Drift detector on a labeled stream -doc: '' -example: concept_drift.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: orz - framework: sklearn -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: concept-drift -platformVersion: 3.5.0 -spec: - filename: concept_drift.py - handler: concept_drift_deployer - image: mlrun/ml-models - kind: job - requirements: - - scikit-multiflow -url: '' -version: 1.1.0 diff --git a/concept_drift/requirements.txt b/concept_drift/requirements.txt deleted file mode 100644 index fa0fddd88..000000000 --- a/concept_drift/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -skmultiflow \ No newline at end of file diff --git a/concept_drift_streaming/concept_drift_streaming.ipynb b/concept_drift_streaming/concept_drift_streaming.ipynb deleted file mode 100644 index b916cb7a2..000000000 --- a/concept_drift_streaming/concept_drift_streaming.ipynb +++ /dev/null @@ -1,480 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Concept Drift Streaming" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "from pprint import pprint" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "%%nuclio cmd -c\n", - "python -m pip install scikit-multiflow==0.4.1\n", - "python -m pip install v3io_frames" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'nuclio'\n", - "%nuclio: setting spec.build.baseImage to 'mlrun/ml-models'\n" - ] - } - ], - "source": [ - "# Define function spec\n", - 
"%nuclio config kind = \"nuclio\"\n", - "%nuclio config spec.build.baseImage = \"mlrun/ml-models\"\n", - "\n", - "# Add V3IO Mount\n", - "# %nuclio env %v3io" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "env = {'label_col': 'resp',\n", - " 'prediction_col': 'prediction',\n", - " 'drift_stream': '/bigdata/network-operations/drift_stream',\n", - " 'tsdb_table': 'network-operations/drift_tsdb',\n", - " 'pagehinkley_threshold': 10,\n", - " 'models': ['pagehinkley', 'ddm', 'eddm'],\n", - " 'window_size': 10}\n", - "config = {'kind': 'nuclio',\n", - " 'spec.build.baseImage': 'mlrun/ml-models'}\n", - "cmd = ['python -m pip install scikit-multiflow',\n", - " 'python -m pip install v3io_frames']\n", - "v3io = True\n", - "config = nuclio.ConfigSpec(env=env,\n", - " config=config,\n", - " cmd=cmd,\n", - " v3io=v3io)" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: start-code" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "import skmultiflow.drift_detection\n", - "import numpy as np\n", - "import pandas as pd\n", - "import os\n", - "import json\n", - "import v3io.dataplane\n", - "import v3io_frames as v3f\n", - "import requests\n", - "from cloudpickle import load\n", - "\n", - "# For testing\n", - "import random" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "def split_path(mntpath=''):\n", - " if mntpath[0] == '/':\n", - " mntpath = mntpath[1:]\n", - " paths = mntpath.split('/')\n", - " container = paths[0]\n", - " subpath = ''\n", - " if len(paths) > 1:\n", - " subpath = mntpath[len(container):]\n", - " return container, subpath\n", - "\n", - "\n", - "def create_stream(context, path, shards=1):\n", - " # create a stream w/8 shards\n", - " container, stream_path = split_path(path)\n", 
- " context.logger.info(f'Creating stream in Container: {container} & Path {stream_path}')\n", - " response = context.v3io_client.create_stream(container=container,\n", - " path=stream_path, \n", - " shard_count=shards,\n", - " raise_for_status=v3io.dataplane.RaiseForStatus.never)\n", - " response.raise_for_status([409, 204])\n", - " \n", - " \n", - "def push_to_stream(context, stream_path, data):\n", - " records = [{'data': json.dumps(rec)} for rec in data]\n", - " container, stream_path = split_path(stream_path)\n", - " response = context.v3io_client.put_records(container=container,\n", - " path=stream_path, \n", - " records=records)\n", - "\n", - "\n", - "def construct_record(record):\n", - " label_col = os.getenv('label_col', 'label')\n", - " prediction_col = os.getenv('prediction_col', 'prediction')\n", - " res = dict([(k, record[k]) for k in ['when', 'class', 'model', 'resp', 'request']])\n", - " res['feature_vector'] = res.pop('request')['instances'][0]\n", - " res['timestamp'] = res.pop('when')\n", - " res['prediction'] = res['resp'][0]\n", - " return res" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "def init_context(context):\n", - " # create a v3io context object\n", - " v3io_client = v3io.dataplane.Client()\n", - " setattr(context, \"v3io_client\", v3io_client)\n", - " \n", - " # Setup windowing for TSDB writer\n", - " v3f_client = v3f.Client('framesd:8081', container='bigdata')\n", - " setattr(context, \"v3f\", v3f_client)\n", - " window = []\n", - " setattr(context, 'window', window)\n", - " setattr(context, 'window_size', int(os.getenv('window_size', 10)))\n", - " setattr(context, 'tsdb_table', os.getenv('tsdb_table', 'concept_drift_tsdb_1'))\n", - " try:\n", - " context.v3f.create('tsdb', context.tsdb_table, rate='1/s', if_exists=1)\n", - " except Exception as e:\n", - " context.logger.info(f'Creating context with rate= faile for {e}')\n", - " context.v3f.create('tsdb', 
context.tsdb_table, attrs={'rate': '1/s'}, if_exists=1)\n", - " \n", - " # Setup callbacks\n", - " callbacks = [callback.strip() for callback in os.getenv('callbacks', '').split(',')]\n", - " setattr(context, 'callbacks', callbacks)\n", - " \n", - " # Setup drift stream\n", - " setattr(context, 'drift_stream', os.getenv('drift_stream', '/bigdata/drift_stream'))\n", - " try:\n", - " create_stream(context, context.drift_stream, int(os.getenv('drift_stream_shards', 1)))\n", - " except:\n", - " context.logger.info(f'{context.drift_stream} already exists')\n", - " \n", - " # Load models\n", - " models = {}\n", - " model_types = ['pagehinkely', 'ddm', 'eddm']\n", - " path_suffix = '_model_path'\n", - " for model in model_types:\n", - " model_env = f'{model}{path_suffix}'\n", - " if model_env in os.environ:\n", - " with open(os.environ[model_env], 'rb') as f:\n", - " models[model] = load(f)\n", - " setattr(context, 'models', models)\n", - " \n", - " # Columns to check\n", - " setattr(context, 'label_col', os.getenv('label_col', 'label'))\n", - " setattr(context, 'prediction_col', os.getenv('prediction_col', 'prediction'))" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "def handler(context, event):\n", - " # Construct event\n", - " context.logger.info(f'event: {event.body}')\n", - " full_event = json.loads(event.body)\n", - " record = construct_record(full_event)\n", - " \n", - " # Is our prediction wrong?\n", - " is_error = record[context.label_col] != record[context.prediction_col]\n", - " context.logger.info(f'Adding {is_error}')\n", - " \n", - " # Process the {is_error} element with our algorithms\n", - " for name, model in context.models.items():\n", - " # Add element\n", - " results = {'timestamp': record['timestamp']}\n", - " results['algorithm'] = name\n", - " model.add_element(is_error)\n", - " \n", - " # Detect warning zone (if applicable to the algorithm)\n", - " if hasattr(model, 
'detected_warning_zone') and model.detected_warning_zone():\n", - " context.logger.info(f'{name}\\tWarning zone detected')\n", - " results['warning_zone'] = 1\n", - " full_event[f'{name}_warning_zone'] = 1\n", - " else:\n", - " results['warning_zone'] = 0\n", - " full_event[f'{name}_warning_zone'] = 0\n", - " \n", - " # Detect drift\n", - " if model.detected_change():\n", - " context.logger.info('Change Detected')\n", - " results['change_detected'] = 1\n", - " full_event[f'{name}_drift'] = 1\n", - " else:\n", - " results['change_detected'] = 0\n", - " full_event[f'{name}_drift'] = 0\n", - " context.window.append(results)\n", - " \n", - " # Return results\n", - " # Write to stream\n", - " push_to_stream(context, context.drift_stream, [full_event])\n", - " \n", - " # Add to callbacks\n", - " if context.callbacks != ['']:\n", - " for callback in context.callbacks:\n", - " requests.post(url=callback,\n", - " json=full_event)\n", - " \n", - " if (len(context.window) / len(context.models)) >= context.window_size:\n", - " df = pd.DataFrame(context.window)\n", - " df['timestamp'] = pd.to_datetime(df['timestamp'])\n", - " df = df.set_index(['timestamp', 'algorithm'])\n", - " context.v3f.write('tsdb', context.tsdb_table, df)\n", - " context.window = []" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "init_context(context)\n", - "event = nuclio.Event(body=json.dumps({'prediction': 0,\n", - " 'when': 'now',\n", - " 'class': 'ClassModel', \n", - " 'model': 'tester_v1', \n", - " 'resp': [0], \n", - " 'request': {'instances': [[1, 1.2, 3]]}}))\n", - "out = handler(context, event)\n", - "out" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Cluster" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%nuclio deploy -n network-operations-concept-drift -p network-operations" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Save function yaml" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "from os import path\n", - "from mlrun import run_local, NewTask, mlconf, import_function, mount_v3io, code_to_function, get_run_db\n", - "mlconf.dbpath = mlconf.dbpath or 'http://mlrun-api:8080'" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-07-14 13:49:22,720 function spec saved to path: /User/functions/concept_drift_streaming/function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# create job function object from notebook code\n", - "fn = code_to_function(\"concept_drift_streaming\", kind='nuclio')\n", - "\n", - "# add metadata (for templates and reuse)\n", - "fn.spec.default_handler = \"handler\"\n", - "fn.spec.description = \"Deploy a streaming Concept Drift detector on a labeled stream. 
the nuclio part of the concept_drift function\"\n", - "fn.metadata.categories = [\"ml\", \"serve\"]\n", - "fn.metadata.labels = {\"author\": \"orz\", \"framework\": \"sklearn\"}\n", - "fn.export(\"/User/functions/concept_drift_streaming/function.yaml\")" - ] - }, - { - "cell_type": "code", - "execution_count": 120, - "metadata": {}, - "outputs": [], - "source": [ - "stream_trigger = nuclio.triggers.V3IOStreamTrigger(url='/bigdata/network-operations/inference_stream@cd2')" - ] - }, - { - "cell_type": "code", - "execution_count": 121, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 121, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.add_trigger('labeled_stream', stream_trigger)" - ] - }, - { - "cell_type": "code", - "execution_count": 122, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 122, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.apply(mount_v3io()).with_v3io()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Stream testing" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [], - "source": [ - "fn = import_function('./function.yaml')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fn.deploy(project='network-operations')" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - 
"nbformat": 4, - "nbformat_minor": 4 -} diff --git a/concept_drift_streaming/concept_drift_streaming.py b/concept_drift_streaming/concept_drift_streaming.py deleted file mode 100644 index ebcbf8a1b..000000000 --- a/concept_drift_streaming/concept_drift_streaming.py +++ /dev/null @@ -1,157 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -# Generated by nuclio.export.NuclioExporter - -import skmultiflow.drift_detection -import numpy as np -import pandas as pd -import os -import json -import v3io.dataplane -import v3io_frames as v3f -import requests -from cloudpickle import load - -import random - - -def split_path(mntpath=""): - if mntpath[0] == "/": - mntpath = mntpath[1:] - paths = mntpath.split("/") - container = paths[0] - subpath = "" - if len(paths) > 1: - subpath = mntpath[len(container) :] - return container, subpath - - -def create_stream(context, path, shards=1): - container, stream_path = split_path(path) - context.logger.info( - f"Creating stream in Container: {container} & Path {stream_path}" - ) - response = context.v3io_client.create_stream( - container=container, - path=stream_path, - shard_count=shards, - raise_for_status=v3io.dataplane.RaiseForStatus.never, - ) - response.raise_for_status([409, 204]) - - -def push_to_stream(context, stream_path, data): - records = [{"data": json.dumps(rec)} for rec in data] - container, stream_path = split_path(stream_path) - response = context.v3io_client.put_records( - 
container=container, path=stream_path, records=records - ) - - -def construct_record(record): - label_col = os.getenv("label_col", "label") - prediction_col = os.getenv("prediction_col", "prediction") - res = dict([(k, record[k]) for k in ["when", "class", "model", "resp", "request"]]) - res["feature_vector"] = res.pop("request")["instances"][0] - res["timestamp"] = res.pop("when") - res[prediction_col] = res["resp"][0] - return res - - -def init_context(context): - v3io_client = v3io.dataplane.Client() - setattr(context, "v3io_client", v3io_client) - - v3f_client = v3f.Client("framesd:8081", container="bigdata") - setattr(context, "v3f", v3f_client) - window = [] - setattr(context, "window", window) - setattr(context, "window_size", int(os.getenv("window_size", 10))) - setattr(context, "tsdb_table", os.getenv("tsdb_table", "concept_drift_tsdb_1")) - try: - context.v3f.create("tsdb", context.tsdb_table, rate="1/s", if_exists=1) - except Exception as e: - context.logger.info(f"Creating context with rate= faile for {e}") - context.v3f.create( - "tsdb", context.tsdb_table, attrs={"rate": "1/s"}, if_exists=1 - ) - - callbacks = [callback.strip() for callback in os.getenv("callbacks", "").split(",")] - setattr(context, "callbacks", callbacks) - - setattr(context, "drift_stream", os.getenv("drift_stream", "/bigdata/drift_stream")) - try: - create_stream( - context, context.drift_stream, int(os.getenv("drift_stream_shards", 1)) - ) - except: - context.logger.info(f"{context.drift_stream} already exists") - - models = {} - model_types = ["pagehinkely", "ddm", "eddm"] - path_suffix = "_model_path" - for model in model_types: - model_env = f"{model}{path_suffix}" - if model_env in os.environ: - with open(os.environ[model_env], "rb") as f: - models[model] = load(f) - setattr(context, "models", models) - - setattr(context, "label_col", os.getenv("label_col", "label")) - setattr(context, "prediction_col", os.getenv("prediction_col", "prediction")) - - -def handler(context, 
event): - context.logger.info(f"event: {event.body}") - full_event = json.loads(event.body) - record = construct_record(full_event) - - is_error = record[context.label_col] != record[context.prediction_col] - context.logger.info(f"Adding {is_error}") - - for name, model in context.models.items(): - results = {"timestamp": record["timestamp"]} - results["algorithm"] = name - model.add_element(is_error) - - if hasattr(model, "detected_warning_zone") and model.detected_warning_zone(): - context.logger.info(f"{name}\tWarning zone detected") - results["warning_zone"] = 1 - full_event[f"{name}_warning_zone"] = 1 - else: - results["warning_zone"] = 0 - full_event[f"{name}_warning_zone"] = 0 - - if model.detected_change(): - context.logger.info("Change Detected") - results["change_detected"] = 1 - full_event[f"{name}_drift"] = 1 - else: - results["change_detected"] = 0 - full_event[f"{name}_drift"] = 0 - context.window.append(results) - - push_to_stream(context, context.drift_stream, [full_event]) - - if context.callbacks != [""]: - for callback in context.callbacks: - requests.post(url=callback, json=full_event) - - if (len(context.window) / len(context.models)) >= context.window_size: - df = pd.DataFrame(context.window) - df["timestamp"] = pd.to_datetime(df["timestamp"]) - df = df.set_index(["timestamp", "algorithm"]) - context.v3f.write("tsdb", context.tsdb_table, df) - context.window = [] diff --git a/concept_drift_streaming/function.yaml b/concept_drift_streaming/function.yaml deleted file mode 100644 index bf1171680..000000000 --- a/concept_drift_streaming/function.yaml +++ /dev/null @@ -1,48 +0,0 @@ -kind: remote -metadata: - name: concept-drift-streaming - tag: '' - hash: dc41ff41149be69f19b91a6d78a06571937063ae - project: '' - labels: - author: orz - framework: sklearn - categories: - - machine-learning - - monitoring -spec: - command: '' - args: [] - image: mlrun/ml-models - description: Deploy a streaming Concept Drift detector on a labeled stream. 
the - nuclio part of the concept_drift function - min_replicas: 1 - max_replicas: 4 - env: [] - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - name: concept-drift-streaming - labels: {} - annotations: - nuclio.io/generated_by: function generated from /User/test/functions/concept_drift_streaming/concept_drift_streaming.py - spec: - runtime: python:3.9 - handler: concept_drift_streaming:handler - env: [] - volumes: [] - build: - commands: [] - noBaseImagesPull: true - functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHNrbXVsdGlmbG93LmRyaWZ0X2RldGVjdGlvbgppbXBvcnQgbnVtcHkgYXMgbnAKaW1wb3J0IHBhbmRhcyBhcyBwZAppbXBvcnQgb3MKaW1wb3J0IGpzb24KaW1wb3J0IHYzaW8uZGF0YXBsYW5lCmltcG9ydCB2M2lvX2ZyYW1lcyBhcyB2M2YKaW1wb3J0IHJlcXVlc3RzCmZyb20gY2xvdWRwaWNrbGUgaW1wb3J0IGxvYWQKCmltcG9ydCByYW5kb20KCgpkZWYgc3BsaXRfcGF0aChtbnRwYXRoPSIiKToKICAgIGlmIG1udHBhdGhbMF0gPT0gIi8iOgogICAgICAgIG1udHBhdGggPSBtbnRwYXRoWzE6XQogICAgcGF0aHMgPSBtbnRwYXRoLnNwbGl0KCIvIikKICAgIGNvbnRhaW5lciA9IHBhdGhzWzBdCiAgICBzdWJwYXRoID0gIiIKICAgIGlmIGxlbihwYXRocykgPiAxOgogICAgICAgIHN1YnBhdGggPSBtbnRwYXRoW2xlbihjb250YWluZXIpIDpdCiAgICByZXR1cm4gY29udGFpbmVyLCBzdWJwYXRoCgoKZGVmIGNyZWF0ZV9zdHJlYW0oY29udGV4dCwgcGF0aCwgc2hhcmRzPTEpOgogICAgY29udGFpbmVyLCBzdHJlYW1fcGF0aCA9IHNwbGl0X3BhdGgocGF0aCkKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oCiAgICAgICAgZiJDcmVhdGluZyBzdHJlYW0gaW4gQ29udGFpbmVyOiB7Y29udGFpbmVyfSAmIFBhdGgge3N0cmVhbV9wYXRofSIKICAgICkKICAgIHJlc3BvbnNlID0gY29udGV4dC52M2lvX2NsaWVudC5jcmVhdGVfc3RyZWFtKAogICAgICAgIGNvbnRhaW5lcj1jb250YWluZXIsCiAgICAgICAgcGF0aD1zdHJlYW1fcGF0aCwKICAgICAgICBzaGFyZF9jb3VudD1zaGFyZHMsCiAgICAgICAgcmFpc2VfZm9yX3N0YXR1cz12M2lvLmRhdGFwbGFuZS5SYWlzZUZvclN0YXR1cy5uZXZlciwKICAgICkKICAgIHJlc3BvbnNlLnJhaXNlX2Zvcl9zdGF0dXMoWzQwOSwgMjA0XSkKCgpkZWYgcHVzaF90b19zdHJlYW0oY29udGV4dCwgc3RyZWFtX3BhdGgsIGRhdGEpOgogICAgcmVjb3JkcyA9IFt7ImRhdGEiOiBqc29uLmR1bXBzKHJlYyl9IGZvciByZWMgaW4gZGF0YV0KICAgIGNvbnRhaW5lciwgc3RyZWFtX3BhdGggPSBzcGxpdF9wYXRoKHN0cmVhbV9wYXRoKQogICAgcmVzcG9
uc2UgPSBjb250ZXh0LnYzaW9fY2xpZW50LnB1dF9yZWNvcmRzKAogICAgICAgIGNvbnRhaW5lcj1jb250YWluZXIsIHBhdGg9c3RyZWFtX3BhdGgsIHJlY29yZHM9cmVjb3JkcwogICAgKQoKCmRlZiBjb25zdHJ1Y3RfcmVjb3JkKHJlY29yZCk6CiAgICBsYWJlbF9jb2wgPSBvcy5nZXRlbnYoImxhYmVsX2NvbCIsICJsYWJlbCIpCiAgICBwcmVkaWN0aW9uX2NvbCA9IG9zLmdldGVudigicHJlZGljdGlvbl9jb2wiLCAicHJlZGljdGlvbiIpCiAgICByZXMgPSBkaWN0KFsoaywgcmVjb3JkW2tdKSBmb3IgayBpbiBbIndoZW4iLCAiY2xhc3MiLCAibW9kZWwiLCAicmVzcCIsICJyZXF1ZXN0Il1dKQogICAgcmVzWyJmZWF0dXJlX3ZlY3RvciJdID0gcmVzLnBvcCgicmVxdWVzdCIpWyJpbnN0YW5jZXMiXVswXQogICAgcmVzWyJ0aW1lc3RhbXAiXSA9IHJlcy5wb3AoIndoZW4iKQogICAgcmVzW3ByZWRpY3Rpb25fY29sXSA9IHJlc1sicmVzcCJdWzBdCiAgICByZXR1cm4gcmVzCgoKZGVmIGluaXRfY29udGV4dChjb250ZXh0KToKICAgIHYzaW9fY2xpZW50ID0gdjNpby5kYXRhcGxhbmUuQ2xpZW50KCkKICAgIHNldGF0dHIoY29udGV4dCwgInYzaW9fY2xpZW50IiwgdjNpb19jbGllbnQpCgogICAgdjNmX2NsaWVudCA9IHYzZi5DbGllbnQoImZyYW1lc2Q6ODA4MSIsIGNvbnRhaW5lcj0iYmlnZGF0YSIpCiAgICBzZXRhdHRyKGNvbnRleHQsICJ2M2YiLCB2M2ZfY2xpZW50KQogICAgd2luZG93ID0gW10KICAgIHNldGF0dHIoY29udGV4dCwgIndpbmRvdyIsIHdpbmRvdykKICAgIHNldGF0dHIoY29udGV4dCwgIndpbmRvd19zaXplIiwgaW50KG9zLmdldGVudigid2luZG93X3NpemUiLCAxMCkpKQogICAgc2V0YXR0cihjb250ZXh0LCAidHNkYl90YWJsZSIsIG9zLmdldGVudigidHNkYl90YWJsZSIsICJjb25jZXB0X2RyaWZ0X3RzZGJfMSIpKQogICAgdHJ5OgogICAgICAgIGNvbnRleHQudjNmLmNyZWF0ZSgidHNkYiIsIGNvbnRleHQudHNkYl90YWJsZSwgcmF0ZT0iMS9zIiwgaWZfZXhpc3RzPTEpCiAgICBleGNlcHQgRXhjZXB0aW9uIGFzIGU6CiAgICAgICAgY29udGV4dC5sb2dnZXIuaW5mbyhmIkNyZWF0aW5nIGNvbnRleHQgd2l0aCByYXRlPSBmYWlsZSBmb3Ige2V9IikKICAgICAgICBjb250ZXh0LnYzZi5jcmVhdGUoCiAgICAgICAgICAgICJ0c2RiIiwgY29udGV4dC50c2RiX3RhYmxlLCBhdHRycz17InJhdGUiOiAiMS9zIn0sIGlmX2V4aXN0cz0xCiAgICAgICAgKQoKICAgIGNhbGxiYWNrcyA9IFtjYWxsYmFjay5zdHJpcCgpIGZvciBjYWxsYmFjayBpbiBvcy5nZXRlbnYoImNhbGxiYWNrcyIsICIiKS5zcGxpdCgiLCIpXQogICAgc2V0YXR0cihjb250ZXh0LCAiY2FsbGJhY2tzIiwgY2FsbGJhY2tzKQoKICAgIHNldGF0dHIoY29udGV4dCwgImRyaWZ0X3N0cmVhbSIsIG9zLmdldGVudigiZHJpZnRfc3RyZWFtIiwgIi9iaWdkYXRhL2RyaWZ0X3N0cmVhbSIpKQogICAgdHJ5OgogICAgICAgIGNyZWF0ZV9zdHJlYW0oCiAgICAgICA
gICAgIGNvbnRleHQsIGNvbnRleHQuZHJpZnRfc3RyZWFtLCBpbnQob3MuZ2V0ZW52KCJkcmlmdF9zdHJlYW1fc2hhcmRzIiwgMSkpCiAgICAgICAgKQogICAgZXhjZXB0OgogICAgICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZiJ7Y29udGV4dC5kcmlmdF9zdHJlYW19IGFscmVhZHkgZXhpc3RzIikKCiAgICBtb2RlbHMgPSB7fQogICAgbW9kZWxfdHlwZXMgPSBbInBhZ2VoaW5rZWx5IiwgImRkbSIsICJlZGRtIl0KICAgIHBhdGhfc3VmZml4ID0gIl9tb2RlbF9wYXRoIgogICAgZm9yIG1vZGVsIGluIG1vZGVsX3R5cGVzOgogICAgICAgIG1vZGVsX2VudiA9IGYie21vZGVsfXtwYXRoX3N1ZmZpeH0iCiAgICAgICAgaWYgbW9kZWxfZW52IGluIG9zLmVudmlyb246CiAgICAgICAgICAgIHdpdGggb3Blbihvcy5lbnZpcm9uW21vZGVsX2Vudl0sICJyYiIpIGFzIGY6CiAgICAgICAgICAgICAgICBtb2RlbHNbbW9kZWxdID0gbG9hZChmKQogICAgc2V0YXR0cihjb250ZXh0LCAibW9kZWxzIiwgbW9kZWxzKQoKICAgIHNldGF0dHIoY29udGV4dCwgImxhYmVsX2NvbCIsIG9zLmdldGVudigibGFiZWxfY29sIiwgImxhYmVsIikpCiAgICBzZXRhdHRyKGNvbnRleHQsICJwcmVkaWN0aW9uX2NvbCIsIG9zLmdldGVudigicHJlZGljdGlvbl9jb2wiLCAicHJlZGljdGlvbiIpKQoKCmRlZiBoYW5kbGVyKGNvbnRleHQsIGV2ZW50KToKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZiJldmVudDoge2V2ZW50LmJvZHl9IikKICAgIGZ1bGxfZXZlbnQgPSBqc29uLmxvYWRzKGV2ZW50LmJvZHkpCiAgICByZWNvcmQgPSBjb25zdHJ1Y3RfcmVjb3JkKGZ1bGxfZXZlbnQpCgogICAgaXNfZXJyb3IgPSByZWNvcmRbY29udGV4dC5sYWJlbF9jb2xdICE9IHJlY29yZFtjb250ZXh0LnByZWRpY3Rpb25fY29sXQogICAgY29udGV4dC5sb2dnZXIuaW5mbyhmIkFkZGluZyB7aXNfZXJyb3J9IikKCiAgICBmb3IgbmFtZSwgbW9kZWwgaW4gY29udGV4dC5tb2RlbHMuaXRlbXMoKToKICAgICAgICByZXN1bHRzID0geyJ0aW1lc3RhbXAiOiByZWNvcmRbInRpbWVzdGFtcCJdfQogICAgICAgIHJlc3VsdHNbImFsZ29yaXRobSJdID0gbmFtZQogICAgICAgIG1vZGVsLmFkZF9lbGVtZW50KGlzX2Vycm9yKQoKICAgICAgICBpZiBoYXNhdHRyKG1vZGVsLCAiZGV0ZWN0ZWRfd2FybmluZ196b25lIikgYW5kIG1vZGVsLmRldGVjdGVkX3dhcm5pbmdfem9uZSgpOgogICAgICAgICAgICBjb250ZXh0LmxvZ2dlci5pbmZvKGYie25hbWV9XHRXYXJuaW5nIHpvbmUgZGV0ZWN0ZWQiKQogICAgICAgICAgICByZXN1bHRzWyJ3YXJuaW5nX3pvbmUiXSA9IDEKICAgICAgICAgICAgZnVsbF9ldmVudFtmIntuYW1lfV93YXJuaW5nX3pvbmUiXSA9IDEKICAgICAgICBlbHNlOgogICAgICAgICAgICByZXN1bHRzWyJ3YXJuaW5nX3pvbmUiXSA9IDAKICAgICAgICAgICAgZnVsbF9ldmVudFtmIntuYW1lfV93YXJuaW5nX3pvbmUiXSA9IDAKCiAgICAgICAgaWYgbW9kZWwuZGV0ZWN0ZWRfY2hhbmd
lKCk6CiAgICAgICAgICAgIGNvbnRleHQubG9nZ2VyLmluZm8oIkNoYW5nZSBEZXRlY3RlZCIpCiAgICAgICAgICAgIHJlc3VsdHNbImNoYW5nZV9kZXRlY3RlZCJdID0gMQogICAgICAgICAgICBmdWxsX2V2ZW50W2Yie25hbWV9X2RyaWZ0Il0gPSAxCiAgICAgICAgZWxzZToKICAgICAgICAgICAgcmVzdWx0c1siY2hhbmdlX2RldGVjdGVkIl0gPSAwCiAgICAgICAgICAgIGZ1bGxfZXZlbnRbZiJ7bmFtZX1fZHJpZnQiXSA9IDAKICAgICAgICBjb250ZXh0LndpbmRvdy5hcHBlbmQocmVzdWx0cykKCiAgICBwdXNoX3RvX3N0cmVhbShjb250ZXh0LCBjb250ZXh0LmRyaWZ0X3N0cmVhbSwgW2Z1bGxfZXZlbnRdKQoKICAgIGlmIGNvbnRleHQuY2FsbGJhY2tzICE9IFsiIl06CiAgICAgICAgZm9yIGNhbGxiYWNrIGluIGNvbnRleHQuY2FsbGJhY2tzOgogICAgICAgICAgICByZXF1ZXN0cy5wb3N0KHVybD1jYWxsYmFjaywganNvbj1mdWxsX2V2ZW50KQoKICAgIGlmIChsZW4oY29udGV4dC53aW5kb3cpIC8gbGVuKGNvbnRleHQubW9kZWxzKSkgPj0gY29udGV4dC53aW5kb3dfc2l6ZToKICAgICAgICBkZiA9IHBkLkRhdGFGcmFtZShjb250ZXh0LndpbmRvdykKICAgICAgICBkZlsidGltZXN0YW1wIl0gPSBwZC50b19kYXRldGltZShkZlsidGltZXN0YW1wIl0pCiAgICAgICAgZGYgPSBkZi5zZXRfaW5kZXgoWyJ0aW1lc3RhbXAiLCAiYWxnb3JpdGhtIl0pCiAgICAgICAgY29udGV4dC52M2Yud3JpdGUoInRzZGIiLCBjb250ZXh0LnRzZGJfdGFibGUsIGRmKQogICAgICAgIGNvbnRleHQud2luZG93ID0gW10K - source: '' - build: - commands: - - python -m pip install scikit-multiflow==0.4.1 v3io_frames - code_origin: https://github.com/daniels290813/functions.git#d96059851b5d51fd4583e982483eb973fccc47d2:/User/test/functions/concept_drift_streaming/concept_drift_streaming.py - origin_filename: /User/test/functions/concept_drift_streaming/concept_drift_streaming.py - default_handler: handler - disable_auto_mount: false - affinity: null -verbose: false diff --git a/concept_drift_streaming/item.yaml b/concept_drift_streaming/item.yaml deleted file mode 100644 index 91dcb9f4f..000000000 --- a/concept_drift_streaming/item.yaml +++ /dev/null @@ -1,29 +0,0 @@ -apiVersion: v1 -categories: -- machine-learning -- monitoring -description: Deploy a streaming Concept Drift detector on a labeled stream. 
the nuclio - part of the concept_drift function -doc: '' -example: concept_drift_streaming.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: orz - framework: sklearn -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: concept-drift-streaming -platformVersion: 3.5.0 -spec: - filename: concept_drift_streaming.py - handler: handler - image: mlrun/ml-models - kind: nuclio - requirements: - - scikit-multiflow==0.4.1 - - v3io_frames -url: '' -version: 1.1.0 diff --git a/concept_drift_streaming/requirements.txt b/concept_drift_streaming/requirements.txt deleted file mode 100644 index fa0fddd88..000000000 --- a/concept_drift_streaming/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -skmultiflow \ No newline at end of file diff --git a/feature_perms/README.ipynb b/feature_perms/README.ipynb deleted file mode 100644 index 0929a6f6a..000000000 --- a/feature_perms/README.ipynb +++ /dev/null @@ -1,788 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# feature importances" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "There are a number of ways to compute feature importances and **the default estimates reported by scikit learn can be shown to be biased** under certain circumstances. In addition, many non-tree algorithms do not provide conveniently calculated feature importance estimates. 
The following demonstration is based on material that draws heavily from the following sources:" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## references\n", - "\n", - "\n", - "### repos\n", - "\n", - "* **[Feature importances for scikit-learn machine learning models](https://github.com/parrt/random-forest-importances)**, [MIT License](https://github.com/parrt/random-forest-importances/blob/master/LICENSE)\n", - "* **[Scikit-Learn ensemble module - forests](https://github.com/scikit-learn/scikit-learn/blob/0.23.1/sklearn/ensemble/_forest.py)**, [BSD License](https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/ensemble/_forest.py#L40)\n", - "* **[ELI5 - Permutation Importance](https://eli5.readthedocs.io/en/latest/blackbox/permutation_importance.html)** \n", - "\n", - "### articles\n", - "\n", - "Strobl, C., Boulesteix, A., Zeileis, A. et al. **[Bias in random forest variable importance measures: Illustrations, sources and a solution](https://link.springer.com/article/10.1186/1471-2105-8-25#citeas)**. BMC Bioinformatics 8, 25 (2007). https://doi.org/10.1186/1471-2105-8-25 \n", - "\n", - "Strobl, C., Boulesteix, A., Kneib, T. et al. **[Conditional variable importance for random forests](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-307#citeas)**. BMC Bioinformatics 9, 307 (2008). 
https://doi.org/10.1186/1471-2105-9-307 " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## what we'll do\n", - "\n", - "* demonstrate an issue with default feature importance estimates \n", - "* provide alternatives and compare to the default \n", - "* create a new function `feature_perms` that implements a computationally simple algorithm \n", - "* create a new function `dropcol_importances` that implements a computationally intensive algorithm that is more accurate\n", - "* test our new functions\n", - "\n", - "It should be noted that although we are developing this notebook using a classification example, an almost identical presentation can be done for regression." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## imports" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "\n", - "import sklearn\n", - "from sklearn.base import clone\n", - "\n", - "from sklearn.ensemble import RandomForestClassifier as SomeModel\n", - "\n", - "import matplotlib.pyplot as plt\n", - "import seaborn as sns\n", - "\n", - "from typing import Union, Callable, List" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## default feature importances\n", - "\n", - "This is a function that plots default feature importances from an estimated model object when available. 
It is taken from mlrun's current source-code implementation:" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "def feature_importances(\n", - " model: SomeModel,\n", - " header: List[str], \n", - " figsz=(10, 5)\n", - ") -> None:\n", - " \"\"\"Display default model feature importances\n", - "\n", - " Only works for models with attribute 'feature_importances_`\n", - "\n", - " :param model: fitted model with a feature_importances_ attribute\n", - " :param header: feature labels\n", - " :param figsz: matplotlib figure size\n", - " \"\"\"\n", - " if not hasattr(model, \"feature_importances_\"):\n", - " raise Exception(\n", - " \"feature importances are only available for some models\")\n", - "\n", - " # create a feature importance table with desired labels\n", - " zipped = zip(model.feature_importances_, header)\n", - " feature_imp = pd.DataFrame(\n", - " sorted(zipped), columns=[\"freq\", \"feature\"]).sort_values(\n", - " by=\"freq\", ascending=False)\n", - "\n", - " plt.clf()\n", - " plt.figure(figsize=figsz)\n", - " sns.barplot(x=\"freq\", y=\"feature\", data=feature_imp)\n", - " plt.title(\"features\")\n", - " plt.tight_layout();\n", - " \n", - " return feature_imp" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## permuted features\n", - "\n", - "A proposed solution that has general applicability is randomly permuted features**[refs](#references)**: \n", - "* loop through the feature set \n", - "* shuffle one feature \n", - "* run predict\n", - "* compare the (marginal) change in accuracy (or other metric of interest) \n", - "\n", - "This approach is computationally more demanding than relying on the default values, however it can be easily parallelized. To perform the estimation we only need an estimated model and a held-out test set. 
The following was proposed in **[Beware Default Random Forest Importances](https://explained.ai/rf-importance/index.html)**:\n", - "\n", - "( the following 3 glue functions will no longer be publicly visible in the sklearn package from 0.24 onwards, consider this a temporary hack while we refactor these away)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**the following has been refactored in final version of function:**" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "from distutils.version import LooseVersion\n", - "import numpy as np\n", - "from sklearn.utils import check_random_state\n", - "\n", - "def _generate_sample_indices(random_state: int, n_samples: int, n_samples_bootstrap: int):\n", - " \"\"\"\n", - " Private function used to _parallel_build_trees function.\n", - " taken from:\n", - " https://github.com/scikit-learn/scikit-learn/blob/2253807bb488b6de73796aef2de38a6dcf282d86/sklearn/ensemble/_forest.py#L116\n", - " (public availability to be deprecated by sklearn v0.24)\n", - " \"\"\"\n", - " random_instance = check_random_state(random_state)\n", - " sample_indices = random_instance.randint(0, n_samples, n_samples_bootstrap)\n", - "\n", - " return sample_indices\n", - "\n", - "def _generate_unsampled_indices(random_state: int, n_samples: int, n_samples_bootstrap: int):\n", - " \"\"\"\n", - " Private function used to forest._set_oob_score function.\n", - " taken from: \n", - " https://github.com/scikit-learn/scikit-learn/blob/2253807bb488b6de73796aef2de38a6dcf282d86/sklearn/ensemble/_forest.py#L126\n", - " (public availability to be deprecated by sklearn v0.24)\n", - " \"\"\"\n", - " sample_indices = _generate_sample_indices(random_state, n_samples,\n", - " n_samples_bootstrap)\n", - " sample_counts = np.bincount(sample_indices, minlength=n_samples)\n", - " unsampled_mask = sample_counts == 0\n", - " indices_range = np.arange(n_samples)\n", - " unsampled_indices 
= indices_range[unsampled_mask]\n", - "\n", - " return unsampled_indices\n", - "\n", - "def _get_unsampled_indices(tree, n_samples: int):\n", - " \"\"\"\n", - " An interface to get unsampled indices regardless of sklearn version.\n", - " \"\"\"\n", - " import warnings\n", - " warnings.simplefilter(action=\"ignore\", category=FutureWarning)\n", - " if LooseVersion(sklearn.__version__) >= LooseVersion(\"0.22\"):\n", - " # Version 0.22 or newer uses 3 arguments.\n", - " from sklearn.ensemble.forest import _get_n_samples_bootstrap\n", - " n_samples_bootstrap = _get_n_samples_bootstrap(n_samples, n_samples)\n", - " return _generate_unsampled_indices(tree.random_state, n_samples,\n", - " n_samples_bootstrap)\n", - " else:\n", - " # Version 0.21 or older uses only two arguments.\n", - " return _generate_unsampled_indices(tree.random_state, n_samples)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The following function estimates classifier accuracy and has been borrowed from **[references](#references)**. 
See **[Breiman on OOB](https://www.stat.berkeley.edu/~breiman/OOBestimation.pdf)** for details on out-of-bag estimation:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "def oob_classifier_accuracy(rf, X_train: np.array, y_train: np.array) -> float:\n", - " \"\"\"\n", - " Compute out-of-bag (OOB) accuracy for a scikit-learn forest classifier.\n", - " \n", - " https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/ensemble/forest.py#L425\n", - " \"\"\"\n", - " X = X_train.values\n", - " y = y_train.values\n", - "\n", - " n_samples = len(X)\n", - " n_classes = len(np.unique(y))\n", - " predictions = np.zeros((n_samples, n_classes))\n", - " for tree in rf.estimators_:\n", - " unsampled_indices = _get_unsampled_indices(tree, n_samples)\n", - " tree_preds = tree.predict_proba(X[unsampled_indices, :])\n", - " predictions[unsampled_indices] += tree_preds\n", - "\n", - " predicted_class_indexes = np.argmax(predictions, axis=1)\n", - " predicted_classes = [rf.classes_[i] for i in predicted_class_indexes]\n", - "\n", - " oob_score = np.mean(y == predicted_classes)\n", - " \n", - " return oob_score" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Putting it all together:" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "def permutation_importances(\n", - " model, \n", - " X_train: np.array,\n", - " y_train: np.array, \n", - " header: List[str],\n", - " metric: Callable = oob_classifier_accuracy,\n", - " figsz=(10, 5)\n", - ") -> np.array:\n", - " \"\"\"calculate change in metric from permuting feature columns\n", - " \n", - " modified from https://explained.ai/rf-importance/index.html\n", - " \n", - " uses a pre-estimated model\n", - "\n", - " :param X_train: training set features\n", - " :param y_train: training set ground truths, regression targets\n", - " :param header: column labels for X_train\n", - " 
:param figsz: matplotlib figure size\n", - " \n", - " \"\"\"\n", - " baseline = metric(model, X_train, y_train)\n", - " imp = []\n", - " for col in X_train.columns:\n", - " save = X_train[col].copy()\n", - " X_train[col] = np.random.permutation(X_train[col])\n", - " m = metric(model, X_train, y_train)\n", - " X_train[col] = save\n", - " imp.append(baseline - m)\n", - " \n", - " # create a feature importance table with desired labels\n", - " zipped = zip(imp, header)\n", - " feature_imp = pd.DataFrame(sorted(zipped), columns=[\"importance\", \"feature\"])\n", - " feature_imp.sort_values(by=\"importance\", ascending=False, inplace=True)\n", - "\n", - " plt.clf()\n", - " plt.figure(figsize=figsz)\n", - " sns.barplot(x=\"importance\", y=\"feature\", data=feature_imp)\n", - " plt.title(\"feature permutation importances\")\n", - " plt.tight_layout()\n", - "\n", - " return np.array(feature_imp)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## drop-column importances\n", - "\n", - "According to our **[references](#references)** a more accurate measure of feature importance would have us re-estimate the model after dropping a column. This is considered as being close to \"ideal\". 
Unfortunately, the entire model needs to be re-estimated for each column and without some approximating shortcut this is likely to be infeasible for large datasets.\n", - "\n", - "Here is the suggested implementation and **don't run this on big models!**:" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "def dropcol_importances(\n", - " model, \n", - " X_train: np.array,\n", - " y_train: np.array,\n", - " header: List[str] = [],\n", - " random_state: int = 1994,\n", - " figsz=(10, 5)\n", - ") -> pd.DataFrame:\n", - " \"\"\"drop columns and re-estimate model\n", - " \n", - " modified from https://explained.ai/rf-importance/index.html\n", - " \n", - " :param rf: model to fit\n", - " :param X_train: training set features\n", - " :param y_train: training set ground truth labels\n", - "\n", - " Returns:\n", - " pd.DataFrame: table of diffs vs baseline metric\n", - " \"\"\"\n", - " # cloning makes copy of model pre-fit\n", - " # calculate a baseline with all features\n", - " model_ = clone(model)\n", - " model_.random_state = random_state\n", - " model_.fit(X_train, y_train)\n", - " baseline = model_.oob_score_\n", - " \n", - " # now drop each colum, refit model and calc metric\n", - " imp = []\n", - " for col in X_train.columns:\n", - " X = X_train.drop(col, axis=1)\n", - " model_ = clone(model)\n", - " model_.random_state = random_state\n", - " model_.fit(X, y_train)\n", - " o = model_.oob_score_\n", - " imp.append(baseline - o)\n", - " \n", - " # put it all in a table\n", - " imp = np.array(imp)\n", - " feature_imps = pd.DataFrame(\n", - " data={'feature': X_train.columns,\n", - " 'importance': imp})\n", - " #feature_imps.set_index('feature', inplace=True)\n", - " feature_imps.sort_values('importance', ascending=True, inplace=True)\n", - " \n", - " plt.clf()\n", - " plt.figure(figsize=figsz)\n", - " sns.barplot(x=\"importance\", y=\"feature\", data=feature_imps)\n", - " plt.title(\"drop column feature 
importances\")\n", - " plt.tight_layout()\n", - " \n", - " return feature_imps" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## demonstration\n", - "\n", - "In this demonstration we are going to take a fraction of a fraction of **[Kaggle's RentHop rental listing interest competition](https://www.kaggle.com/c/two-sigma-connect-rental-listing-inquiries)**. The complete dataset is presently >80GB; we'll be looking at roughly 5K rows. \n", - "\n", - "The competition's **[goal](https://www.kaggle.com/c/two-sigma-connect-rental-listing-inquiries)** was\n", - "> to predict the number of inquiries a new listing receives based on the listing’s creation date and other features. \n", - "\n", - "Doing so would help **[RentHop](https://www.renthop.com/)**\n", - "> better handle fraud control, identify potential listing quality issues, and allow owners and agents to better understand renters’ needs and preferences." - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "data = \"/User/artifacts/two-sigma-connect-rental-listing-inquiries/\"\n", - "NFRAC = 0.1" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sample dimensions (4935, 6)\n" - ] - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
       bathrooms  bedrooms  price  longitude  latitude  interest_level
18235        1.0         0   1800   -73.9960   40.7197               2
42140        1.0         2   3000   -73.9879   40.7653               3
4677         1.0         1   1350   -73.8996   40.8549               2
\n", - "
" - ], - "text/plain": [ - " bathrooms bedrooms price longitude latitude interest_level\n", - "18235 1.0 0 1800 -73.9960 40.7197 2\n", - "42140 1.0 2 3000 -73.9879 40.7653 3\n", - "4677 1.0 1 1350 -73.8996 40.8549 2" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "df = pd.read_csv(data + 'rent.csv').sample(frac=NFRAC)\n", - "print(\"sample dimensions\", df.shape)\n", - "df.head(3)" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [], - "source": [ - "features = ['bathrooms', 'bedrooms', 'longitude', 'latitude', 'price']\n", - "dfr = df[features]\n", - "\n", - "# drop price column\n", - "X_train, y_train = dfr.drop('price', axis=1), dfr['price']\n", - "\n", - "# insert column with random values\n", - "X_train['random'] = np.random.random(size=len(X_train))\n", - "features = ['bathrooms', 'bedrooms', 'longitude', 'latitude', 'random']" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "RandomForestClassifier(n_jobs=-1, oob_score=True)" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# define model\n", - "model_params = {\n", - " \"n_estimators\" : 100, \n", - " \"min_samples_leaf\" : 1,\n", - " \"n_jobs\" : -1,\n", - " \"oob_score\" : True\n", - "}\n", - "\n", - "model = SomeModel(**model_params)\n", - "\n", - "# estimate\n", - "model.fit(X_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### to run this the model needs a default attribute" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "default feature_importances [0.01683784 0.03215169 0.29983429 0.30418813 0.34698806]\n" - ] - }, - { - "data": { - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "if hasattr(model, \"feature_importances_\"):\n", - " print(\"default feature_importances\", model.feature_importances_)\n", - " feature_importances(model, features)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## permutation importances\n", - "\n", - "No need to check for default attributes or functions, this can be run on any kind of model:" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "array([[0.06545086119554205, 'longitude'],\n", - " [0.06160081053698076, 'latitude'],\n", - " [0.053495440729483285, 'bedrooms'],\n", - " [0.021681864235055734, 'bathrooms'],\n", - " [0.0004052684903748799, 'random']], dtype=object)" - ] - }, - "execution_count": 20, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "pi = permutation_importances(model, X_train, y_train, features)\n", - "pi" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## drop-column importances" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
     feature  importance
4     random   -0.042756
0  bathrooms    0.001824
1   bedrooms    0.023100
2  longitude    0.049240
3   latitude    0.051874
\n", - "
" - ], - "text/plain": [ - " feature importance\n", - "4 random -0.042756\n", - "0 bathrooms 0.001824\n", - "1 bedrooms 0.023100\n", - "2 longitude 0.049240\n", - "3 latitude 0.051874" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "dc = dropcol_importances(model, X_train, y_train)\n", - "dc" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## conclusions" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "So I would say location is a prime factor, then the number of bedrooms. Bathrooms often is gte bedrooms, and is likely correlated so one of them should likely be dropped." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - }, - "toc-autonumbering": false, - "toc-showcode": false, - "toc-showmarkdowntxt": false - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/feature_perms/feature_perms.ipynb b/feature_perms/feature_perms.ipynb deleted file mode 100644 index 77da7b553..000000000 --- a/feature_perms/feature_perms.ipynb +++ /dev/null @@ -1,1106 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# permutation_importances as reusable function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## function code" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "import numbers\n", - "\n", - "import sklearn\n", - "from sklearn.base import clone\n", - "from sklearn.utils 
import check_random_state\n", - "\n", - "import matplotlib.pyplot as plt\n", - "import seaborn as sns\n", - "\n", - "from cloudpickle import load\n", - "\n", - "from mlrun.execution import MLClientCtx\n", - "from mlrun.datastore import DataItem\n", - "from mlrun.artifacts import get_model, PlotArtifact\n", - "from typing import Union, Callable, List\n", - "\n", - "def _get_n_samples_bootstrap(n_samples, max_samples) -> int:\n", - " \"\"\"get the number of samples in a bootstrap sample\n", - " \n", - " returns the total number of samples to draw for the bootstrap sample\n", - " \n", - " private api in sklearn >= v0.24, taken from sklearn.ensemble._forest.py\n", - "\n", - " :param n_samples: Number of samples in the dataset.\n", - " :param max_samples: \n", - " The maximum number of samples to draw from the total available:\n", - " - if float, this indicates a fraction of the total and should be\n", - " the interval `(0, 1)`;\n", - " - if int, this indicates the exact number of samples;\n", - " - if None, this indicates the total number of samples.\n", - " \"\"\"\n", - " if max_samples is None:\n", - " return n_samples\n", - "\n", - " if isinstance(max_samples, numbers.Integral):\n", - " if not (1 <= max_samples <= n_samples):\n", - " msg = \"`max_samples` must be in range 1 to {} but got value {}\"\n", - " raise ValueError(msg.format(n_samples, max_samples))\n", - " return max_samples\n", - "\n", - " if isinstance(max_samples, numbers.Real):\n", - " if not (0 < max_samples < 1):\n", - " msg = \"`max_samples` must be in range (0, 1) but got value {}\"\n", - " raise ValueError(msg.format(max_samples))\n", - " return int(round(n_samples * max_samples))\n", - "\n", - " msg = \"`max_samples` should be int or float, but got type '{}'\"\n", - " raise TypeError(msg.format(type(max_samples)))\n", - "\n", - "def _get_unsampled_ix(random_state, n_samples: int) -> np.array:\n", - " \"\"\"\n", - " future-proof get unsampled indices\n", - " \"\"\"\n", - " n_bootstrap = 
_get_n_samples_bootstrap(n_samples, n_samples)\n", - " random_instance = check_random_state(random_state)\n", - " sample_indices = random_instance.randint(0, n_samples, n_bootstrap)\n", - " sample_counts = np.bincount(sample_indices, minlength=n_samples)\n", - "\n", - " return np.arange(n_samples)[sample_counts==0]\n", - "\n", - "def _oob_classifier_accuracy(rf, X_train, y_train) -> float:\n", - " \"\"\"\n", - " Compute out-of-bag (OOB) accuracy for a scikit-learn forest classifier.\n", - " \n", - " https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/ensemble/forest.py#L425\n", - " \"\"\"\n", - " X = X_train.values if isinstance(X_train, pd.DataFrame) else X_train\n", - " y = y_train.values if isinstance(y_train, pd.Series) else y_train\n", - "\n", - " n_samples = len(X)\n", - " n_classes = len(np.unique(y))\n", - " predictions = np.zeros((n_samples, n_classes))\n", - " for tree in rf.estimators_:\n", - " unsampled_indices = _get_unsampled_ix(tree.random_state, n_samples)\n", - " tree_preds = tree.predict_proba(X[unsampled_indices, :])\n", - " predictions[unsampled_indices] += tree_preds\n", - "\n", - " predicted_class_indexes = np.argmax(predictions, axis=1)\n", - " predicted_classes = [rf.classes_[i] for i in predicted_class_indexes]\n", - "\n", - " oob_score = np.mean(y == predicted_classes)\n", - " \n", - " return oob_score\n", - "\n", - "def permutation_importances(\n", - " context: MLClientCtx,\n", - " model: DataItem,\n", - " dataset: DataItem,\n", - " labels: str,\n", - " figsz=(10, 5),\n", - " plots_dest: str = \"plots\",\n", - " fitype: str = \"permute\"\n", - ") -> pd.DataFrame:\n", - " \"\"\"calculate change in metric\n", - " \n", - " type 'permute' uses a pre-estimated model\n", - " type 'dropcol' uses a re-estimates model\n", - " \n", - " :param context: the function's execution context\n", - " :param model: a trained model\n", - " :param dataset: features and ground truths, regression targets\n", - " :param labels name of the ground 
truths column\n", - " :param figsz: matplotlib figure size\n", - " :param plots_dest: path within artifact store\n", - " :param fitype: importance type, 'permute' or 'dropcol'\n", - " \"\"\"\n", - " model_file, model_data, _ = get_model(model.url, suffix='.pkl')\n", - " model = load(open(str(model_file), \"rb\"))\n", - " \n", - " X = dataset.as_df()\n", - " y = X.pop(labels)\n", - " header = X.columns\n", - " \n", - " # this will be parameterized in the next version, and include regression\n", - " metric = _oob_classifier_accuracy\n", - " \n", - " baseline = metric(model, X, y)\n", - " \n", - " imp = []\n", - " for col in X.columns:\n", - " if fitype == \"permute\":\n", - " save = X[col].copy()\n", - " X[col] = np.random.permutation(X[col])\n", - " m = metric(model, X, y)\n", - " X[col] = save\n", - " imp.append(baseline - m)\n", - " elif fitype == \"dropcol\":\n", - " X_ = X.drop(col, axis=1)\n", - " model_ = clone(model)\n", - " #model_.random_state = random_state\n", - " model_.fit(X_, y)\n", - " o = model_.oob_score_\n", - " imp.append(baseline - o)\n", - " else:\n", - " raise ValueError(\"unknown fitype, only 'permute' or 'dropcol' permitted\")\n", - "\n", - " # create a feature importance table with desired labels\n", - " zipped = zip(imp, header)\n", - " feature_imp = pd.DataFrame(sorted(zipped), columns=[\"importance\", \"feature\"])\n", - " feature_imp.sort_values(by=\"importance\", ascending=False, inplace=True)\n", - "\n", - " plt.clf()\n", - " plt.figure(figsize=figsz)\n", - " sns.barplot(x=\"importance\", y=\"feature\", data=feature_imp)\n", - " plt.title(f\"feature importances-{fitype}\")\n", - " plt.tight_layout()\n", - "\n", - " context.log_artifact(PlotArtifact(f\"feature importances-{fitype}\", body=plt.gcf()),\n", - " local_path=f\"{plots_dest}/feature-permutations.html\")\n", - " context.log_dataset(f\"feature-importances-{fitype}-tbl\", df=feature_imp, index=False)" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, 
- { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## save function" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-07 19:58:25,298 function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from mlrun import code_to_function\n", - "from mlrun.platforms.other import auto_mount\n", - "\n", - "gpus = False\n", - "\n", - "# create job function object from notebook code\n", - "fn_params = {\n", - " \"name\" : \"feature-perms\",\n", - " \"handler\" : \"permutation_importances\",\n", - " \"kind\" : \"job\",\n", - " \"image\" : \"mlrun/ml-models\" if not gpus else \"mlrun/ml-models-gpu\",\n", - " \"description\" : \"estimate feature importances using permutations\",\n", - " \"categories\" : [\"analysis\"],\n", - " \"labels\" : {\"author\": \"yjb\"}\n", - "}\n", - "\n", - "perms_fn = code_to_function(**fn_params)\n", - "perms_fn.apply(auto_mount())\n", - "perms_fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## tests" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import import_function\n", - "from mlrun import NewTask, mlconf" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### get some data" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-07 19:58:25,352 starting run tasks arc-to-parq uid=e9bc67f2189c418d96bfde754d369956 -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-07 19:58:25,486 Job is running in the background, pod: tasks-arc-to-parq-xqkr7\n", - "[mlrun] 2020-06-07 19:58:29,118 starting local run: 
main.py # arc_to_parquet\n", - "[mlrun] 2020-06-07 19:58:29,169 downloading https://raw.githubusercontent.com/parrt/random-forest-importances/master/notebooks/data/rent.csv to local tmp\n", - "[mlrun] 2020-06-07 19:58:29,535 destination file does not exist, downloading\n", - "[mlrun] 2020-06-07 19:58:29,898 log artifact rent at /User/artifacts/rent.csv, size: 1492462, db: Y\n", - "\n", - "[mlrun] 2020-06-07 19:58:29,917 run executed, status=completed\n", - "final state: succeeded\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Jun 07 19:58:29 | completed | tasks arc-to-parq | v3io_user=admin, kind=job, owner=admin, host=tasks-arc-to-parq-xqkr7 | archive_url | key=rent, stats=True, file_ext=csv | | rent
\n", - "
\n", - "
\n", - "
\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run e9bc67f2189c418d96bfde754d369956 --project default , !mlrun logs e9bc67f2189c418d96bfde754d369956 --project default\n", - "[mlrun] 2020-06-07 19:58:31,666 run executed, status=completed\n" - ] - } - ], - "source": [ - "data_url = \"https://raw.githubusercontent.com/parrt/random-forest-importances/master/notebooks/data/rent.csv\"\n", - "\n", - "fn = import_function(\"hub://arc_to_parquet\", \"a2p\")\n", - "fn.apply(auto_mount())\n", - "\n", - "params = {\n", - " \"name\" : \"tasks arc-to-parq\",\n", - " \"params\" : {\"key\":\"rent\", \"stats\": True, \"file_ext\":\"csv\"}\n", - "}\n", - "acquire_run = fn.run(NewTask(**params),inputs={\"archive_url\" : data_url},\n", - " artifact_path=mlconf.artifact_path)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### train a model" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-07 19:58:31,704 starting run tasks random forest uid=57af834167264641905a5bb5e6b0e263 -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-07 19:58:31,861 Job is running in the background, pod: tasks-random-forest-vkjk5\n", - "[mlrun] 2020-06-07 19:58:35,390 starting local run: main.py # train_model\n", - "[mlrun] 2020-06-07 19:58:36,310 log artifact test_set at /User/artifacts/data/test_set.parquet, size: 24484, db: Y\n", - "[mlrun] 2020-06-07 19:58:37,153 log artifact confusion-matrix at /User/artifacts/model/plots/confusion-matrix.html, size: 27401, db: N\n", - "[mlrun] 2020-06-07 19:58:37,598 log artifact feature-importances at /User/artifacts/model/plots/feature-importances.html, size: 19685, db: N\n", - "[mlrun] 2020-06-07 19:58:37,806 log artifact 
precision-recall-multiclass at /User/artifacts/model/plots/precision-recall-multiclass.html, size: 74009, db: N\n", - "[mlrun] 2020-06-07 19:58:37,936 log artifact roc-multiclass at /User/artifacts/model/plots/roc-multiclass.html, size: 73053, db: N\n", - "[mlrun] 2020-06-07 19:58:38,079 log artifact model at /User/artifacts/model/, size: 10346780, db: Y\n", - "\n", - "[mlrun] 2020-06-07 19:58:38,106 run executed, status=completed\n", - "final state: succeeded\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Jun 07 19:58:36 | completed | tasks random forest | v3io_user=admin, kind=job, owner=admin, host=tasks-random-forest-vkjk5, class=sklearn.ensemble.RandomForestClassifier | dataset | sample=-5000, model_pkg_class=sklearn.ensemble.RandomForestClassifier, label_column=interest_level, CLASS_n_estimators=100, CLASS_min_samples_leaf=1, CLASS_n_jobs=-1, CLASS_oob_score=True | test-accuracy=0.6902857142857143, test-error=0.3097142857142857, auc-micro=0.8567196734693878, auc-weighted=0.7077200281488216, f1-score=0.44361444815007395, precision_score=0.4969837043184901, recall_score=0.42733978329897576 | test_set, confusion-matrix, feature-importances, precision-recall-multiclass, roc-multiclass, model
\n", - "
\n", - "
\n", - "
\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 57af834167264641905a5bb5e6b0e263 --project default , !mlrun logs 57af834167264641905a5bb5e6b0e263 --project default\n", - "[mlrun] 2020-06-07 19:58:41,115 run executed, status=completed\n" - ] - } - ], - "source": [ - "fn = import_function(\"hub://sklearn_classifier\", \"skrf\")\n", - "fn.apply(auto_mount())\n", - "\n", - "# define model\n", - "params = {\n", - " \"name\" : \"tasks random forest\",\n", - " \"params\" : {\n", - " \"sample\" : -5_000, # 5k random rows,\n", - " \"model_pkg_class\" : \"sklearn.ensemble.RandomForestClassifier\",\n", - " \"label_column\" : \"interest_level\",\n", - " \"CLASS_n_estimators\" : 100,\n", - " \"CLASS_min_samples_leaf\" : 1,\n", - " \"CLASS_n_jobs\" : -1,\n", - " \"CLASS_oob_score\" : True}\n", - "}\n", - "\n", - "train_run = fn.run(NewTask(**params), inputs={\"dataset\" : acquire_run.outputs[\"rent\"]},\n", - " artifact_path=mlconf.artifact_path)" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "

Feature Importances

\n", - "" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from IPython.display import HTML\n", - "HTML(filename=train_run.outputs['feature-importances'])" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "data = acquire_run.outputs[\"rent\"]\n", - "labels = \"interest_level\"\n", - "model = train_run.outputs[\"model\"]\n" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-07 19:58:41,152 starting run features-permutation_importances uid=89235b15ac2a4213aefc906c178a1c5e -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-07 19:58:41,312 Job is running in the background, pod: features-permutation-importances-dwxmt\n", - "[mlrun] 2020-06-07 19:58:44,871 starting local run: main.py # permutation_importances\n", - "[mlrun] 2020-06-07 19:58:48,714 log artifact feature importances-permute at /User/artifacts/plots/feature-permutations.html, size: 25694, db: Y\n", - "[mlrun] 2020-06-07 19:58:48,770 log artifact feature-importances-permute-tbl at /User/artifacts/feature-importances-permute-tbl.csv, size: 167, db: Y\n", - "\n", - "[mlrun] 2020-06-07 19:58:48,785 run executed, status=completed\n", - "final state: succeeded\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Jun 07 19:58:45 | completed | features-permutation_importances | v3io_user=admin, kind=job, owner=admin, host=features-permutation-importances-dwxmt | model, dataset | labels=interest_level, plots_dest=plots | | feature importances-permute, feature-importances-permute-tbl
\n", - "
\n", - "
\n", - "
\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 89235b15ac2a4213aefc906c178a1c5e --project default , !mlrun logs 89235b15ac2a4213aefc906c178a1c5e --project default\n", - "[mlrun] 2020-06-07 19:58:50,488 run executed, status=completed\n" - ] - } - ], - "source": [ - "fi_perms = perms_fn.run(\n", - " NewTask(params={\"labels\": labels, \n", - " \"plots_dest\": \"plots\"}),\n", - " inputs={\"model\": model, \"dataset\": data},\n", - " artifact_path=mlconf.artifact_path)" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from IPython.display import HTML\n", - "HTML(filename=fi_perms.outputs['feature importances-permute'])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} \ No newline at end of file diff --git a/feature_perms/feature_perms.py b/feature_perms/feature_perms.py deleted file mode 100644 index 13caae32e..000000000 --- a/feature_perms/feature_perms.py +++ /dev/null @@ -1,174 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -# Generated by nuclio.export.NuclioExporter - -import numpy as np -import pandas as pd -import numbers - -import sklearn -from sklearn.base import clone -from sklearn.utils import check_random_state - -import matplotlib.pyplot as plt -import seaborn as sns - -from cloudpickle import load - -from mlrun.execution import MLClientCtx -from mlrun.datastore import DataItem -from mlrun.artifacts import get_model, PlotArtifact -from typing import Union, Callable, List - - -def _get_n_samples_bootstrap(n_samples, max_samples) -> int: - """get the number of samples in a bootstrap sample - - returns the total number of samples to draw for the bootstrap sample - - private api in sklearn >= v0.24, taken from sklearn.ensemble._forest.py - - :param n_samples: Number of samples in the dataset. - :param max_samples: - The maximum number of samples to draw from the total available: - - if float, this indicates a fraction of the total and should be - the interval `(0, 1)`; - - if int, this indicates the exact number of samples; - - if None, this indicates the total number of samples. 
- """ - if max_samples is None: - return n_samples - - if isinstance(max_samples, numbers.Integral): - if not (1 <= max_samples <= n_samples): - msg = "`max_samples` must be in range 1 to {} but got value {}" - raise ValueError(msg.format(n_samples, max_samples)) - return max_samples - - if isinstance(max_samples, numbers.Real): - if not (0 < max_samples < 1): - msg = "`max_samples` must be in range (0, 1) but got value {}" - raise ValueError(msg.format(max_samples)) - return int(round(n_samples * max_samples)) - - msg = "`max_samples` should be int or float, but got type '{}'" - raise TypeError(msg.format(type(max_samples))) - - -def _get_unsampled_ix(random_state, n_samples: int) -> np.array: - """ - future-proof get unsampled indices - """ - n_bootstrap = _get_n_samples_bootstrap(n_samples, n_samples) - random_instance = check_random_state(random_state) - sample_indices = random_instance.randint(0, n_samples, n_bootstrap) - sample_counts = np.bincount(sample_indices, minlength=n_samples) - - return np.arange(n_samples)[sample_counts == 0] - - -def _oob_classifier_accuracy(rf, X_train, y_train) -> float: - """ - Compute out-of-bag (OOB) accuracy for a scikit-learn forest classifier. 
- - https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/ensemble/forest.py#L425 - """ - X = X_train.values if isinstance(X_train, pd.DataFrame) else X_train - y = y_train.values if isinstance(y_train, pd.Series) else y_train - - n_samples = len(X) - n_classes = len(np.unique(y)) - predictions = np.zeros((n_samples, n_classes)) - for tree in rf.estimators_: - unsampled_indices = _get_unsampled_ix(tree.random_state, n_samples) - tree_preds = tree.predict_proba(X[unsampled_indices, :]) - predictions[unsampled_indices] += tree_preds - - predicted_class_indexes = np.argmax(predictions, axis=1) - predicted_classes = [rf.classes_[i] for i in predicted_class_indexes] - - oob_score = np.mean(y == predicted_classes) - - return oob_score - - -def permutation_importance( - context: MLClientCtx, - model: DataItem, - dataset: DataItem, - labels: str, - figsz=(10, 5), - plots_dest: str = "plots", - fitype: str = "permute", -) -> pd.DataFrame: - """calculate change in metric - - type 'permute' uses a pre-estimated model - type 'dropcol' uses a re-estimates model - - :param context: the function's execution context - :param model: a trained model - :param dataset: features and ground truths, regression targets - :param labels name of the ground truths column - :param figsz: matplotlib figure size - :param plots_dest: path within artifact store - : - """ - model_file, model_data, _ = get_model(model.url, suffix=".pkl") - model = load(open(str(model_file), "rb")) - - X = dataset.as_df() - y = X.pop(labels) - header = X.columns - - metric = _oob_classifier_accuracy - - baseline = metric(model, X, y) - - imp = [] - for col in X.columns: - if fitype is "permute": - save = X[col].copy() - X[col] = np.random.permutation(X[col]) - m = metric(model, X, y) - X[col] = save - imp.append(baseline - m) - elif fitype is "dropcol": - X_ = X.drop(col, axis=1) - model_ = clone(model) - #model_.random_state = random_state - model_.fit(X_, y) - o = model_.oob_score_ - 
imp.append(baseline - o) - else: - raise ValueError("unknown fitype, only 'permute' or 'dropcol' permitted") - - zipped = zip(imp, header) - feature_imp = pd.DataFrame(sorted(zipped), columns=["importance", "feature"]) - feature_imp.sort_values(by="importance", ascending=False, inplace=True) - - plt.clf() - plt.figure(figsize=figsz) - sns.barplot(x="importance", y="feature", data=feature_imp) - plt.title(f"feature importances-{fitype}") - plt.tight_layout() - - context.log_artifact( - PlotArtifact(f"feature importances-{fitype}", body=plt.gcf()), - local_path=f"{plots_dest}/feature-permutations.html", - ) - context.log_dataset( - f"feature-importances-{fitype}-tbl", df=feature_imp, index=False - ) diff --git a/feature_perms/function.yaml b/feature_perms/function.yaml deleted file mode 100644 index 713981fdf..000000000 --- a/feature_perms/function.yaml +++ /dev/null @@ -1,63 +0,0 @@ -kind: job -metadata: - name: feature-perms - tag: '' - hash: 2e32234a73e2e48f029cf6c957b150ec2ffd4bc7 - project: '' - labels: - author: yjb - categories: - - data-analysis -spec: - command: '' - args: [] - image: mlrun/ml-models - env: [] - default_handler: permutation_importance - entry_points: - permutation_importance: - name: permutation_importance - doc: 'calculate change in metric - - - type ''permute'' uses a pre-estimated model - - type ''dropcol'' uses a re-estimates model' - parameters: - - name: context - type: MLClientCtx - doc: the function's execution context - default: '' - - name: model - type: DataItem - doc: a trained model - default: '' - - name: dataset - type: DataItem - doc: features and ground truths, regression targets - default: '' - - name: labels - type: str - default: '' - - name: figsz - doc: matplotlib figure size - default: - - 10 - - 5 - - name: plots_dest - type: str - doc: path within artifact store - default: plots - - name: fitype - type: str - default: permute - outputs: - - default: '' - lineno: 93 - description: estimate feature importances using 
permutations - build: - functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IG51bXB5IGFzIG5wCmltcG9ydCBwYW5kYXMgYXMgcGQKaW1wb3J0IG51bWJlcnMKCmltcG9ydCBza2xlYXJuCmZyb20gc2tsZWFybi5iYXNlIGltcG9ydCBjbG9uZQpmcm9tIHNrbGVhcm4udXRpbHMgaW1wb3J0IGNoZWNrX3JhbmRvbV9zdGF0ZQoKaW1wb3J0IG1hdHBsb3RsaWIucHlwbG90IGFzIHBsdAppbXBvcnQgc2VhYm9ybiBhcyBzbnMKCmZyb20gY2xvdWRwaWNrbGUgaW1wb3J0IGxvYWQKCmZyb20gbWxydW4uZXhlY3V0aW9uIGltcG9ydCBNTENsaWVudEN0eApmcm9tIG1scnVuLmRhdGFzdG9yZSBpbXBvcnQgRGF0YUl0ZW0KZnJvbSBtbHJ1bi5hcnRpZmFjdHMgaW1wb3J0IGdldF9tb2RlbCwgUGxvdEFydGlmYWN0CmZyb20gdHlwaW5nIGltcG9ydCBVbmlvbiwgQ2FsbGFibGUsIExpc3QKCgpkZWYgX2dldF9uX3NhbXBsZXNfYm9vdHN0cmFwKG5fc2FtcGxlcywgbWF4X3NhbXBsZXMpIC0+IGludDoKICAgICIiImdldCB0aGUgbnVtYmVyIG9mIHNhbXBsZXMgaW4gYSBib290c3RyYXAgc2FtcGxlCgogICAgcmV0dXJucyB0aGUgdG90YWwgbnVtYmVyIG9mIHNhbXBsZXMgdG8gZHJhdyBmb3IgdGhlIGJvb3RzdHJhcCBzYW1wbGUKCiAgICBwcml2YXRlIGFwaSBpbiBza2xlYXJuID49IHYwLjI0LCB0YWtlbiBmcm9tIHNrbGVhcm4uZW5zZW1ibGUuX2ZvcmVzdC5weQoKICAgIDpwYXJhbSBuX3NhbXBsZXM6ICAgTnVtYmVyIG9mIHNhbXBsZXMgaW4gdGhlIGRhdGFzZXQuCiAgICA6cGFyYW0gbWF4X3NhbXBsZXM6CiAgICAgICAgVGhlIG1heGltdW0gbnVtYmVyIG9mIHNhbXBsZXMgdG8gZHJhdyBmcm9tIHRoZSB0b3RhbCBhdmFpbGFibGU6CiAgICAgICAgICAgIC0gaWYgZmxvYXQsIHRoaXMgaW5kaWNhdGVzIGEgZnJhY3Rpb24gb2YgdGhlIHRvdGFsIGFuZCBzaG91bGQgYmUKICAgICAgICAgICAgICB0aGUgaW50ZXJ2YWwgYCgwLCAxKWA7CiAgICAgICAgICAgIC0gaWYgaW50LCB0aGlzIGluZGljYXRlcyB0aGUgZXhhY3QgbnVtYmVyIG9mIHNhbXBsZXM7CiAgICAgICAgICAgIC0gaWYgTm9uZSwgdGhpcyBpbmRpY2F0ZXMgdGhlIHRvdGFsIG51bWJlciBvZiBzYW1wbGVzLgogICAgIiIiCiAgICBpZiBtYXhfc2FtcGxlcyBpcyBOb25lOgogICAgICAgIHJldHVybiBuX3NhbXBsZXMKCiAgICBpZiBpc2luc3RhbmNlKG1heF9zYW1wbGVzLCBudW1iZXJzLkludGVncmFsKToKICAgICAgICBpZiBub3QgKDEgPD0gbWF4X3NhbXBsZXMgPD0gbl9zYW1wbGVzKToKICAgICAgICAgICAgbXNnID0gImBtYXhfc2FtcGxlc2AgbXVzdCBiZSBpbiByYW5nZSAxIHRvIHt9IGJ1dCBnb3QgdmFsdWUge30iCiAgICAgICAgICAgIHJhaXNlIFZhbHVlRXJyb3IobXNnLmZvcm1hdChuX3NhbXBsZXMsIG1heF9zYW1wbGVzKSkKICAgICAgICByZXR1cm4gbWF4X3NhbXBsZXMKCiAgICBpZiBpc2luc3RhbmNlKG1h
eF9zYW1wbGVzLCBudW1iZXJzLlJlYWwpOgogICAgICAgIGlmIG5vdCAoMCA8IG1heF9zYW1wbGVzIDwgMSk6CiAgICAgICAgICAgIG1zZyA9ICJgbWF4X3NhbXBsZXNgIG11c3QgYmUgaW4gcmFuZ2UgKDAsIDEpIGJ1dCBnb3QgdmFsdWUge30iCiAgICAgICAgICAgIHJhaXNlIFZhbHVlRXJyb3IobXNnLmZvcm1hdChtYXhfc2FtcGxlcykpCiAgICAgICAgcmV0dXJuIGludChyb3VuZChuX3NhbXBsZXMgKiBtYXhfc2FtcGxlcykpCgogICAgbXNnID0gImBtYXhfc2FtcGxlc2Agc2hvdWxkIGJlIGludCBvciBmbG9hdCwgYnV0IGdvdCB0eXBlICd7fSciCiAgICByYWlzZSBUeXBlRXJyb3IobXNnLmZvcm1hdCh0eXBlKG1heF9zYW1wbGVzKSkpCgoKZGVmIF9nZXRfdW5zYW1wbGVkX2l4KHJhbmRvbV9zdGF0ZSwgbl9zYW1wbGVzOiBpbnQpIC0+IG5wLmFycmF5OgogICAgIiIiCiAgICBmdXR1cmUtcHJvb2YgZ2V0IHVuc2FtcGxlZCBpbmRpY2VzCiAgICAiIiIKICAgIG5fYm9vdHN0cmFwID0gX2dldF9uX3NhbXBsZXNfYm9vdHN0cmFwKG5fc2FtcGxlcywgbl9zYW1wbGVzKQogICAgcmFuZG9tX2luc3RhbmNlID0gY2hlY2tfcmFuZG9tX3N0YXRlKHJhbmRvbV9zdGF0ZSkKICAgIHNhbXBsZV9pbmRpY2VzID0gcmFuZG9tX2luc3RhbmNlLnJhbmRpbnQoMCwgbl9zYW1wbGVzLCBuX2Jvb3RzdHJhcCkKICAgIHNhbXBsZV9jb3VudHMgPSBucC5iaW5jb3VudChzYW1wbGVfaW5kaWNlcywgbWlubGVuZ3RoPW5fc2FtcGxlcykKCiAgICByZXR1cm4gbnAuYXJhbmdlKG5fc2FtcGxlcylbc2FtcGxlX2NvdW50cyA9PSAwXQoKCmRlZiBfb29iX2NsYXNzaWZpZXJfYWNjdXJhY3kocmYsIFhfdHJhaW4sIHlfdHJhaW4pIC0+IGZsb2F0OgogICAgIiIiCiAgICBDb21wdXRlIG91dC1vZi1iYWcgKE9PQikgYWNjdXJhY3kgZm9yIGEgc2Npa2l0LWxlYXJuIGZvcmVzdCBjbGFzc2lmaWVyLgoKICAgIGh0dHBzOi8vZ2l0aHViLmNvbS9zY2lraXQtbGVhcm4vc2Npa2l0LWxlYXJuL2Jsb2IvYTI0YzhiNDYvc2tsZWFybi9lbnNlbWJsZS9mb3Jlc3QucHkjTDQyNQogICAgIiIiCiAgICBYID0gWF90cmFpbi52YWx1ZXMgaWYgaXNpbnN0YW5jZShYX3RyYWluLCBwZC5EYXRhRnJhbWUpIGVsc2UgWF90cmFpbgogICAgeSA9IHlfdHJhaW4udmFsdWVzIGlmIGlzaW5zdGFuY2UoeV90cmFpbiwgcGQuU2VyaWVzKSBlbHNlIHlfdHJhaW4KCiAgICBuX3NhbXBsZXMgPSBsZW4oWCkKICAgIG5fY2xhc3NlcyA9IGxlbihucC51bmlxdWUoeSkpCiAgICBwcmVkaWN0aW9ucyA9IG5wLnplcm9zKChuX3NhbXBsZXMsIG5fY2xhc3NlcykpCiAgICBmb3IgdHJlZSBpbiByZi5lc3RpbWF0b3JzXzoKICAgICAgICB1bnNhbXBsZWRfaW5kaWNlcyA9IF9nZXRfdW5zYW1wbGVkX2l4KHRyZWUucmFuZG9tX3N0YXRlLCBuX3NhbXBsZXMpCiAgICAgICAgdHJlZV9wcmVkcyA9IHRyZWUucHJlZGljdF9wcm9iYShYW3Vuc2FtcGxlZF9pbmRpY2VzLCA6XSkKICAgICAgICBwcmVkaWN0aW9uc1t1bnNh
bXBsZWRfaW5kaWNlc10gKz0gdHJlZV9wcmVkcwoKICAgIHByZWRpY3RlZF9jbGFzc19pbmRleGVzID0gbnAuYXJnbWF4KHByZWRpY3Rpb25zLCBheGlzPTEpCiAgICBwcmVkaWN0ZWRfY2xhc3NlcyA9IFtyZi5jbGFzc2VzX1tpXSBmb3IgaSBpbiBwcmVkaWN0ZWRfY2xhc3NfaW5kZXhlc10KCiAgICBvb2Jfc2NvcmUgPSBucC5tZWFuKHkgPT0gcHJlZGljdGVkX2NsYXNzZXMpCgogICAgcmV0dXJuIG9vYl9zY29yZQoKCmRlZiBwZXJtdXRhdGlvbl9pbXBvcnRhbmNlKAogICAgY29udGV4dDogTUxDbGllbnRDdHgsCiAgICBtb2RlbDogRGF0YUl0ZW0sCiAgICBkYXRhc2V0OiBEYXRhSXRlbSwKICAgIGxhYmVsczogc3RyLAogICAgZmlnc3o9KDEwLCA1KSwKICAgIHBsb3RzX2Rlc3Q6IHN0ciA9ICJwbG90cyIsCiAgICBmaXR5cGU6IHN0ciA9ICJwZXJtdXRlIiwKKSAtPiBwZC5EYXRhRnJhbWU6CiAgICAiIiJjYWxjdWxhdGUgY2hhbmdlIGluIG1ldHJpYwoKICAgIHR5cGUgJ3Blcm11dGUnIHVzZXMgYSBwcmUtZXN0aW1hdGVkIG1vZGVsCiAgICB0eXBlICdkcm9wY29sJyB1c2VzIGEgcmUtZXN0aW1hdGVzIG1vZGVsCgogICAgOnBhcmFtIGNvbnRleHQ6ICAgICB0aGUgZnVuY3Rpb24ncyBleGVjdXRpb24gY29udGV4dAogICAgOnBhcmFtIG1vZGVsOiAgICAgICBhIHRyYWluZWQgbW9kZWwKICAgIDpwYXJhbSBkYXRhc2V0OiAgICAgZmVhdHVyZXMgYW5kIGdyb3VuZCB0cnV0aHMsIHJlZ3Jlc3Npb24gdGFyZ2V0cwogICAgOnBhcmFtIGxhYmVscyAgICAgICBuYW1lIG9mIHRoZSBncm91bmQgdHJ1dGhzIGNvbHVtbgogICAgOnBhcmFtIGZpZ3N6OiAgICAgICBtYXRwbG90bGliIGZpZ3VyZSBzaXplCiAgICA6cGFyYW0gcGxvdHNfZGVzdDogIHBhdGggd2l0aGluIGFydGlmYWN0IHN0b3JlCiAgICA6CiAgICAiIiIKICAgIG1vZGVsX2ZpbGUsIG1vZGVsX2RhdGEsIF8gPSBnZXRfbW9kZWwobW9kZWwudXJsLCBzdWZmaXg9Ii5wa2wiKQogICAgbW9kZWwgPSBsb2FkKG9wZW4oc3RyKG1vZGVsX2ZpbGUpLCAicmIiKSkKCiAgICBYID0gZGF0YXNldC5hc19kZigpCiAgICB5ID0gWC5wb3AobGFiZWxzKQogICAgaGVhZGVyID0gWC5jb2x1bW5zCgogICAgbWV0cmljID0gX29vYl9jbGFzc2lmaWVyX2FjY3VyYWN5CgogICAgYmFzZWxpbmUgPSBtZXRyaWMobW9kZWwsIFgsIHkpCgogICAgaW1wID0gW10KICAgIGZvciBjb2wgaW4gWC5jb2x1bW5zOgogICAgICAgIGlmIGZpdHlwZSBpcyAicGVybXV0ZSI6CiAgICAgICAgICAgIHNhdmUgPSBYW2NvbF0uY29weSgpCiAgICAgICAgICAgIFhbY29sXSA9IG5wLnJhbmRvbS5wZXJtdXRhdGlvbihYW2NvbF0pCiAgICAgICAgICAgIG0gPSBtZXRyaWMobW9kZWwsIFgsIHkpCiAgICAgICAgICAgIFhbY29sXSA9IHNhdmUKICAgICAgICAgICAgaW1wLmFwcGVuZChiYXNlbGluZSAtIG0pCiAgICAgICAgZWxpZiBmaXR5cGUgaXMgImRyb3Bjb2wiOgogICAgICAgICAgICBYXyA9IFguZHJvcChjb2wsIGF4aXM9MSkK
ICAgICAgICAgICAgbW9kZWxfID0gY2xvbmUobW9kZWwpCiAgICAgICAgICAgICNtb2RlbF8ucmFuZG9tX3N0YXRlID0gcmFuZG9tX3N0YXRlCiAgICAgICAgICAgIG1vZGVsXy5maXQoWF8sIHkpCiAgICAgICAgICAgIG8gPSBtb2RlbF8ub29iX3Njb3JlXwogICAgICAgICAgICBpbXAuYXBwZW5kKGJhc2VsaW5lIC0gbykKICAgICAgICBlbHNlOgogICAgICAgICAgICByYWlzZSBWYWx1ZUVycm9yKCJ1bmtub3duIGZpdHlwZSwgb25seSAncGVybXV0ZScgb3IgJ2Ryb3Bjb2wnIHBlcm1pdHRlZCIpCgogICAgemlwcGVkID0gemlwKGltcCwgaGVhZGVyKQogICAgZmVhdHVyZV9pbXAgPSBwZC5EYXRhRnJhbWUoc29ydGVkKHppcHBlZCksIGNvbHVtbnM9WyJpbXBvcnRhbmNlIiwgImZlYXR1cmUiXSkKICAgIGZlYXR1cmVfaW1wLnNvcnRfdmFsdWVzKGJ5PSJpbXBvcnRhbmNlIiwgYXNjZW5kaW5nPUZhbHNlLCBpbnBsYWNlPVRydWUpCgogICAgcGx0LmNsZigpCiAgICBwbHQuZmlndXJlKGZpZ3NpemU9Zmlnc3opCiAgICBzbnMuYmFycGxvdCh4PSJpbXBvcnRhbmNlIiwgeT0iZmVhdHVyZSIsIGRhdGE9ZmVhdHVyZV9pbXApCiAgICBwbHQudGl0bGUoZiJmZWF0dXJlIGltcG9ydGFuY2VzLXtmaXR5cGV9IikKICAgIHBsdC50aWdodF9sYXlvdXQoKQoKICAgIGNvbnRleHQubG9nX2FydGlmYWN0KAogICAgICAgIFBsb3RBcnRpZmFjdChmImZlYXR1cmUgaW1wb3J0YW5jZXMte2ZpdHlwZX0iLCBib2R5PXBsdC5nY2YoKSksCiAgICAgICAgbG9jYWxfcGF0aD1mIntwbG90c19kZXN0fS9mZWF0dXJlLXBlcm11dGF0aW9ucy5odG1sIiwKICAgICkKICAgIGNvbnRleHQubG9nX2RhdGFzZXQoCiAgICAgICAgZiJmZWF0dXJlLWltcG9ydGFuY2VzLXtmaXR5cGV9LXRibCIsIGRmPWZlYXR1cmVfaW1wLCBpbmRleD1GYWxzZQogICAgKQo= - commands: [] - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/feature_perms/feature_perms.py - affinity: null -verbose: false diff --git a/feature_perms/item.yaml b/feature_perms/item.yaml deleted file mode 100644 index bd909d3ee..000000000 --- a/feature_perms/item.yaml +++ /dev/null @@ -1,25 +0,0 @@ -apiVersion: v1 -categories: -- data-analysis -description: estimate feature importances using permutations -doc: '' -example: feature_perms.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: yjb -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: feature-perms -platformVersion: 3.5.0 -spec: - filename: feature_perms.py - handler: 
permutation_importance - image: mlrun/ml-models - kind: job - requirements: [] -url: '' -version: 1.1.0 -test_valid : False diff --git a/feature_perms/requirements.txt b/feature_perms/requirements.txt deleted file mode 100644 index 70a079c7d..000000000 --- a/feature_perms/requirements.txt +++ /dev/null @@ -1,5 +0,0 @@ -scikit-learn -matplotlib -seaborn -scikit-plot - diff --git a/feature_perms/test_feature_perms.py b/feature_perms/test_feature_perms.py deleted file mode 100644 index a59891ea8..000000000 --- a/feature_perms/test_feature_perms.py +++ /dev/null @@ -1,134 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -from mlrun import code_to_function, import_function -from pathlib import Path -import os - -ARTIFACTS_PATH = 'artifacts' -DATA_URL = "https://raw.githubusercontent.com/parrt/random-forest-importances/master/notebooks/data/rent.csv" -FEATURE_OUTPUT = "feature-importances-permute-tbl" - - -def arc_to_parquet(): - from mlrun import import_function - - archive_func = import_function('hub://arc_to_parquet') - - archive_run = archive_func.run( - handler="arc_to_parquet", - params={"key": "rent", "stats": True, "file_ext": "csv"}, - inputs={"archive_url": DATA_URL}, - artifact_path=os.getcwd() + '/artifacts', - local=True, - ) - - return archive_run.artifact('rent').url - - -def sklearn_classifier(run): - cwd = os.getcwd() - file_path = str(Path(cwd).parent.absolute()) + "/sklearn_classifier/sklearn_classifier.py" - fn = code_to_function( - name='test_sklearn_classifier', - filename=file_path, - handler="train_model", - kind="local", - ) - - fn.spec.command = file_path - fn.run( - params={ - "sample": -5_000, # 5k random rows, - "model_pkg_class": "sklearn.ensemble.RandomForestClassifier", - "label_column": "interest_level", - "CLASS_n_estimators": 100, - "CLASS_min_samples_leaf": 1, - "CLASS_n_jobs": -1, - "CLASS_oob_score": True, - }, - handler="train_model", - inputs={"dataset": run.outputs["rent"]}, - artifact_path='artifacts', - ) - - -def train_model(data): - from mlrun import import_function - - train = import_function('hub://sklearn_classifier') - - train_run = train.run( - inputs={"dataset": data}, - params={ - "sample": -5_000, # 5k random rows, - "model_pkg_class": "sklearn.ensemble.RandomForestClassifier", - "label_column": "interest_level", - "CLASS_n_estimators": 100, - "CLASS_min_samples_leaf": 1, - "CLASS_n_jobs": -1, - "CLASS_oob_score": True, - }, - local=True - ) - - return train_run.artifact('model').url - - -def test_feature_selection_run_local(): - data = arc_to_parquet() - model = train_model(data) - labels = "interest_level" - fn = 
code_to_function( - name='test_run_local_feature_perms', - filename="feature_perms.py", - handler="permutation_importance", - kind="local", - ) - fn.spec.command = "feature_perms.py" - - run = fn.run( - params={ - "labels": labels, - "plots_dest": "plots", - }, - inputs={ - "model": model, - "dataset": data, - }, - artifact_path='artifacts', - ) - - assert run.artifact(FEATURE_OUTPUT).get() - - -def test_feature_perms_import_function(): - data = arc_to_parquet() - model = train_model(data) - labels = "interest_level" - fn = import_function("function.yaml") - - run = fn.run( - params={ - "labels": labels, - "plots_dest": "plots" - }, - inputs={ - "model": model, - "dataset": data}, - artifact_path=os.getcwd() + '/artifacts', - local=True, - ) - - assert run.artifact(FEATURE_OUTPUT).get() diff --git a/feature_selection/function.yaml b/feature_selection/function.yaml index aca1f0c0c..44cdd9894 100644 --- a/feature_selection/function.yaml +++ b/feature_selection/function.yaml @@ -1,51 +1,35 @@ -kind: job metadata: name: feature-selection tag: '' - hash: 5815ef4c27a1f08c9d8d3f88ad6bd4c9cb5c7f4a - project: '' - labels: - author: orz categories: - data-preparation - machine-learning +kind: job spec: - command: '' - args: [] - image: mlrun/mlrun - build: - functionSourceCode:  - commands: [] - code_origin: '' - origin_filename: '' - requirements: [] entry_points: show_values_on_bars: - name: show_values_on_bars doc: '' + has_kwargs: false parameters: - name: axs - name: h_v default: v - name: space default: 0.4 - outputs: [] - lineno: 43 + lineno: 54 has_varargs: false - has_kwargs: false + name: show_values_on_bars plot_stat: - name: plot_stat doc: '' + has_kwargs: false parameters: - name: context - name: stat_name - name: stat_df - outputs: [] - lineno: 65 + lineno: 76 has_varargs: false - has_kwargs: false + name: plot_stat feature_selection: - name: feature_selection doc: 'Applies selected feature selection statistical functions or models on our ''df_artifact''. 
@@ -53,6 +37,7 @@ spec: Each statistical function or model will vote for it''s best K selected features. If a feature has >= ''min_votes'' votes, it will be selected.' + has_kwargs: false parameters: - name: context doc: the function context. @@ -99,17 +84,20 @@ spec: type: bool doc: skips datatypes that are neither float nor int within the feature vector. default: false - outputs: [] - lineno: 80 + - name: is_feature_vector + type: bool + doc: bool stating if the data is passed as a feature vector. + default: false + lineno: 106 has_varargs: false - has_kwargs: false - description: Select features through multiple Statistical and Model filters - default_handler: feature_selection + name: feature_selection disable_auto_mount: false - env: [] - priority_class_name: '' - preemption_mode: prevent - affinity: null - tolerations: null - security_context: {} + command: '' + build: + origin_filename: '' + functionSourceCode:  + code_origin: '' + default_handler: feature_selection + image: mlrun/mlrun + description: Select features through multiple Statistical and Model filters verbose: false diff --git a/feature_selection/item.yaml b/feature_selection/item.yaml index ced618e00..99675b4e8 100644 --- a/feature_selection/item.yaml +++ b/feature_selection/item.yaml @@ -12,9 +12,9 @@ labels: author: orz maintainers: [] marketplaceType: '' -mlrunVersion: 1.6.3 +mlrunVersion: 1.6.4 name: feature-selection -platformVersion: 3.5.0 +platformVersion: 3.6.0 spec: filename: feature_selection.py handler: feature_selection diff --git a/feature_selection/requirements.txt b/feature_selection/requirements.txt index a13fc8ce6..e4d79d180 100644 --- a/feature_selection/requirements.txt +++ b/feature_selection/requirements.txt @@ -1,3 +1,3 @@ -scikit-learn~=1.0.2 +scikit-learn scikit-plot plotly~=5.4.0 diff --git a/feature_selection/test_feature_selection.py b/feature_selection/test_feature_selection.py index 3032b3193..6ae949aab 100644 --- a/feature_selection/test_feature_selection.py +++ 
b/feature_selection/test_feature_selection.py @@ -66,3 +66,4 @@ def test_run_local_feature_selection(): ] ) _delete_outputs({ARTIFACTS_PATH, RUNS_PATH, SCHEDULES_PATH}) + assert run.outputs['feature_scores'] and run.outputs['selected_features'] diff --git a/get_offline_features/function.yaml b/get_offline_features/function.yaml deleted file mode 100644 index 5a1780d98..000000000 --- a/get_offline_features/function.yaml +++ /dev/null @@ -1,127 +0,0 @@ -kind: job -metadata: - name: get-offline-features - tag: '' - hash: 22cc15eacc16e61f2fc1a9579fabde9ef7a2fce2 - project: '' - labels: - author: yonish - categories: - - data-preparation - - data-analysis - - feature-store -spec: - command: '' - args: [] - image: mlrun/mlrun - build: - functionSourceCode:  - commands: [] - code_origin: '' - origin_filename: '' - requirements: [] - entry_points: - get_offline_features: - name: get_offline_features - doc: 'retrieve offline feature vector results - - - specify a feature vector object/uri and retrieve the desired features, their - metadata - - and statistics. returns :py:class:`~mlrun.feature_store.OfflineVectorResponse`, - - results can be returned as a dataframe or written to a target. - - If feature vector does not exist, a new one will be created and saved with - the given features. - - - The start_time and end_time attributes allow filtering the data to a given - time range, they accept - - string values or pandas `Timestamp` objects, string values can also be relative, - for example: - - "now", "now - 1d2h", "now+5m", where a valid pandas Timedelta string follows - the verb "now", - - for time alignment you can use the verb "floor" e.g. 
"now -1d floor 1H" will - align the time to the last hour - - (the floor string is passed to pandas.Timestamp.floor(), can use D, H, T, - S for day, hour, min, sec alignment)' - parameters: - - name: context - type: MLClientCtx - doc: MLRun context - - name: feature_vector - type: str - doc: feature vector uri - - name: features - type: Union[List[str], ] - doc: Relevant only if feature_vector not exist. list of feature to collect - to this vector format [/]. [as - ] - default: null - - name: label_feature - type: str - doc: feature name to be used as label data - default: null - - name: description - type: str - doc: text description of the vector - default: null - - name: entity_rows - type: DataItem - doc: URI of the data entity rows to join with - default: null - - name: entity_timestamp_column - type: str - doc: timestamp column name in the entity rows dataframe - default: null - - name: target - type: Union[str, Dict] - doc: where to write the results to - default: null - - name: run_config - type: Union[str, Dict] - doc: function and/or run configuration see :py:class:`~mlrun.feature_store.RunConfig` - default: null - - name: drop_columns - type: List[str] - doc: list of columns to drop from the final result - default: null - - name: start_time - type: str - doc: datetime, low limit of time needed to be filtered. Optional entity_timestamp_column - must be passed when using time filtering - default: null - - name: end_time - type: str - doc: datetime, high limit of time needed to be filtered. Optional entity_timestamp_column - must be passed when using time filtering - default: null - - name: with_indexes - type: bool - doc: return vector with index columns (default False) - default: false - - name: update_stats - type: bool - doc: update features statistics from the requested feature sets on the vector. - Default is False. 
- default: false - outputs: [] - lineno: 28 - has_varargs: false - has_kwargs: false - description: retrieve offline feature vector results - default_handler: get_offline_features - disable_auto_mount: false - env: [] - priority_class_name: '' - preemption_mode: prevent - affinity: null - tolerations: null - security_context: {} -verbose: false diff --git a/get_offline_features/get_offline_features.ipynb b/get_offline_features/get_offline_features.ipynb deleted file mode 100644 index d97402a2e..000000000 --- a/get_offline_features/get_offline_features.ipynb +++ /dev/null @@ -1,1536 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# `get_offline_features()` from MLRun FeatureStore\n", - "\n", - "This MLRun Function has the following `params`:\n", - "\n", - "- `feature_vector: str`, feature vector uri.\n", - "\n", - "- `entity_rows: DataItem` = None, URI of the data entity rows to join with.\n", - "\n", - "- `entity_timestamp_column: str = None`, timestamp column name in the entity rows dataframe.\n", - "\n", - "- `target: Union[str, Dict] = None`, where to write the results to.\n", - "\n", - "- `run_config: Union[str, Dict] = None`, function and/or run configuration see :py:class:`~mlrun.feature_store.RunConfig`.\n", - "\n", - "- `drop_columns: List[str] = None`, list of columns to drop from the final result. \n", - "\n", - "- `start_time: str = None`, datetime, low limit of time needed to be filtered. Optional. `entity_timestamp_column` must be passed when using time filtering.\n", - "\n", - "- `end_time: str = None`, datetime, high limit of time needed to be filtered. Optional. `entity_timestamp_column` must be passed when using time filtering.\n", - "\n", - "- `with_indexes: bool = False`, return vector with index columns (default False).\n", - "\n", - "- `update_stats: bool = False`, update features statistics from the requested feature sets on the vector. Default is False." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import mlrun\n", - "import mlrun.feature_store as fstore\n", - "from mlrun.datastore.targets import CSVTarget\n", - "from mlrun.datastore.sources import CSVSource\n", - "from mlrun.run import get_dataitem\n", - "from mlrun.feature_store.steps import *\n", - "from mlrun.features import MinMaxValidator\n", - "import pandas as pd\n", - "import datetime\n", - "import os" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2022-01-31 14:41:48,288 [info] loaded project get-offline-features from MLRun DB\n" - ] - } - ], - "source": [ - "ABS_PATH = 'v3io://users/{}/get_offline_features/'.format(os.environ['V3IO_USERNAME'])\n", - "# Initialize the MLRun project object\n", - "project = mlrun.get_or_create_project('get-offline-features', context=\"./\", user_project=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Generating the Same FeatureSets and FeatureVecotrs Based on the Stocks Example" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Create Sample Data For Demo" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "quotes = pd.DataFrame(\n", - " {\n", - " \"time\": [\n", - " pd.Timestamp(\"2016-05-25 13:30:00.023\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.023\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.030\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.041\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.048\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.049\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.072\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.075\")\n", - " ],\n", - " \"ticker\": [\n", - " \"GOOG\",\n", - " \"MSFT\",\n", - " \"MSFT\",\n", - " \"MSFT\",\n", - " \"GOOG\",\n", - " \"AAPL\",\n", - " \"GOOG\",\n", - " 
\"MSFT\"\n", - " ],\n", - " \"bid\": [720.50, 51.95, 51.97, 51.99, 720.50, 97.99, 720.50, 52.01],\n", - " \"ask\": [720.93, 51.96, 51.98, 52.00, 720.93, 98.01, 720.88, 52.03]\n", - " }\n", - ")\n", - "\n", - "trades = pd.DataFrame(\n", - " {\n", - " \"time\": [\n", - " pd.Timestamp(\"2016-05-25 13:30:00.023\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.038\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.048\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.048\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.048\")\n", - " ],\n", - " \"ticker\": [\"MSFT\", \"MSFT\", \"GOOG\", \"GOOG\", \"AAPL\"],\n", - " \"price\": [51.95, 51.95, 720.77, 720.92, 98.0],\n", - " \"quantity\": [75, 155, 100, 100, 100]\n", - " }\n", - ")\n", - "\n", - "stocks = pd.DataFrame(\n", - " {\n", - " \"ticker\": [\"MSFT\", \"GOOG\", \"AAPL\"],\n", - " \"name\": [\"Microsoft Corporation\", \"Alphabet Inc\", \"Apple Inc\"],\n", - " \"exchange\": [\"NASDAQ\", \"NASDAQ\", \"NASDAQ\"]\n", - " }\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "def move_date(df, col):\n", - " max_date = df[col].max()\n", - " now_date = datetime.datetime.now()\n", - " delta = now_date - max_date \n", - " df[col] = df[col] + delta \n", - " return df\n", - "\n", - "quotes = move_date(quotes, \"time\")\n", - "trades = move_date(trades, \"time\")\n", - "trades.to_csv('trades.csv', index=False)\n", - "data_uri = os.path.join(ABS_PATH, 'trades.csv')" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " time ticker bid ask\n", - "0 2022-01-31 14:41:48.260566 GOOG 720.50 720.93\n", - "1 2022-01-31 14:41:48.260566 MSFT 51.95 51.96\n", - "2 2022-01-31 14:41:48.267566 MSFT 51.97 51.98\n", - "3 2022-01-31 14:41:48.278566 MSFT 51.99 52.00\n", - "4 2022-01-31 14:41:48.285566 GOOG 720.50 720.93\n", - "5 2022-01-31 14:41:48.286566 AAPL 97.99 98.01\n", - "6 2022-01-31 14:41:48.309566 GOOG 720.50 720.88\n", - "7 2022-01-31 14:41:48.312566 MSFT 52.01 52.03" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "quotes" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " time ticker price quantity\n", - "0 2022-01-31 14:41:48.288476 MSFT 51.95 75\n", - "1 2022-01-31 14:41:48.303476 MSFT 51.95 155\n", - "2 2022-01-31 14:41:48.313476 GOOG 720.77 100\n", - "3 2022-01-31 14:41:48.313476 GOOG 720.92 100\n", - "4 2022-01-31 14:41:48.313476 AAPL 98.00 100" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "trades" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
tickernameexchange
0MSFTMicrosoft CorporationNASDAQ
1GOOGAlphabet IncNASDAQ
2AAPLApple IncNASDAQ
\n", - "
" - ], - "text/plain": [ - " ticker name exchange\n", - "0 MSFT Microsoft Corporation NASDAQ\n", - "1 GOOG Alphabet Inc NASDAQ\n", - "2 AAPL Apple Inc NASDAQ" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "stocks" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Build & Ingest Simple Feature Set (stocks)" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
nameexchange
ticker
MSFTMicrosoft CorporationNASDAQ
GOOGAlphabet IncNASDAQ
AAPLApple IncNASDAQ
\n", - "
" - ], - "text/plain": [ - " name exchange\n", - "ticker \n", - "MSFT Microsoft Corporation NASDAQ\n", - "GOOG Alphabet Inc NASDAQ\n", - "AAPL Apple Inc NASDAQ" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# add feature set without time column (stock ticker metadata) \n", - "stocks_set = fstore.FeatureSet(\"stocks\", entities=[fstore.Entity(\"ticker\")])\n", - "fstore.ingest(stocks_set, stocks, infer_options=fstore.InferOptions.default())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Build Advanced feature set - with feature engineering pipeline" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "quotes_set = fstore.FeatureSet(\"stock-quotes\", entities=[fstore.Entity(\"ticker\")])" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "class MyMap(MapClass):\n", - " def __init__(self, multiplier=1, **kwargs):\n", - " super().__init__(**kwargs)\n", - " self._multiplier = multiplier\n", - "\n", - " def do(self, event):\n", - " event[\"multi\"] = event[\"bid\"] * self._multiplier\n", - " return event" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "image/svg+xml": [ - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "mlrun-flow\n", - "\n", - "\n", - "\n", - "_start\n", - "\n", - "start\n", - "\n", - "\n", - "\n", - "MyMap\n", - "\n", - "MyMap\n", - "\n", - "\n", - "\n", - "_start->MyMap\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "storey.Extend\n", - "\n", - "storey.Extend\n", - "\n", - "\n", - "\n", - "MyMap->storey.Extend\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "filter\n", - "\n", - "filter\n", - "\n", - "\n", - "\n", - "storey.Extend->filter\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "FeaturesetValidator\n", - "\n", - "FeaturesetValidator\n", - "\n", - "\n", - "\n", - 
"filter->FeaturesetValidator\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "Aggregates\n", - "\n", - "Aggregates\n", - "\n", - "\n", - "\n", - "FeaturesetValidator->Aggregates\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "parquet\n", - "\n", - "\n", - "parquet\n", - "\n", - "\n", - "\n", - "Aggregates->parquet\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "nosql\n", - "\n", - "\n", - "nosql\n", - "\n", - "\n", - "\n", - "Aggregates->nosql\n", - "\n", - "\n", - "\n", - "\n", - "\n" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "quotes_set.graph.to(\"MyMap\", multiplier=3)\\\n", - " .to(\"storey.Extend\", _fn=\"({'extra': event['bid'] * 77})\")\\\n", - " .to(\"storey.Filter\", \"filter\", _fn=\"(event['bid'] > 51.92)\")\\\n", - " .to(FeaturesetValidator())\n", - "\n", - "quotes_set.add_aggregation(\"ask\", [\"sum\", \"max\"], \"1h\", \"10m\", name=\"asks1\")\n", - "quotes_set.add_aggregation(\"ask\", [\"sum\", \"max\"], \"5h\", \"10m\", name=\"asks5\")\n", - "quotes_set.add_aggregation(\"bid\", [\"min\", \"max\"], \"1h\", \"10m\", name=\"bids\")\n", - "\n", - "# add feature validation policy\n", - "quotes_set[\"bid\"] = fstore.Feature(validator=MinMaxValidator(min=52, severity=\"info\"))\n", - "\n", - "# add default target definitions and plot\n", - "quotes_set.set_targets()\n", - "quotes_set.plot(rankdir=\"LR\", with_targets=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Ingest Data Into Offline And Online Stores" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 14:41:51.377248+00:00 args={'min': 52, 'value': 51.95}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 14:41:51.377927+00:00 args={'min': 52, 'value': 51.97}\n", - "info! 
bid value is smaller than min, key=['MSFT'] time=2022-01-31 14:41:51.378103+00:00 args={'min': 52, 'value': 51.99}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 14:41:51.578640+00:00 args={'min': 52, 'value': 51.95}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 14:41:51.581692+00:00 args={'min': 52, 'value': 51.97}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 14:41:51.584351+00:00 args={'min': 52, 'value': 51.99}\n" - ] - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
asks1_max_1hasks1_sum_1hasks5_max_5hasks5_sum_5hbids_max_1hbids_min_1htimebidaskmultiextra
ticker
GOOG720.93720.93720.93720.93720.50720.502022-01-31 14:41:48.260566720.50720.932161.5055478.50
MSFT51.9651.9651.9651.9651.9551.952022-01-31 14:41:48.26056651.9551.96155.854000.15
MSFT51.98103.9451.98103.9451.9751.952022-01-31 14:41:48.26756651.9751.98155.914001.69
MSFT52.00155.9452.00155.9451.9951.952022-01-31 14:41:48.27856651.9952.00155.974003.23
GOOG720.931441.86720.931441.86720.50720.502022-01-31 14:41:48.285566720.50720.932161.5055478.50
AAPL98.0198.0198.0198.0197.9997.992022-01-31 14:41:48.28656697.9998.01293.977545.23
GOOG720.932162.74720.932162.74720.50720.502022-01-31 14:41:48.309566720.50720.882161.5055478.50
MSFT52.03207.9752.03207.9752.0151.952022-01-31 14:41:48.31256652.0152.03156.034004.77
\n", - "
" - ], - "text/plain": [ - " asks1_max_1h asks1_sum_1h asks5_max_5h asks5_sum_5h bids_max_1h \\\n", - "ticker \n", - "GOOG 720.93 720.93 720.93 720.93 720.50 \n", - "MSFT 51.96 51.96 51.96 51.96 51.95 \n", - "MSFT 51.98 103.94 51.98 103.94 51.97 \n", - "MSFT 52.00 155.94 52.00 155.94 51.99 \n", - "GOOG 720.93 1441.86 720.93 1441.86 720.50 \n", - "AAPL 98.01 98.01 98.01 98.01 97.99 \n", - "GOOG 720.93 2162.74 720.93 2162.74 720.50 \n", - "MSFT 52.03 207.97 52.03 207.97 52.01 \n", - "\n", - " bids_min_1h time bid ask multi \\\n", - "ticker \n", - "GOOG 720.50 2022-01-31 14:41:48.260566 720.50 720.93 2161.50 \n", - "MSFT 51.95 2022-01-31 14:41:48.260566 51.95 51.96 155.85 \n", - "MSFT 51.95 2022-01-31 14:41:48.267566 51.97 51.98 155.91 \n", - "MSFT 51.95 2022-01-31 14:41:48.278566 51.99 52.00 155.97 \n", - "GOOG 720.50 2022-01-31 14:41:48.285566 720.50 720.93 2161.50 \n", - "AAPL 97.99 2022-01-31 14:41:48.286566 97.99 98.01 293.97 \n", - "GOOG 720.50 2022-01-31 14:41:48.309566 720.50 720.88 2161.50 \n", - "MSFT 51.95 2022-01-31 14:41:48.312566 52.01 52.03 156.03 \n", - "\n", - " extra \n", - "ticker \n", - "GOOG 55478.50 \n", - "MSFT 4000.15 \n", - "MSFT 4001.69 \n", - "MSFT 4003.23 \n", - "GOOG 55478.50 \n", - "AAPL 7545.23 \n", - "GOOG 55478.50 \n", - "MSFT 4004.77 " - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# save ingest data and print the FeatureSet spec\n", - "fstore.ingest(quotes_set, quotes)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Get an Offline Feature Vector" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "features = [\n", - " \"stock-quotes.multi\",\n", - " \"stock-quotes.asks5_sum_5h as total_ask\",\n", - " \"stock-quotes.bids_min_1h\",\n", - " \"stock-quotes.bids_max_1h\",\n", - " \"stocks.*\",\n", - "]\n", - "\n", - "vector = fstore.FeatureVector(\"stocks-vec\", features)\n", - 
"vector.save()" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "target_dict = CSVTarget('mycsv',path=os.path.join(ABS_PATH, 'my_csv.csv')).to_dict()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Using `get_offline_features()` " - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "get_offline_features_fn = mlrun.import_function('hub://get_offline_features:development')" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2022-01-31 14:41:52,066 [info] starting run get-offline-features-get_offline_features uid=956663b9a9ba448c9ea65e8e9245718e DB=http://mlrun-api:8080\n", - "> 2022-01-31 14:41:52,214 [info] Creating DataFrame from entity_rows = v3io://users/yonatan/get_offline_features/trades.csv\n", - "> 2022-01-31 14:41:52,292 [info] Preparing 'mycsv' target\n", - "> 2022-01-31 14:41:52,294 [info] getting offline features from the FeatureVector store://feature-vectors/get-offline-features-yonatan/stocks-vec\n", - "> 2022-01-31 14:41:52,708 [info] wrote target: {'name': 'mycsv', 'kind': 'csv', 'path': 'v3io://users/yonatan/get_offline_features/my_csv.csv', 'status': 'ready', 'updated': '2022-01-31T14:41:52.708534+00:00'}\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
projectuiditerstartstatenamelabelsinputsparametersresultsartifacts
get-offline-features-yonatan0Jan 31 14:41:52completedget-offline-features-get_offline_features
v3io_user=yonatan
kind=
owner=yonatan
host=jupyter-yoni-647b99c95d-w4jlc
entity_rows
feature_vector=store://feature-vectors/get-offline-features-yonatan/stocks-vec
target={'name': 'mycsv', 'kind': 'csv', 'path': 'v3io://users/yonatan/get_offline_features/my_csv.csv', 'partitioned': False}
entity_timestamp_column=time
target=v3io://users/yonatan/get_offline_features/my_csv.csv
feature_vector=store://feature-vectors/get-offline-features-yonatan/stocks-vec
\n", - "
\n", - "
\n", - "
\n", - " Title\n", - " ×\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2022-01-31 14:41:52,896 [info] run executed, status=completed\n" - ] - } - ], - "source": [ - "gof_run = get_offline_features_fn.run(\n", - " handler='get_offline_features',\n", - " inputs= {'entity_rows': data_uri},\n", - " params={'feature_vector': vector.uri,\n", - " 'target': target_dict,\n", - " 'entity_timestamp_column': \"time\",\n", - " },\n", - " local=True\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'store://feature-vectors/get-offline-features-yonatan/stocks-vec'" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "gof_run.outputs['feature_vector']" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
Unnamed: 0pricequantitymultitotal_askbids_min_1hbids_max_1hnameexchange
0051.9575155.8551.9651.9551.95Microsoft CorporationNASDAQ
1151.9575155.91103.9451.9551.97Microsoft CorporationNASDAQ
2251.9575155.97155.9451.9551.99Microsoft CorporationNASDAQ
3351.9575156.03207.9751.9552.01Microsoft CorporationNASDAQ
4451.95155155.8551.9651.9551.95Microsoft CorporationNASDAQ
5551.95155155.91103.9451.9551.97Microsoft CorporationNASDAQ
6651.95155155.97155.9451.9551.99Microsoft CorporationNASDAQ
7751.95155156.03207.9751.9552.01Microsoft CorporationNASDAQ
88720.771002161.50720.93720.50720.50Alphabet IncNASDAQ
99720.771002161.501441.86720.50720.50Alphabet IncNASDAQ
1010720.771002161.502162.74720.50720.50Alphabet IncNASDAQ
1111720.921002161.50720.93720.50720.50Alphabet IncNASDAQ
1212720.921002161.501441.86720.50720.50Alphabet IncNASDAQ
1313720.921002161.502162.74720.50720.50Alphabet IncNASDAQ
141498.00100293.9798.0197.9997.99Apple IncNASDAQ
\n", - "
" - ], - "text/plain": [ - " Unnamed: 0 price quantity multi total_ask bids_min_1h \\\n", - "0 0 51.95 75 155.85 51.96 51.95 \n", - "1 1 51.95 75 155.91 103.94 51.95 \n", - "2 2 51.95 75 155.97 155.94 51.95 \n", - "3 3 51.95 75 156.03 207.97 51.95 \n", - "4 4 51.95 155 155.85 51.96 51.95 \n", - "5 5 51.95 155 155.91 103.94 51.95 \n", - "6 6 51.95 155 155.97 155.94 51.95 \n", - "7 7 51.95 155 156.03 207.97 51.95 \n", - "8 8 720.77 100 2161.50 720.93 720.50 \n", - "9 9 720.77 100 2161.50 1441.86 720.50 \n", - "10 10 720.77 100 2161.50 2162.74 720.50 \n", - "11 11 720.92 100 2161.50 720.93 720.50 \n", - "12 12 720.92 100 2161.50 1441.86 720.50 \n", - "13 13 720.92 100 2161.50 2162.74 720.50 \n", - "14 14 98.00 100 293.97 98.01 97.99 \n", - "\n", - " bids_max_1h name exchange \n", - "0 51.95 Microsoft Corporation NASDAQ \n", - "1 51.97 Microsoft Corporation NASDAQ \n", - "2 51.99 Microsoft Corporation NASDAQ \n", - "3 52.01 Microsoft Corporation NASDAQ \n", - "4 51.95 Microsoft Corporation NASDAQ \n", - "5 51.97 Microsoft Corporation NASDAQ \n", - "6 51.99 Microsoft Corporation NASDAQ \n", - "7 52.01 Microsoft Corporation NASDAQ \n", - "8 720.50 Alphabet Inc NASDAQ \n", - "9 720.50 Alphabet Inc NASDAQ \n", - "10 720.50 Alphabet Inc NASDAQ \n", - "11 720.50 Alphabet Inc NASDAQ \n", - "12 720.50 Alphabet Inc NASDAQ \n", - "13 720.50 Alphabet Inc NASDAQ \n", - "14 97.99 Apple Inc NASDAQ " - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "mlrun.get_dataitem(gof_run.outputs['feature_vector']).as_df()" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - 
"nbformat_minor": 4 -} diff --git a/get_offline_features/get_offline_features.py b/get_offline_features/get_offline_features.py deleted file mode 100644 index 1a8b8ff03..000000000 --- a/get_offline_features/get_offline_features.py +++ /dev/null @@ -1,143 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -from typing import Union, List, Dict - -import mlrun -import mlrun.feature_store as fs -from mlrun.datastore.store_resources import is_store_uri, parse_store_uri -from mlrun.datastore.targets import get_target_driver, kind_to_driver -from mlrun.datastore.base import DataItem -from mlrun.execution import MLClientCtx -from mlrun.utils import StorePrefix -from mlrun.common.helpers import parse_versioned_object_uri -from mlrun.errors import MLRunInvalidArgumentError - - -def get_offline_features( - context: MLClientCtx, - feature_vector: str, - features: Union[List[str], None] = None, - label_feature: str = None, - description: str = None, - entity_rows: DataItem = None, - entity_timestamp_column: str = None, - target: Union[str, Dict] = None, - run_config: Union[str, Dict] = None, - drop_columns: List[str] = None, - start_time: str = None, - end_time: str = None, - with_indexes: bool = False, - update_stats: bool = False, -): - """retrieve offline feature vector results - - specify a feature vector object/uri and retrieve the desired features, their metadata - and statistics. 
returns :py:class:`~mlrun.feature_store.OfflineVectorResponse`, - results can be returned as a dataframe or written to a target. - If feature vector does not exist, a new one will be created and saved with the given features. - - The start_time and end_time attributes allow filtering the data to a given time range, they accept - string values or pandas `Timestamp` objects, string values can also be relative, for example: - "now", "now - 1d2h", "now+5m", where a valid pandas Timedelta string follows the verb "now", - for time alignment you can use the verb "floor" e.g. "now -1d floor 1H" will align the time to the last hour - (the floor string is passed to pandas.Timestamp.floor(), can use D, H, T, S for day, hour, min, sec alignment) - - - :param context: MLRun context - :param feature_vector: feature vector uri - :param features: Relevant only if feature_vector not exist. list of feature to collect to this vector - format [/]. [as ] - :param label_feature: feature name to be used as label data - :param description: text description of the vector - :param entity_rows: URI of the data entity rows to join with - :param target: where to write the results to - :param drop_columns: list of columns to drop from the final result - :param entity_timestamp_column: timestamp column name in the entity rows dataframe - :param run_config: function and/or run configuration - see :py:class:`~mlrun.feature_store.RunConfig` - :param start_time: datetime, low limit of time needed to be filtered. Optional - entity_timestamp_column must be passed when using time filtering - :param end_time: datetime, high limit of time needed to be filtered. Optional - entity_timestamp_column must be passed when using time filtering - :param with_indexes: return vector with index columns (default False) - :param update_stats: update features statistics from the requested feature sets on the vector. Default is False. 
- - :returns feature_vector input - """ - - if features: - # Creating a new FeatureVector and saving: - if is_store_uri(feature_vector): - prefix, new_uri = parse_store_uri(feature_vector) - if prefix != StorePrefix.FeatureVector: - raise MLRunInvalidArgumentError( - f"provided store uri ({feature_vector}) does not represent a feature vector (prefix={prefix})" - ) - feature_vector = new_uri - - context.logger.info(f"Creating FeatureVector {feature_vector}") - project, name, tag, _ = parse_versioned_object_uri(feature_vector, mlrun.mlconf.default_project) - vector = fs.FeatureVector(name, features, label_feature=label_feature, description=description) - vector.metadata.project = project - vector.metadata.tag = tag - vector.save() - feature_vector_uri = vector.uri - else: - if is_store_uri(feature_vector): - feature_vector_uri = feature_vector - else: - vector = fs.get_feature_vector(feature_vector) - feature_vector_uri = vector.uri - - # Preparing entity_rows: - if entity_rows is not None: - context.logger.info(f"Creating DataFrame from entity_rows = {entity_rows}") - entity_rows = entity_rows.as_df() - - # Preparing target: - if target: - if isinstance(target, str): - target = kind_to_driver[target]() - - name = target.name if hasattr(target, "name") else target["name"] - context.logger.info(f"Preparing '{name}' target") - target = get_target_driver(target) - if hasattr(target, 'path') and target.path: - context.log_result("target", target.path) - - # Preparing run_config: - if run_config and isinstance(run_config, dict): - context.logger.info("Preparing run configuration") - run_config = fs.RunConfig(**run_config) - - # Calling get_offline_features: - context.logger.info( - f"getting offline features from the FeatureVector {feature_vector}" - ) - fs.get_offline_features( - feature_vector=feature_vector_uri, - entity_rows=entity_rows, - entity_timestamp_column=entity_timestamp_column, - target=target, - run_config=run_config, - drop_columns=drop_columns, - 
start_time=start_time, - end_time=end_time, - with_indexes=with_indexes, - update_stats=update_stats, - ) - - context.log_result("feature_vector", feature_vector) - context.log_result("feature_vector_uri", feature_vector_uri) diff --git a/get_offline_features/item.yaml b/get_offline_features/item.yaml deleted file mode 100644 index 00d9c34c1..000000000 --- a/get_offline_features/item.yaml +++ /dev/null @@ -1,26 +0,0 @@ -apiVersion: v1 -categories: -- data-preparation -- data-analysis -- feature-store -description: retrieve offline feature vector results -doc: '' -example: get_offline_features.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: yonish -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.6.3 -name: get_offline_features -platformVersion: 3.5.0 -spec: - filename: get_offline_features.py - handler: get_offline_features - image: mlrun/mlrun - kind: job - requirements: [] -url: '' -version: 1.3.0 diff --git a/get_offline_features/test_get_offline_features.py b/get_offline_features/test_get_offline_features.py deleted file mode 100644 index 21913e011..000000000 --- a/get_offline_features/test_get_offline_features.py +++ /dev/null @@ -1,239 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import os -import tempfile -import shutil -import datetime - -import pytest -import mlrun -import mlrun.feature_store as fstore -from mlrun.datastore.targets import CSVTarget -from mlrun.feature_store.steps import * -from mlrun.features import MinMaxValidator -from mlrun.run import get_dataitem - - -REQUIRED_ENV_VARS = [ - "MLRUN_DBPATH", - "MLRUN_ARTIFACT_PATH", - "V3IO_USERNAME", - "V3IO_API", - "V3IO_ACCESS_KEY", -] - - -def _validate_environment_variables() -> bool: - """ - Checks that all required Environment variables are set. - """ - environment_keys = os.environ.keys() - return all(key in environment_keys for key in REQUIRED_ENV_VARS) - - -def _set_environment(): - """ - Creating project and temp dir for the project. - """ - artifact_path = tempfile.TemporaryDirectory().name - os.makedirs(artifact_path) - project = mlrun.get_or_create_project( - "get-offline-features-test", context="./", user_project=True - ) - return artifact_path, project - - -def _cleanup_environment(artifact_path: str): - """ - Cleanup the test environment, deleting files and artifacts created during the test. - - :param artifact_path: The artifact path to delete. - """ - # Clean the local directory: - for test_output in [ - *os.listdir(artifact_path), - "schedules", - "runs", - "artifacts", - "functions", - ]: - test_output_path = os.path.abspath(f"./{test_output}") - if os.path.exists(test_output_path): - if os.path.isdir(test_output_path): - shutil.rmtree(test_output_path) - else: - os.remove(test_output_path) - - # Clean the artifacts directory: - shutil.rmtree(artifact_path) - - -def create_dataframes() -> (pd.DataFrame, pd.DataFrame, pd.DataFrame): - """ - Creates all the necessary DataFrames to the test. 
- """ - - def move_date(df, col): - max_date = df[col].max() - now_date = datetime.datetime.now() - delta = now_date - max_date - df[col] = df[col] + delta - return df - - stocks = pd.DataFrame( - { - "ticker": ["MSFT", "GOOG", "AAPL"], - "name": ["Microsoft Corporation", "Alphabet Inc", "Apple Inc"], - "exchange": ["NASDAQ", "NASDAQ", "NASDAQ"], - } - ) - - quotes = pd.DataFrame( - { - "time": [ - pd.Timestamp("2016-05-25 13:30:00.023"), - pd.Timestamp("2016-05-25 13:30:00.023"), - pd.Timestamp("2016-05-25 13:30:00.030"), - pd.Timestamp("2016-05-25 13:30:00.041"), - pd.Timestamp("2016-05-25 13:30:00.048"), - pd.Timestamp("2016-05-25 13:30:00.049"), - pd.Timestamp("2016-05-25 13:30:00.072"), - pd.Timestamp("2016-05-25 13:30:00.075"), - ], - "ticker": ["GOOG", "MSFT", "MSFT", "MSFT", "GOOG", "AAPL", "GOOG", "MSFT"], - "bid": [720.50, 51.95, 51.97, 51.99, 720.50, 97.99, 720.50, 52.01], - "ask": [720.93, 51.96, 51.98, 52.00, 720.93, 98.01, 720.88, 52.03], - } - ) - - trades = pd.DataFrame( - { - "time": [ - pd.Timestamp("2016-05-25 13:30:00.023"), - pd.Timestamp("2016-05-25 13:30:00.038"), - pd.Timestamp("2016-05-25 13:30:00.048"), - pd.Timestamp("2016-05-25 13:30:00.048"), - pd.Timestamp("2016-05-25 13:30:00.048"), - ], - "ticker": ["MSFT", "MSFT", "GOOG", "GOOG", "AAPL"], - "price": [51.95, 51.95, 720.77, 720.92, 98.0], - "quantity": [75, 155, 100, 100, 100], - } - ) - quotes = move_date(quotes, "time") - trades = move_date(trades, "time") - return quotes, trades, stocks - - -class MyMap(MapClass): - def __init__(self, multiplier=1, **kwargs): - super().__init__(**kwargs) - self._multiplier = multiplier - - def do(self, event): - event["multi"] = event["bid"] * self._multiplier - return event - - -def _create_feature_set(): - """ - Creating all the necessary FeatureSets for the test. 
- """ - stocks_set = fstore.FeatureSet("stocks", entities=[fstore.Entity("ticker")]) - - quotes_set = fstore.FeatureSet("stock-quotes", entities=[fstore.Entity("ticker")]) - - quotes_set.graph.to("MyMap", multiplier=3).to( - "storey.Extend", _fn="({'extra': event['bid'] * 77})" - ).to("storey.Filter", "filter", _fn="(event['bid'] > 51.92)").to( - FeaturesetValidator() - ) - - quotes_set.add_aggregation("ask", ["sum", "max"], "1h", "10m", name="asks1") - quotes_set.add_aggregation("ask", ["sum", "max"], "5h", "10m", name="asks5") - quotes_set.add_aggregation("bid", ["min", "max"], "1h", "10m", name="bids") - - # add feature validation policy - quotes_set["bid"] = fstore.Feature( - validator=MinMaxValidator(min=52, severity="info") - ) - - # add default target definitions and plot - quotes_set.set_targets() - return quotes_set, stocks_set - - -@pytest.mark.skipif( - condition=not _validate_environment_variables(), - reason="Project's environment variables are not set", -) -def test_get_offline_vector(): - # Creating project: - artifact_path, project = _set_environment() - - # Importing the marketplace function: - gof_fn = mlrun.import_function("function.yaml") - - # Creating the dataframes: - quotes, trades, stocks = create_dataframes() - - # Defining features for the FeatureVector: - features = [ - "stock-quotes.multi", - "stock-quotes.asks5_sum_5h as total_ask", - "stock-quotes.bids_min_1h", - "stock-quotes.bids_max_1h", - "stocks.*", - ] - - # Creating the FeatureSets and ingesting them: - quotes_set, stocks_set = _create_feature_set() - fstore.ingest(stocks_set, stocks) - fstore.ingest(quotes_set, quotes) - - # Saving the trades dataframe as a csv to use as entity_rows: - trades_uri = os.path.join(artifact_path, "trades.csv") - trades.to_csv(trades_uri, index=False) - - # Creating target for the FeatureVector: - target_dict = CSVTarget( - "mycsv", path=os.path.join(artifact_path, "my_csv.csv") - ).to_dict() - - # Running the getting_offline_features function: - 
gof_run = None - try: - gof_run = gof_fn.run( - handler="get_offline_features", - inputs={"entity_rows": trades_uri}, - params={ - "feature_vector": "stocks-vec", - "features": features, - "target": target_dict, - "entity_timestamp_column": "time", - }, - local=True, - ) - - except Exception as e: - print(f"- The test failed - raised the following error:\n- {e}") - - target_df = get_dataitem(gof_run.outputs["target"]).as_df() - vector_df = get_dataitem(gof_run.outputs["feature_vector"]).as_df() - - # Asserting that the target and FeatureVector dataframes are the same: - assert mlrun.datastore.is_store_uri(gof_run.outputs["feature_vector_uri"]) - assert vector_df.equals(target_df), "Target and feature vector are not the same" - _cleanup_environment(artifact_path) diff --git a/hugging_face_classifier_trainer/function.yaml b/hugging_face_classifier_trainer/function.yaml deleted file mode 100644 index 65f5aeb10..000000000 --- a/hugging_face_classifier_trainer/function.yaml +++ /dev/null @@ -1,370 +0,0 @@ -kind: job -metadata: - name: hugging-face-classifier-trainer - tag: '' - hash: f9d8aa4a2c66e24fa418bb163829adc3e2ada06c - project: '' - labels: - author: davids - categories: - - deep-learning - - huggingface - - machine-learning - - model-training -spec: - command: '' - args: [] - image: '' - build: - functionSourceCode: 
aW1wb3J0IG9zCmltcG9ydCBzaHV0aWwKaW1wb3J0IHRlbXBmaWxlCmltcG9ydCB6aXBmaWxlCmZyb20gYWJjIGltcG9ydCBBQkMKZnJvbSB0eXBpbmcgaW1wb3J0IEFueSwgQ2FsbGFibGUsIERpY3QsIExpc3QsIE9wdGlvbmFsLCBUdXBsZSwgVW5pb24KCmltcG9ydCBtbHJ1bgppbXBvcnQgbWxydW4uZGF0YXN0b3JlCmltcG9ydCBtbHJ1bi51dGlscwppbXBvcnQgbnVtcHkgYXMgbnAKaW1wb3J0IHBhbmRhcyBhcyBwZAppbXBvcnQgdHJhbnNmb3JtZXJzCmZyb20gZGF0YXNldHMgaW1wb3J0IERhdGFzZXQsIGxvYWRfZGF0YXNldCwgbG9hZF9tZXRyaWMKZnJvbSBtbHJ1biBpbXBvcnQgTUxDbGllbnRDdHgKZnJvbSBtbHJ1biBpbXBvcnQgZmVhdHVyZV9zdG9yZSBhcyBmcwpmcm9tIG1scnVuLmFydGlmYWN0cyBpbXBvcnQgQXJ0aWZhY3QsIFBsb3RseUFydGlmYWN0CmZyb20gbWxydW4uZGF0YXN0b3JlIGltcG9ydCBEYXRhSXRlbQpmcm9tIG1scnVuLmZyYW1ld29ya3MuX2NvbW1vbiBpbXBvcnQgQ29tbW9uVHlwZXMsIE1MUnVuSW50ZXJmYWNlCmZyb20gbWxydW4udXRpbHMgaW1wb3J0IGNyZWF0ZV9jbGFzcwpmcm9tIHBsb3RseSBpbXBvcnQgZ3JhcGhfb2JqZWN0cyBhcyBnbwpmcm9tIHNrbGVhcm4ubW9kZWxfc2VsZWN0aW9uIGltcG9ydCB0cmFpbl90ZXN0X3NwbGl0CmZyb20gdHJhbnNmb3JtZXJzIGltcG9ydCAoCiAgICBBdXRvVG9rZW5pemVyLAogICAgRGF0YUNvbGxhdG9yV2l0aFBhZGRpbmcsCiAgICBFdmFsUHJlZGljdGlvbiwKICAgIFByZVRyYWluZWRNb2RlbCwKICAgIFByZVRyYWluZWRUb2tlbml6ZXIsCiAgICBUcmFpbmVyLAogICAgVHJhaW5lckNhbGxiYWNrLAogICAgVHJhaW5lckNvbnRyb2wsCiAgICBUcmFpbmVyU3RhdGUsCiAgICBUcmFpbmluZ0FyZ3VtZW50cywKKQoKCiMgLS0tLS0tLS0tLS0tLS0tLS0tLS0tLWZyb20gTUxSVU4tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQpjbGFzcyBIRk9SVE9wdGltaXplck1MUnVuSW50ZXJmYWNlKE1MUnVuSW50ZXJmYWNlLCBBQkMpOgogICAgIiIiCiAgICBJbnRlcmZhY2UgZm9yIGFkZGluZyBNTFJ1biBmZWF0dXJlcyBmb3IgdGVuc29yZmxvdyBrZXJhcyBBUEkuCiAgICAiIiIKCiAgICAjIE1MUnVuJ3MgY29udGV4dCBkZWZhdWx0IG5hbWU6CiAgICBERUZBVUxUX0NPTlRFWFRfTkFNRSA9ICJtbHJ1bi1odWdnaW5nZmFjZSIKCiAgICAjIEF0dHJpYnV0ZXMgdG8gYmUgaW5zZXJ0ZWQgc28gdGhlIE1MUnVuIGludGVyZmFjZSB3aWxsIGJlIGZ1bGx5IGVuYWJsZWQuCiAgICBfUFJPUEVSVElFUyA9IHsKICAgICAgICAiX2F1dG9fbG9nIjogRmFsc2UsCiAgICAgICAgIl9jb250ZXh0IjogTm9uZSwKICAgICAgICAiX21vZGVsX25hbWUiOiAibW9kZWwiLAogICAgICAgICJfdGFnIjogIiIsCiAgICAgICAgIl9sYWJlbHMiOiBOb25lLAogICAgICAgICJfZXh0cmFfZGF0YSI6IE5vbmUsCiAgICB9CiAgICBfTUVUSE9EUyA9IFsiZW5hYmxlX2F1dG9fbG9nZ2luZyJd
CiAgICAjIEF0dHJpYnV0ZXMgdG8gcmVwbGFjZSBzbyB0aGUgTUxSdW4gaW50ZXJmYWNlIHdpbGwgYmUgZnVsbHkgZW5hYmxlZC4KICAgIF9SRVBMQUNFRF9NRVRIT0RTID0gWwogICAgICAgICJvcHRpbWl6ZSIsCiAgICBdCgogICAgQGNsYXNzbWV0aG9kCiAgICBkZWYgYWRkX2ludGVyZmFjZSgKICAgICAgICBjbHMsCiAgICAgICAgb2JqLAogICAgICAgIHJlc3RvcmF0aW9uOiBDb21tb25UeXBlcy5NTFJ1bkludGVyZmFjZVJlc3RvcmF0aW9uVHlwZSA9IE5vbmUsCiAgICApOgogICAgICAgICIiIgogICAgICAgIEVucmljaCB0aGUgb2JqZWN0IHdpdGggdGhpcyBpbnRlcmZhY2UgcHJvcGVydGllcywgbWV0aG9kcyBhbmQgZnVuY3Rpb25zLCBzbyBpdCB3aWxsIGhhdmUgdGhpcyBUZW5zb3JGbG93LktlcmFzCiAgICAgICAgTUxSdW4ncyBmZWF0dXJlcy4KICAgICAgICA6cGFyYW0gb2JqOiAgICAgICAgICAgICAgICAgICAgIFRoZSBvYmplY3QgdG8gZW5yaWNoIGhpcyBpbnRlcmZhY2UuCiAgICAgICAgOnBhcmFtIHJlc3RvcmF0aW9uOiBSZXN0b3JhdGlvbiBpbmZvcm1hdGlvbiB0dXBsZSBhcyByZXR1cm5lZCBmcm9tICdyZW1vdmVfaW50ZXJmYWNlJyBpbiBvcmRlciB0bwogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgYWRkIHRoZSBpbnRlcmZhY2UgaW4gYSBjZXJ0YWluIHN0YXRlLgogICAgICAgICIiIgogICAgICAgIHN1cGVyKEhGT1JUT3B0aW1pemVyTUxSdW5JbnRlcmZhY2UsIGNscykuYWRkX2ludGVyZmFjZSgKICAgICAgICAgICAgb2JqPW9iaiwgcmVzdG9yYXRpb249cmVzdG9yYXRpb24KICAgICAgICApCgogICAgQGNsYXNzbWV0aG9kCiAgICBkZWYgbWxydW5fb3B0aW1pemUoY2xzKToKICAgICAgICAiIiIKICAgICAgICBNTFJ1bidzIHRmLmtlcmFzLk1vZGVsLmZpdCB3cmFwcGVyLiBJdCB3aWxsIHNldHVwIHRoZSBvcHRpbWl6ZXIgd2hlbiB1c2luZyBob3Jvdm9kLiBUaGUgb3B0aW1pemVyIG11c3QgYmUKICAgICAgICBwYXNzZWQgaW4gYSBrZXl3b3JkIGFyZ3VtZW50IGFuZCB3aGVuIHVzaW5nIGhvcm92b2QsIGl0IG11c3QgYmUgcGFzc2VkIGFzIGFuIE9wdGltaXplciBpbnN0YW5jZSwgbm90IGEgc3RyaW5nLgoKICAgICAgICByYWlzZSBNTFJ1bkludmFsaWRBcmd1bWVudEVycm9yOiBJbiBjYXNlIHRoZSBvcHRpbWl6ZXIgcHJvdmlkZWQgZGlkIG5vdCBmb2xsb3cgdGhlIGluc3RydWN0aW9ucyBhYm92ZS4KICAgICAgICAiIiIKCiAgICAgICAgZGVmIHdyYXBwZXIoc2VsZiwgKmFyZ3MsICoqa3dhcmdzKToKICAgICAgICAgICAgc2F2ZV9kaXIgPSBjbHMuX2dldF9mdW5jdGlvbl9hcmd1bWVudCgKICAgICAgICAgICAgICAgIHNlbGYub3B0aW1pemUsCiAgICAgICAgICAgICAgICBhcmd1bWVudF9uYW1lPSJzYXZlX2RpciIsCiAgICAgICAgICAgICAgICBwYXNzZWRfYXJncz1hcmdzLAogICAgICAgICAgICAgICAgcGFzc2VkX2t3YXJncz1rd2FyZ3MsCiAgICAgICAgICAgIClbMF0KCiAgICAg
ICAgICAgICMgQ2FsbCB0aGUgb3JpZ2luYWwgb3B0aW1pemUgbWV0aG9kOgogICAgICAgICAgICByZXN1bHQgPSBzZWxmLm9yaWdpbmFsX29wdGltaXplKCphcmdzLCAqKmt3YXJncykKCiAgICAgICAgICAgIGlmIHNlbGYuX2F1dG9fbG9nOgogICAgICAgICAgICAgICAgIyBMb2cgdGhlIG9ubnggbW9kZWw6CiAgICAgICAgICAgICAgICBzZWxmLl9jb250ZXh0LmxvZ19tb2RlbCgKICAgICAgICAgICAgICAgICAgICBrZXk9Im1vZGVsIiwKICAgICAgICAgICAgICAgICAgICBkYl9rZXk9c2VsZi5fbW9kZWxfbmFtZSwKICAgICAgICAgICAgICAgICAgICBtb2RlbF9maWxlPWYie3NhdmVfZGlyfS9tb2RlbF9vcHRpbWl6ZWQub25ueCIsCiAgICAgICAgICAgICAgICAgICAgdGFnPXNlbGYuX3RhZywKICAgICAgICAgICAgICAgICAgICBmcmFtZXdvcms9Ik9OTlgiLAogICAgICAgICAgICAgICAgICAgIGxhYmVscz1zZWxmLl9sYWJlbHMsCiAgICAgICAgICAgICAgICAgICAgZXh0cmFfZGF0YT1zZWxmLl9leHRyYV9kYXRhLAogICAgICAgICAgICAgICAgKQoKICAgICAgICAgICAgcmV0dXJuIHJlc3VsdAoKICAgICAgICByZXR1cm4gd3JhcHBlcgoKICAgIGRlZiBlbmFibGVfYXV0b19sb2dnaW5nKAogICAgICAgIHNlbGYsCiAgICAgICAgY29udGV4dDogbWxydW4uTUxDbGllbnRDdHgsCiAgICAgICAgbW9kZWxfbmFtZTogc3RyID0gIm1vZGVsIiwKICAgICAgICB0YWc6IHN0ciA9ICIiLAogICAgICAgIGxhYmVsczogRGljdFtzdHIsIHN0cl0gPSBOb25lLAogICAgICAgIGV4dHJhX2RhdGE6IGRpY3QgPSBOb25lLAogICAgKToKICAgICAgICBzZWxmLl9hdXRvX2xvZyA9IFRydWUKCiAgICAgICAgc2VsZi5fY29udGV4dCA9IGNvbnRleHQKICAgICAgICBzZWxmLl9tb2RlbF9uYW1lID0gbW9kZWxfbmFtZQogICAgICAgIHNlbGYuX3RhZyA9IHRhZwogICAgICAgIHNlbGYuX2xhYmVscyA9IGxhYmVscwogICAgICAgIHNlbGYuX2V4dHJhX2RhdGEgPSBleHRyYV9kYXRhCgoKY2xhc3MgSEZUcmFpbmVyTUxSdW5JbnRlcmZhY2UoTUxSdW5JbnRlcmZhY2UsIEFCQyk6CiAgICAiIiIKICAgIEludGVyZmFjZSBmb3IgYWRkaW5nIE1MUnVuIGZlYXR1cmVzIGZvciB0ZW5zb3JmbG93IGtlcmFzIEFQSS4KICAgICIiIgoKICAgICMgTUxSdW5zIGNvbnRleHQgZGVmYXVsdCBuYW1lOgogICAgREVGQVVMVF9DT05URVhUX05BTUUgPSAibWxydW4taHVnZ2luZ2ZhY2UiCgogICAgIyBBdHRyaWJ1dGVzIHRvIHJlcGxhY2Ugc28gdGhlIE1MUnVuIGludGVyZmFjZSB3aWxsIGJlIGZ1bGx5IGVuYWJsZWQuCiAgICBfUkVQTEFDRURfTUVUSE9EUyA9IFsKICAgICAgICAidHJhaW4iLAogICAgICAgICMgImV2YWx1YXRlIgogICAgXQoKICAgIEBjbGFzc21ldGhvZAogICAgZGVmIGFkZF9pbnRlcmZhY2UoCiAgICAgICAgY2xzLAogICAgICAgIG9iajogVHJhaW5lciwKICAgICAgICByZXN0b3JhdGlvbjogQ29tbW9uVHlwZXMuTUxSdW5JbnRlcmZhY2VSZXN0b3JhdGlv
blR5cGUgPSBOb25lLAogICAgKToKICAgICAgICAiIiIKICAgICAgICBFbnJpY2ggdGhlIG9iamVjdCB3aXRoIHRoaXMgaW50ZXJmYWNlIHByb3BlcnRpZXMsIG1ldGhvZHMgYW5kIGZ1bmN0aW9ucywgc28gaXQgd2lsbCBoYXZlIHRoaXMgVGVuc29yRmxvdy5LZXJhcwogICAgICAgIE1MUnVucyBmZWF0dXJlcy4KICAgICAgICA6cGFyYW0gb2JqOiAgICAgICAgICAgICAgICAgICAgIFRoZSBvYmplY3QgdG8gZW5yaWNoIGhpcyBpbnRlcmZhY2UuCiAgICAgICAgOnBhcmFtIHJlc3RvcmF0aW9uOiBSZXN0b3JhdGlvbiBpbmZvcm1hdGlvbiB0dXBsZSBhcyByZXR1cm5lZCBmcm9tICdyZW1vdmVfaW50ZXJmYWNlJyBpbiBvcmRlciB0bwogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgYWRkIHRoZSBpbnRlcmZhY2UgaW4gYSBjZXJ0YWluIHN0YXRlLgogICAgICAgICIiIgoKICAgICAgICBzdXBlcihIRlRyYWluZXJNTFJ1bkludGVyZmFjZSwgY2xzKS5hZGRfaW50ZXJmYWNlKAogICAgICAgICAgICBvYmo9b2JqLCByZXN0b3JhdGlvbj1yZXN0b3JhdGlvbgogICAgICAgICkKCiAgICBAY2xhc3NtZXRob2QKICAgIGRlZiBtbHJ1bl90cmFpbihjbHMpOgoKICAgICAgICAiIiIKICAgICAgICBNTFJ1bnMgdGYua2VyYXMuTW9kZWwuZml0IHdyYXBwZXIuIEl0IHdpbGwgc2V0dXAgdGhlIG9wdGltaXplciB3aGVuIHVzaW5nIGhvcm92b2QuIFRoZSBvcHRpbWl6ZXIgbXVzdCBiZQogICAgICAgIHBhc3NlZCBpbiBhIGtleXdvcmQgYXJndW1lbnQgYW5kIHdoZW4gdXNpbmcgaG9yb3ZvZCwgaXQgbXVzdCBiZSBwYXNzZWQgYXMgYW4gT3B0aW1pemVyIGluc3RhbmNlLCBub3QgYSBzdHJpbmcuCgogICAgICAgIHJhaXNlIE1MUnVuSW52YWxpZEFyZ3VtZW50RXJyb3I6IEluIGNhc2UgdGhlIG9wdGltaXplciBwcm92aWRlZCBkaWQgbm90IGZvbGxvdyB0aGUgaW5zdHJ1Y3Rpb25zIGFib3ZlLgogICAgICAgICIiIgoKICAgICAgICBkZWYgd3JhcHBlcihzZWxmOiBUcmFpbmVyLCAqYXJncywgKiprd2FyZ3MpOgogICAgICAgICAgICAjIFJlc3RvcmUgdGhlIGV2YWx1YXRpb24gbWV0aG9kIGFzIGB0cmFpbmAgd2lsbCB1c2UgaXQ6CiAgICAgICAgICAgICMgY2xzLl9yZXN0b3JlX2F0dHJpYnV0ZShvYmo9c2VsZiwgYXR0cmlidXRlX25hbWU9ImV2YWx1YXRlIikKCiAgICAgICAgICAgICMgQ2FsbCB0aGUgb3JpZ2luYWwgZml0IG1ldGhvZDoKICAgICAgICAgICAgcmVzdWx0ID0gc2VsZi5vcmlnaW5hbF90cmFpbigqYXJncywgKiprd2FyZ3MpCgogICAgICAgICAgICAjIFJlcGxhY2UgdGhlIGV2YWx1YXRpb24gbWV0aG9kIGFnYWluOgogICAgICAgICAgICAjIGNscy5fcmVwbGFjZV9mdW5jdGlvbihvYmo9c2VsZiwgZnVuY3Rpb25fbmFtZT0iZXZhbHVhdGUiKQoKICAgICAgICAgICAgcmV0dXJuIHJlc3VsdAoKICAgICAgICByZXR1cm4gd3JhcHBlcgoKCmNsYXNzIE1MUnVuQ2FsbGJhY2soVHJhaW5lckNhbGxiYWNrKToKICAgICIiIgog
ICAgQ2FsbGJhY2sgZm9yIGNvbGxlY3RpbmcgbG9ncyBkdXJpbmcgdHJhaW5pbmcgLyBldmFsdWF0aW9uIG9mIHRoZSBgVHJhaW5lcmAgQVBJLgogICAgIiIiCgogICAgZGVmIF9faW5pdF9fKAogICAgICAgIHNlbGYsCiAgICAgICAgY29udGV4dDogbWxydW4uTUxDbGllbnRDdHggPSBOb25lLAogICAgICAgIG1vZGVsX25hbWU6IHN0ciA9ICJtb2RlbCIsCiAgICAgICAgdGFnOiBzdHIgPSAiIiwKICAgICAgICBsYWJlbHM6IERpY3Rbc3RyLCBzdHJdID0gTm9uZSwKICAgICAgICBleHRyYV9kYXRhOiBkaWN0ID0gTm9uZSwKICAgICk6CiAgICAgICAgc3VwZXIoKS5fX2luaXRfXygpCgogICAgICAgICMgU3RvcmUgdGhlIGNvbmZpZ3VyYXRpb25zOgogICAgICAgIHNlbGYuX2NvbnRleHQgPSAoCiAgICAgICAgICAgIGNvbnRleHQKICAgICAgICAgICAgaWYgY29udGV4dCBpcyBub3QgTm9uZQogICAgICAgICAgICBlbHNlIG1scnVuLmdldF9vcl9jcmVhdGVfY3R4KCIuL21scnVuLWh1Z2dpbmdmYWNlIikKICAgICAgICApCiAgICAgICAgc2VsZi5fbW9kZWxfbmFtZSA9IG1vZGVsX25hbWUKICAgICAgICBzZWxmLl90YWcgPSB0YWcKICAgICAgICBzZWxmLl9sYWJlbHMgPSBsYWJlbHMKICAgICAgICBzZWxmLl9leHRyYV9kYXRhID0gZXh0cmFfZGF0YSBpZiBleHRyYV9kYXRhIGlzIG5vdCBOb25lIGVsc2Uge30KCiAgICAgICAgIyBTZXQgdXAgdGhlIGxvZ2dpbmcgbW9kZToKICAgICAgICBzZWxmLl9pc190cmFpbmluZyA9IEZhbHNlCiAgICAgICAgc2VsZi5fc3RlcHM6IExpc3RbTGlzdFtpbnRdXSA9IFtdCiAgICAgICAgc2VsZi5fbWV0cmljX3Njb3JlczogRGljdFtzdHIsIExpc3RbZmxvYXRdXSA9IHt9CiAgICAgICAgc2VsZi5fYXJ0aWZhY3RzOiBEaWN0W3N0ciwgQXJ0aWZhY3RdID0ge30KCiAgICBkZWYgb25fZXBvY2hfYmVnaW4oCiAgICAgICAgc2VsZiwKICAgICAgICBhcmdzOiBUcmFpbmluZ0FyZ3VtZW50cywKICAgICAgICBzdGF0ZTogVHJhaW5lclN0YXRlLAogICAgICAgIGNvbnRyb2w6IFRyYWluZXJDb250cm9sLAogICAgICAgICoqa3dhcmdzLAogICAgKToKICAgICAgICBzZWxmLl9zdGVwcy5hcHBlbmQoW10pCgogICAgZGVmIG9uX2Vwb2NoX2VuZCgKICAgICAgICBzZWxmLAogICAgICAgIGFyZ3M6IFRyYWluaW5nQXJndW1lbnRzLAogICAgICAgIHN0YXRlOiBUcmFpbmVyU3RhdGUsCiAgICAgICAgY29udHJvbDogVHJhaW5lckNvbnRyb2wsCiAgICAgICAgKiprd2FyZ3MsCiAgICApOgogICAgICAgIHNlbGYuX2xvZ19tZXRyaWNzKCkKCiAgICBkZWYgb25fbG9nKAogICAgICAgIHNlbGYsCiAgICAgICAgYXJnczogVHJhaW5pbmdBcmd1bWVudHMsCiAgICAgICAgc3RhdGU6IFRyYWluZXJTdGF0ZSwKICAgICAgICBjb250cm9sOiBUcmFpbmVyQ29udHJvbCwKICAgICAgICBsb2dzOiBEaWN0W3N0ciwgZmxvYXRdID0gTm9uZSwKICAgICAgICAqKmt3YXJncywKICAgICk6CiAgICAgICAgcmVjZW50X2xvZ3MgPSBzdGF0ZS5s
b2dfaGlzdG9yeVstMV0uY29weSgpCgogICAgICAgIHJlY2VudF9sb2dzLnBvcCgiZXBvY2giKQogICAgICAgIGN1cnJlbnRfc3RlcCA9IGludChyZWNlbnRfbG9ncy5wb3AoInN0ZXAiKSkKICAgICAgICBpZiBjdXJyZW50X3N0ZXAgbm90IGluIHNlbGYuX3N0ZXBzWy0xXToKICAgICAgICAgICAgc2VsZi5fc3RlcHNbLTFdLmFwcGVuZChjdXJyZW50X3N0ZXApCgogICAgICAgIGZvciBtZXRyaWNfbmFtZSwgbWV0cmljX3Njb3JlIGluIHJlY2VudF9sb2dzLml0ZW1zKCk6CiAgICAgICAgICAgIGlmIG1ldHJpY19uYW1lLnN0YXJ0c3dpdGgoInRyYWluXyIpOgogICAgICAgICAgICAgICAgaWYgbWV0cmljX25hbWUuc3BsaXQoInRyYWluXyIpWzFdIG5vdCBpbiBzZWxmLl9tZXRyaWNfc2NvcmVzOgogICAgICAgICAgICAgICAgICAgIHNlbGYuX21ldHJpY19zY29yZXNbbWV0cmljX25hbWVdID0gW21ldHJpY19zY29yZV0KICAgICAgICAgICAgICAgIGNvbnRpbnVlCiAgICAgICAgICAgIGlmIG1ldHJpY19uYW1lIG5vdCBpbiBzZWxmLl9tZXRyaWNfc2NvcmVzOgogICAgICAgICAgICAgICAgc2VsZi5fbWV0cmljX3Njb3Jlc1ttZXRyaWNfbmFtZV0gPSBbXQogICAgICAgICAgICBzZWxmLl9tZXRyaWNfc2NvcmVzW21ldHJpY19uYW1lXS5hcHBlbmQobWV0cmljX3Njb3JlKQoKICAgIGRlZiBvbl90cmFpbl9iZWdpbigKICAgICAgICBzZWxmLAogICAgICAgIGFyZ3M6IFRyYWluaW5nQXJndW1lbnRzLAogICAgICAgIHN0YXRlOiBUcmFpbmVyU3RhdGUsCiAgICAgICAgY29udHJvbDogVHJhaW5lckNvbnRyb2wsCiAgICAgICAgKiprd2FyZ3MsCiAgICApOgogICAgICAgIHNlbGYuX2lzX3RyYWluaW5nID0gVHJ1ZQoKICAgIGRlZiBvbl90cmFpbl9lbmQoCiAgICAgICAgc2VsZiwKICAgICAgICBhcmdzOiBUcmFpbmluZ0FyZ3VtZW50cywKICAgICAgICBzdGF0ZTogVHJhaW5lclN0YXRlLAogICAgICAgIGNvbnRyb2w6IFRyYWluZXJDb250cm9sLAogICAgICAgIG1vZGVsOiBQcmVUcmFpbmVkTW9kZWwgPSBOb25lLAogICAgICAgIHRva2VuaXplcjogUHJlVHJhaW5lZFRva2VuaXplciA9IE5vbmUsCiAgICAgICAgKiprd2FyZ3MsCiAgICApOgogICAgICAgIHNlbGYuX2xvZ19tZXRyaWNzKCkKCiAgICAgICAgdGVtcF9kaXJlY3RvcnkgPSB0ZW1wZmlsZS5nZXR0ZW1wZGlyKCkKCiAgICAgICAgIyBTYXZlIGFuZCBsb2cgdGhlIHRva2VuaXplcjoKICAgICAgICBpZiB0b2tlbml6ZXIgaXMgbm90IE5vbmU6CiAgICAgICAgICAgICMgU2F2ZSB0b2tlbml6ZXI6CiAgICAgICAgICAgIHRva2VuaXplcl9kaXIgPSBvcy5wYXRoLmpvaW4odGVtcF9kaXJlY3RvcnksICJ0b2tlbml6ZXIiKQogICAgICAgICAgICB0b2tlbml6ZXIuc2F2ZV9wcmV0cmFpbmVkKHNhdmVfZGlyZWN0b3J5PXRva2VuaXplcl9kaXIpCiAgICAgICAgICAgICMgWmlwIHRoZSB0b2tlbml6ZXIgZGlyZWN0b3J5OgogICAgICAgICAgICB0b2tlbml6ZXJfemlwID0gc2h1dGlsLm1ha2Vf
YXJjaGl2ZSgKICAgICAgICAgICAgICAgIGJhc2VfbmFtZT0idG9rZW5pemVyIiwKICAgICAgICAgICAgICAgIGZvcm1hdD0iemlwIiwKICAgICAgICAgICAgICAgIHJvb3RfZGlyPXRva2VuaXplcl9kaXIsCiAgICAgICAgICAgICkKICAgICAgICAgICAgIyBMb2cgdGhlIHppcCBmaWxlOgogICAgICAgICAgICBzZWxmLl9hcnRpZmFjdHNbInRva2VuaXplciJdID0gc2VsZi5fY29udGV4dC5sb2dfYXJ0aWZhY3QoCiAgICAgICAgICAgICAgICBpdGVtPSJ0b2tlbml6ZXIiLCBsb2NhbF9wYXRoPXRva2VuaXplcl96aXAKICAgICAgICAgICAgKQoKICAgICAgICAjIFNhdmUgdGhlIG1vZGVsOgogICAgICAgIG1vZGVsX2RpciA9IG9zLnBhdGguam9pbih0ZW1wX2RpcmVjdG9yeSwgIm1vZGVsIikKICAgICAgICBtb2RlbC5zYXZlX3ByZXRyYWluZWQoc2F2ZV9kaXJlY3Rvcnk9bW9kZWxfZGlyKQoKICAgICAgICAjIFppcCB0aGUgbW9kZWwgZGlyZWN0b3J5OgogICAgICAgIHNodXRpbC5tYWtlX2FyY2hpdmUoCiAgICAgICAgICAgIGJhc2VfbmFtZT0ibW9kZWwiLAogICAgICAgICAgICBmb3JtYXQ9InppcCIsCiAgICAgICAgICAgIHJvb3RfZGlyPW1vZGVsX2RpciwKICAgICAgICApCgogICAgICAgICMgTG9nIHRoZSBtb2RlbDoKICAgICAgICBzZWxmLl9jb250ZXh0LmxvZ19tb2RlbCgKICAgICAgICAgICAga2V5PSJtb2RlbCIsCiAgICAgICAgICAgIGRiX2tleT1zZWxmLl9tb2RlbF9uYW1lLAogICAgICAgICAgICBtb2RlbF9maWxlPSJtb2RlbC56aXAiLAogICAgICAgICAgICB0YWc9c2VsZi5fdGFnLAogICAgICAgICAgICBmcmFtZXdvcms9Ikh1Z2dpbmcgRmFjZSIsCiAgICAgICAgICAgIGxhYmVscz1zZWxmLl9sYWJlbHMsCiAgICAgICAgICAgIGV4dHJhX2RhdGE9eyoqc2VsZi5fYXJ0aWZhY3RzLCAqKnNlbGYuX2V4dHJhX2RhdGF9LAogICAgICAgICkKCiAgICBkZWYgb25fZXZhbHVhdGUoCiAgICAgICAgc2VsZiwKICAgICAgICBhcmdzOiBUcmFpbmluZ0FyZ3VtZW50cywKICAgICAgICBzdGF0ZTogVHJhaW5lclN0YXRlLAogICAgICAgIGNvbnRyb2w6IFRyYWluZXJDb250cm9sLAogICAgICAgICoqa3dhcmdzLAogICAgKToKICAgICAgICBzZWxmLl9sb2dfbWV0cmljcygpCgogICAgICAgIGlmIHNlbGYuX2lzX3RyYWluaW5nOgogICAgICAgICAgICByZXR1cm4KCiAgICAgICAgIyBUT0RPOiBVcGRhdGUgdGhlIG1vZGVsIG9iamVjdAoKICAgIGRlZiBfbG9nX21ldHJpY3Moc2VsZik6CiAgICAgICAgZm9yIG1ldHJpY19uYW1lLCBtZXRyaWNfc2NvcmVzIGluIHNlbGYuX21ldHJpY19zY29yZXMuaXRlbXMoKToKICAgICAgICAgICAgc2VsZi5fY29udGV4dC5sb2dfcmVzdWx0KGtleT1tZXRyaWNfbmFtZSwgdmFsdWU9bWV0cmljX3Njb3Jlc1stMV0pCiAgICAgICAgICAgIGlmIGxlbihtZXRyaWNfc2NvcmVzKSA+IDE6CiAgICAgICAgICAgICAgICBzZWxmLl9sb2dfbWV0cmljX3Bsb3QobmFtZT1tZXRyaWNfbmFtZSwgc2NvcmVzPW1l
dHJpY19zY29yZXMpCiAgICAgICAgc2VsZi5fY29udGV4dC5jb21taXQoY29tcGxldGVkPUZhbHNlKQoKICAgIGRlZiBfbG9nX21ldHJpY19wbG90KHNlbGYsIG5hbWU6IHN0ciwgc2NvcmVzOiBMaXN0W2Zsb2F0XSk6CiAgICAgICAgIyBJbml0aWFsaXplIGEgcGxvdGx5IGZpZ3VyZToKICAgICAgICBtZXRyaWNfZmlndXJlID0gZ28uRmlndXJlKCkKCiAgICAgICAgIyBBZGQgdGl0bGVzOgogICAgICAgIG1ldHJpY19maWd1cmUudXBkYXRlX2xheW91dCgKICAgICAgICAgICAgdGl0bGU9bmFtZS5jYXBpdGFsaXplKCkucmVwbGFjZSgiXyIsICIgIiksCiAgICAgICAgICAgIHhheGlzX3RpdGxlPSJTYW1wbGVzIiwKICAgICAgICAgICAgeWF4aXNfdGl0bGU9IlNjb3JlcyIsCiAgICAgICAgKQoKICAgICAgICAjIERyYXc6CiAgICAgICAgbWV0cmljX2ZpZ3VyZS5hZGRfdHJhY2UoCiAgICAgICAgICAgIGdvLlNjYXR0ZXIoeD1ucC5hcmFuZ2UobGVuKHNjb3JlcykpLCB5PXNjb3JlcywgbW9kZT0ibGluZXMiKQogICAgICAgICkKCiAgICAgICAgIyBDcmVhdGUgdGhlIHBsb3RseSBhcnRpZmFjdDoKICAgICAgICBhcnRpZmFjdF9uYW1lID0gZiJ7bmFtZX1fcGxvdCIKICAgICAgICBhcnRpZmFjdCA9IFBsb3RseUFydGlmYWN0KGtleT1hcnRpZmFjdF9uYW1lLCBmaWd1cmU9bWV0cmljX2ZpZ3VyZSkKICAgICAgICBzZWxmLl9hcnRpZmFjdHNbYXJ0aWZhY3RfbmFtZV0gPSBzZWxmLl9jb250ZXh0LmxvZ19hcnRpZmFjdChhcnRpZmFjdCkKCgpkZWYgX2FwcGx5X21scnVuX29uX3RyYWluZXIoCiAgICB0cmFpbmVyOiB0cmFuc2Zvcm1lcnMuVHJhaW5lciwKICAgIG1vZGVsX25hbWU6IHN0ciA9IE5vbmUsCiAgICB0YWc6IHN0ciA9ICIiLAogICAgY29udGV4dDogbWxydW4uTUxDbGllbnRDdHggPSBOb25lLAogICAgYXV0b19sb2c6IGJvb2wgPSBUcnVlLAogICAgbGFiZWxzOiBEaWN0W3N0ciwgc3RyXSA9IE5vbmUsCiAgICBleHRyYV9kYXRhOiBkaWN0ID0gTm9uZSwKICAgICoqa3dhcmdzLAopOgogICAgIyBHZXQgcGFyYW1ldGVycyBkZWZhdWx0czoKICAgIGlmIGNvbnRleHQgaXMgTm9uZToKICAgICAgICBjb250ZXh0ID0gbWxydW4uZ2V0X29yX2NyZWF0ZV9jdHgoSEZUcmFpbmVyTUxSdW5JbnRlcmZhY2UuREVGQVVMVF9DT05URVhUX05BTUUpCgogICAgSEZUcmFpbmVyTUxSdW5JbnRlcmZhY2UuYWRkX2ludGVyZmFjZShvYmo9dHJhaW5lcikKCiAgICBpZiBhdXRvX2xvZzoKICAgICAgICB0cmFpbmVyLmFkZF9jYWxsYmFjaygKICAgICAgICAgICAgTUxSdW5DYWxsYmFjaygKICAgICAgICAgICAgICAgIGNvbnRleHQ9Y29udGV4dCwKICAgICAgICAgICAgICAgIG1vZGVsX25hbWU9bW9kZWxfbmFtZSwKICAgICAgICAgICAgICAgIHRhZz10YWcsCiAgICAgICAgICAgICAgICBsYWJlbHM9bGFiZWxzLAogICAgICAgICAgICAgICAgZXh0cmFfZGF0YT1leHRyYV9kYXRhLAogICAgICAgICAgICApCiAgICAgICAgKQoKCmRlZiBfYXBwbHlfbWxydW5f
b25fb3B0aW1pemVyKAogICAgb3B0aW1pemVyLAogICAgbW9kZWxfbmFtZTogc3RyID0gTm9uZSwKICAgIHRhZzogc3RyID0gIiIsCiAgICBjb250ZXh0OiBtbHJ1bi5NTENsaWVudEN0eCA9IE5vbmUsCiAgICBhdXRvX2xvZzogYm9vbCA9IFRydWUsCiAgICBsYWJlbHM6IERpY3Rbc3RyLCBzdHJdID0gTm9uZSwKICAgIGV4dHJhX2RhdGE6IGRpY3QgPSBOb25lLAogICAgKiprd2FyZ3MsCik6CiAgICAjIEdldCBwYXJhbWV0ZXJzIGRlZmF1bHRzOgogICAgaWYgY29udGV4dCBpcyBOb25lOgogICAgICAgIGNvbnRleHQgPSBtbHJ1bi5nZXRfb3JfY3JlYXRlX2N0eCgKICAgICAgICAgICAgSEZPUlRPcHRpbWl6ZXJNTFJ1bkludGVyZmFjZS5ERUZBVUxUX0NPTlRFWFRfTkFNRQogICAgICAgICkKCiAgICBIRk9SVE9wdGltaXplck1MUnVuSW50ZXJmYWNlLmFkZF9pbnRlcmZhY2Uob2JqPW9wdGltaXplcikKCiAgICBpZiBhdXRvX2xvZzoKICAgICAgICBvcHRpbWl6ZXIuZW5hYmxlX2F1dG9fbG9nZ2luZygKICAgICAgICAgICAgY29udGV4dD1jb250ZXh0LAogICAgICAgICAgICBtb2RlbF9uYW1lPW1vZGVsX25hbWUsCiAgICAgICAgICAgIHRhZz10YWcsCiAgICAgICAgICAgIGxhYmVscz1sYWJlbHMsCiAgICAgICAgICAgIGV4dHJhX2RhdGE9ZXh0cmFfZGF0YSwKICAgICAgICApCgoKZGVmIGFwcGx5X21scnVuKAogICAgaHVnZ2luZ2ZhY2Vfb2JqZWN0LAogICAgbW9kZWxfbmFtZTogc3RyID0gTm9uZSwKICAgIHRhZzogc3RyID0gIiIsCiAgICBjb250ZXh0OiBtbHJ1bi5NTENsaWVudEN0eCA9IE5vbmUsCiAgICBhdXRvX2xvZzogYm9vbCA9IFRydWUsCiAgICBsYWJlbHM6IERpY3Rbc3RyLCBzdHJdID0gTm9uZSwKICAgIGV4dHJhX2RhdGE6IGRpY3QgPSBOb25lLAogICAgKiprd2FyZ3MsCik6CiAgICAiIiIKICAgIFdyYXAgdGhlIGdpdmVuIG1vZGVsIHdpdGggTUxSdW4ncyBpbnRlcmZhY2UgcHJvdmlkaW5nIGl0IHdpdGggbWxydW4ncyBhZGRpdGlvbmFsIGZlYXR1cmVzLgogICAgOnBhcmFtIGh1Z2dpbmdmYWNlX29iamVjdDogVGhlIG1vZGVsIHRvIHdyYXAuIENhbiBiZSBsb2FkZWQgZnJvbSB0aGUgbW9kZWwgcGF0aCBnaXZlbiBhcyB3ZWxsLgogICAgOnBhcmFtIG1vZGVsX25hbWU6ICAgICAgICAgVGhlIG1vZGVsIG5hbWUgdG8gdXNlIGZvciBzdG9yaW5nIHRoZSBtb2RlbCBhcnRpZmFjdC4gRGVmYXVsdDogIm1vZGVsIi4KICAgIDpwYXJhbSB0YWc6ICAgICAgICAgICAgICAgIFRoZSBtb2RlbCdzIHRhZyB0byBsb2cgd2l0aC4KICAgIDpwYXJhbSBjb250ZXh0OiAgICAgICAgICAgIE1MUnVuIGNvbnRleHQgdG8gd29yayB3aXRoLiBJZiBubyBjb250ZXh0IGlzIGdpdmVuIGl0IHdpbGwgYmUgcmV0cmlldmVkIHZpYQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgJ21scnVuLmdldF9vcl9jcmVhdGVfY3R4KE5vbmUpJwogICAgOnBhcmFtIGF1dG9fbG9nOiAgICAgICAgICAgV2hldGhlciB0byBlbmFibGUgTUxSdW4ncyBh
dXRvIGxvZ2dpbmcuIERlZmF1bHQ6IFRydWUuCiAgICAiIiIKCiAgICBpZiBpc2luc3RhbmNlKGh1Z2dpbmdmYWNlX29iamVjdCwgdHJhbnNmb3JtZXJzLlRyYWluZXIpOgogICAgICAgIHJldHVybiBfYXBwbHlfbWxydW5fb25fdHJhaW5lcigKICAgICAgICAgICAgdHJhaW5lcj1odWdnaW5nZmFjZV9vYmplY3QsCiAgICAgICAgICAgIG1vZGVsX25hbWU9bW9kZWxfbmFtZSwKICAgICAgICAgICAgdGFnPXRhZywKICAgICAgICAgICAgY29udGV4dD1jb250ZXh0LAogICAgICAgICAgICBhdXRvX2xvZz1hdXRvX2xvZywKICAgICAgICAgICAgbGFiZWxzPWxhYmVscywKICAgICAgICAgICAgZXh0cmFfZGF0YT1leHRyYV9kYXRhLAogICAgICAgICkKICAgIGltcG9ydCBvcHRpbXVtLm9ubnhydW50aW1lIGFzIG9wdGltdW1fb3J0CgogICAgaWYgaXNpbnN0YW5jZShodWdnaW5nZmFjZV9vYmplY3QsIG9wdGltdW1fb3J0Lk9SVE9wdGltaXplcik6CiAgICAgICAgcmV0dXJuIF9hcHBseV9tbHJ1bl9vbl9vcHRpbWl6ZXIoCiAgICAgICAgICAgIG9wdGltaXplcj1odWdnaW5nZmFjZV9vYmplY3QsCiAgICAgICAgICAgIG1vZGVsX25hbWU9bW9kZWxfbmFtZSwKICAgICAgICAgICAgdGFnPXRhZywKICAgICAgICAgICAgY29udGV4dD1jb250ZXh0LAogICAgICAgICAgICBhdXRvX2xvZz1hdXRvX2xvZywKICAgICAgICAgICAgbGFiZWxzPWxhYmVscywKICAgICAgICAgICAgZXh0cmFfZGF0YT1leHRyYV9kYXRhLAogICAgICAgICkKICAgIHJhaXNlIG1scnVuLmVycm9ycy5NTFJ1bkludmFsaWRBcmd1bWVudEVycm9yCgoKIyAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tIGZyb20gYXV0b190cmFpbmVyLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0KY2xhc3MgS1dBcmdzUHJlZml4ZXM6CiAgICBNT0RFTF9DTEFTUyA9ICJDTEFTU18iCiAgICBGSVQgPSAiRklUXyIKICAgIFRSQUlOID0gIlRSQUlOXyIKICAgIFBSRURJQ1QgPSAiUFJFRElDVF8iCgoKZGVmIF9nZXRfc3ViX2RpY3RfYnlfcHJlZml4KHNyYzogRGljdCwgcHJlZml4X2tleTogc3RyKSAtPiBEaWN0W3N0ciwgQW55XToKICAgICIiIgogICAgQ29sbGVjdCBhbGwgdGhlIGtleXMgZnJvbSB0aGUgZ2l2ZW4gZGljdCB0aGF0IHN0YXJ0cyB3aXRoIHRoZSBnaXZlbiBwcmVmaXggYW5kIGNyZWF0ZXMgYSBuZXcgZGljdGlvbmFyeSB3aXRoIHRoZXNlCiAgICBrZXlzLgoKICAgIDpwYXJhbSBzcmM6ICAgICAgICAgVGhlIHNvdXJjZSBkaWN0IHRvIGV4dHJhY3QgdGhlIHZhbHVlcyBmcm9tLgogICAgOnBhcmFtIHByZWZpeF9rZXk6ICBPbmx5IGtleXMgd2l0aCB0aGlzIHByZWZpeCB3aWxsIGJlIHJldHVybmVkLiBUaGUga2V5cyBpbiB0aGUgcmVzdWx0IGRpY3Qgd2lsbCBiZSB3aXRob3V0IHRoaXMKICAgICAgICAgICAgICAgICAgICAgICAgcHJlZml4LgogICAgIiIiCiAgICByZXR1cm4gewogICAgICAgIGtleS5yZXBsYWNlKHByZWZpeF9rZXksICIiKTogdmFsCiAgICAgICAgZm9yIGtleSwg
dmFsIGluIHNyYy5pdGVtcygpCiAgICAgICAgaWYga2V5LnN0YXJ0c3dpdGgocHJlZml4X2tleSkKICAgIH0KCgpkZWYgX2dldF9kYXRhZnJhbWUoCiAgICBjb250ZXh0OiBNTENsaWVudEN0eCwKICAgIGRhdGFzZXQ6IERhdGFJdGVtLAogICAgbGFiZWxfY29sdW1uczogT3B0aW9uYWxbVW5pb25bc3RyLCBMaXN0W3N0cl1dXSA9IE5vbmUsCiAgICBkcm9wX2NvbHVtbnM6IFVuaW9uW3N0ciwgTGlzdFtzdHJdLCBpbnQsIExpc3RbaW50XV0gPSBOb25lLAopIC0+IFR1cGxlW3BkLkRhdGFGcmFtZSwgT3B0aW9uYWxbVW5pb25bc3RyLCBMaXN0W3N0cl1dXV06CiAgICAiIiIKICAgIEdldHRpbmcgdGhlIERhdGFGcmFtZSBvZiB0aGUgZGF0YXNldCBhbmQgZHJvcCB0aGUgY29sdW1ucyBhY2NvcmRpbmdseS4KCiAgICA6cGFyYW0gY29udGV4dDogICAgICAgICBNTFJ1biBjb250ZXh0LgogICAgOnBhcmFtIGRhdGFzZXQ6ICAgICAgICAgVGhlIGRhdGFzZXQgdG8gdHJhaW4gdGhlIG1vZGVsIG9uLgogICAgICAgICAgICAgICAgICAgICAgICAgICAgQ2FuIGJlIGVpdGhlciBhIGxpc3Qgb2YgbGlzdHMsIGRpY3QsIFVSSSBvciBhIEZlYXR1cmVWZWN0b3IuCiAgICA6cGFyYW0gbGFiZWxfY29sdW1uczogICBUaGUgdGFyZ2V0IGxhYmVsKHMpIG9mIHRoZSBjb2x1bW4ocykgaW4gdGhlIGRhdGFzZXQuIGZvciBSZWdyZXNzaW9uIG9yCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBDbGFzc2lmaWNhdGlvbiB0YXNrcy4KICAgIDpwYXJhbSBkcm9wX2NvbHVtbnM6ICAgIHN0ci9pbnQgb3IgYSBsaXN0IG9mIHN0cmluZ3MvaW50cyB0aGF0IHJlcHJlc2VudCB0aGUgY29sdW1uIG5hbWVzL2luZGljZXMgdG8gZHJvcC4KICAgICIiIgogICAgaWYgaXNpbnN0YW5jZShkYXRhc2V0LCAobGlzdCwgZGljdCkpOgogICAgICAgIGRhdGFzZXQgPSBwZC5EYXRhRnJhbWUoZGF0YXNldCkKICAgICAgICAjIENoZWNraW5nIGlmIGRyb3BfY29sdW1ucyBwcm92aWRlZCBieSBpbnRlZ2VyIHR5cGU6CiAgICAgICAgaWYgZHJvcF9jb2x1bW5zOgogICAgICAgICAgICBpZiBpc2luc3RhbmNlKGRyb3BfY29sdW1ucywgc3RyKSBvciAoCiAgICAgICAgICAgICAgICBpc2luc3RhbmNlKGRyb3BfY29sdW1ucywgbGlzdCkKICAgICAgICAgICAgICAgIGFuZCBhbnkoaXNpbnN0YW5jZShjb2wsIHN0cikgZm9yIGNvbCBpbiBkcm9wX2NvbHVtbnMpCiAgICAgICAgICAgICk6CiAgICAgICAgICAgICAgICBjb250ZXh0LmxvZ2dlci5lcnJvcigKICAgICAgICAgICAgICAgICAgICAiZHJvcF9jb2x1bW5zIG11c3QgYmUgYW4gaW50ZWdlci9saXN0IG9mIGludGVnZXJzIGlmIG5vdCBwcm92aWRlZCB3aXRoIGEgVVJJL0ZlYXR1cmVWZWN0b3IgZGF0YXNldCIKICAgICAgICAgICAgICAgICkKICAgICAgICAgICAgICAgIHJhaXNlIFZhbHVlRXJyb3IKICAgICAgICAgICAgZGF0YXNldC5kcm9wKGRyb3BfY29sdW1ucywgYXhpcz0xLCBpbnBsYWNlPVRydWUpCgogICAgICAgIHJldHVybiBk
YXRhc2V0LCBsYWJlbF9jb2x1bW5zCgogICAgc3RvcmVfdXJpX3ByZWZpeCwgXyA9IG1scnVuLmRhdGFzdG9yZS5wYXJzZV9zdG9yZV91cmkoZGF0YXNldC5hcnRpZmFjdF91cmwpCiAgICBpZiBtbHJ1bi51dGlscy5TdG9yZVByZWZpeC5GZWF0dXJlVmVjdG9yID09IHN0b3JlX3VyaV9wcmVmaXg6CiAgICAgICAgIyBmZWF0dXJlLXZlY3RvciBjYXNlOgogICAgICAgIGxhYmVsX2NvbHVtbnMgPSBsYWJlbF9jb2x1bW5zIG9yIGRhdGFzZXQubWV0YS5zdGF0dXMubGFiZWxfY29sdW1uCiAgICAgICAgZGF0YXNldCA9IGZzLmdldF9vZmZsaW5lX2ZlYXR1cmVzKAogICAgICAgICAgICBkYXRhc2V0Lm1ldGEudXJpLCBkcm9wX2NvbHVtbnM9ZHJvcF9jb2x1bW5zCiAgICAgICAgKS50b19kYXRhZnJhbWUoKQoKICAgICAgICBjb250ZXh0LmxvZ2dlci5pbmZvKGYibGFiZWwgY29sdW1uczoge2xhYmVsX2NvbHVtbnN9IikKICAgIGVsc2U6CiAgICAgICAgIyBzaW1wbGUgVVJMIGNhc2U6CiAgICAgICAgZGF0YXNldCA9IGRhdGFzZXQuYXNfZGYoKQogICAgICAgIGlmIGRyb3BfY29sdW1uczoKICAgICAgICAgICAgaWYgYWxsKGNvbCBpbiBkYXRhc2V0IGZvciBjb2wgaW4gZHJvcF9jb2x1bW5zKToKICAgICAgICAgICAgICAgIGRhdGFzZXQgPSBkYXRhc2V0LmRyb3AoZHJvcF9jb2x1bW5zLCBheGlzPTEpCiAgICAgICAgICAgIGVsc2U6CiAgICAgICAgICAgICAgICBjb250ZXh0LmxvZ2dlci5pbmZvKAogICAgICAgICAgICAgICAgICAgICJub3QgYWxsIG9mIHRoZSBjb2x1bW5zIHRvIGRyb3AgaW4gdGhlIGRhdGFzZXQsIGRyb3AgY29sdW1ucyBwcm9jZXNzIHNraXBwZWQiCiAgICAgICAgICAgICAgICApCiAgICByZXR1cm4gZGF0YXNldCwgbGFiZWxfY29sdW1ucwoKCiMgLS0tLS0tLS0tLS0tLS0tLS0tLS0tLSBIdWdnaW5nIEZhY2UgVHJhaW5lciAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQoKCmRlZiBfY3JlYXRlX2NvbXB1dGVfbWV0cmljcyhtZXRyaWNzOiBMaXN0W3N0cl0pIC0+IENhbGxhYmxlW1tFdmFsUHJlZGljdGlvbl0sIERpY3RdOgogICAgIiIiCiAgICBUaGlzIGZ1bmN0aW9uIGNyZWF0ZSBhbmQgcmV0dXJucyBhIGZ1bmN0aW9uIHRoYXQgd2lsbCBiZSB1c2VkIHRvIGNvbXB1dGUgbWV0cmljcyBhdCBldmFsdWF0aW9uLgogICAgOnBhcmFtIG1ldHJpY3M6IExpc3Qgb2YgZGlmZmVyZW50IG1ldHJpY3MgZm9yIGV2YWx1YXRlIHRoZSBtb2RlbCBzdWNoIGFzIGYxLCBhY2N1cmFjeSBldGMuCgogICAgOnJldHVybnM6IEZ1bmN0aW9uIHRoYXQgd2lsbCBiZSB1c2VkIHRvIGNvbXB1dGUgbWV0cmljcyBhdCBldmFsdWF0aW9uLgogICAgICAgICAgICAgTXVzdCB0YWtlIGEgW2BFdmFsUHJlZGljdGlvbmBdIGFuZCByZXR1cm4gYSBkaWN0aW9uYXJ5IHN0cmluZyB0byBtZXRyaWMgdmFsdWVzLgogICAgIiIiCgogICAgZGVmIF9jb21wdXRlX21ldHJpY3MoZXZhbF9wcmVkKToKICAgICAgICBsb2dpdHMsIGxhYmVscyA9IGV2YWxf
cHJlZAogICAgICAgIHByZWRpY3Rpb25zID0gbnAuYXJnbWF4KGxvZ2l0cywgYXhpcz0tMSkKICAgICAgICBtZXRyaWNfZGljdF9yZXN1bHRzID0ge30KICAgICAgICBmb3IgbWV0cmljIGluIG1ldHJpY3M6CiAgICAgICAgICAgIGxvYWRfbWV0ID0gbG9hZF9tZXRyaWMobWV0cmljKQogICAgICAgICAgICBtZXRyaWNfcmVzID0gbG9hZF9tZXQuY29tcHV0ZShwcmVkaWN0aW9ucz1wcmVkaWN0aW9ucywgcmVmZXJlbmNlcz1sYWJlbHMpWwogICAgICAgICAgICAgICAgbWV0cmljCiAgICAgICAgICAgIF0KICAgICAgICAgICAgbWV0cmljX2RpY3RfcmVzdWx0c1ttZXRyaWNdID0gbWV0cmljX3JlcwoKICAgICAgICByZXR1cm4gbWV0cmljX2RpY3RfcmVzdWx0cwoKICAgIHJldHVybiBfY29tcHV0ZV9tZXRyaWNzCgoKZGVmIF9lZGl0X2NvbHVtbnMoCiAgICBkYXRhc2V0OiBEYXRhc2V0LAogICAgZHJvcF9jb2x1bW5zOiBMaXN0W3N0cl0gPSBOb25lLAogICAgcmVuYW1lX2NvbHVtbnM6IFtzdHIsIHN0cl0gPSBOb25lLAopIC0+IERhdGFzZXQ6CiAgICAiIiIKICAgIERyb3AgYW5kIHJlbmFtZXMgdGhhdCBjb2x1bW5zIG9mIHRoZSBnaXZlbiBkYXRhc2V0CiAgICA6cGFyYW0gZGF0YXNldDogICAgICAgICBEYXRhc2V0IHRvIHByb2Nlc3MKICAgIDpwYXJhbSBkcm9wX2NvbHVtbnM6ICAgIFRoZSBjb2x1bW5zIHRvIGRyb3AgZnJvbSB0aGUgZGF0YXNldC4KICAgIDpwYXJhbSByZW5hbWVfY29sdW1uczogIERpY3Qgb2YgY29sdW1ucyBybyByZW5hbWUgOiB7PG9sZF9uYW1lPjogPG5ld19uYW1lPiwgLi4ufQoKICAgIDpyZXR1cm5zOiBUaGUgZGF0YXNldCBhZnRlciB0aGUgZGVzaXJlZCBwcm9jZXNzCiAgICAiIiIKICAgIGlmIGRyb3BfY29sdW1uczoKICAgICAgICBkYXRhc2V0ID0gZGF0YXNldC5yZW1vdmVfY29sdW1ucyhkcm9wX2NvbHVtbnMpCiAgICBpZiByZW5hbWVfY29sdW1uczoKICAgICAgICBkYXRhc2V0ID0gZGF0YXNldC5yZW5hbWVfY29sdW1ucyhyZW5hbWVfY29sdW1ucykKICAgIHJldHVybiBkYXRhc2V0CgoKZGVmIF9wcmVwYXJlX2RhdGFzZXQoCiAgICBjb250ZXh0OiBNTENsaWVudEN0eCwKICAgIGRhdGFzZXRfbmFtZTogc3RyLAogICAgbGFiZWxfbmFtZTogc3RyID0gTm9uZSwKICAgIGRyb3BfY29sdW1uczogT3B0aW9uYWxbTGlzdFtzdHJdXSA9IE5vbmUsCiAgICBudW1fb2ZfdHJhaW5fc2FtcGxlczogaW50ID0gTm9uZSwKICAgIHRyYWluX3Rlc3Rfc3BsaXRfc2l6ZTogZmxvYXQgPSBOb25lLAogICAgcmFuZG9tX3N0YXRlOiBpbnQgPSBOb25lLAopIC0+IFR1cGxlW0RhdGFzZXQsIERhdGFzZXRdOgogICAgIiIiCiAgICBMb2FkaW5nIHRoZSBkYXRhc2V0IGFuZCBlZGl0aW5nIHRoZSBjb2x1bW5zCgogICAgOnBhcmFtIGNvbnRleHQ6ICAgICAgICAgICAgICAgICBNTFJ1biBjb250ZXgKICAgIDpwYXJhbSBkYXRhc2V0X25hbWU6ICAgICAgICAgICAgVGhlIG5hbWUgb2YgdGhlIGRhdGFzZXQgdG8gZ2V0IGZyb20gdGhl
IEh1Z2dpbmdGYWNlIGh1YgogICAgOnBhcmFtIGxhYmVsX25hbWU6ICAgICAgICAgICAgICBUaGUgdGFyZ2V0IGxhYmVsIG9mIHRoZSBjb2x1bW4gaW4gdGhlIGRhdGFzZXQuCiAgICA6cGFyYW0gZHJvcF9jb2x1bW5zOiAgICAgICAgICAgIFRoZSBjb2x1bW5zIHRvIGRyb3AgZnJvbSB0aGUgZGF0YXNldC4KICAgIDpwYXJhbSBudW1fb2ZfdHJhaW5fc2FtcGxlczogICAgTWF4IG51bWJlciBvZiB0cmFpbmluZyBzYW1wbGVzLCBmb3IgZGVidWdnaW5nLgogICAgOnBhcmFtIHRyYWluX3Rlc3Rfc3BsaXRfc2l6ZTogICBTaG91bGQgYmUgYmV0d2VlbiAwLjAgYW5kIDEuMCBhbmQgcmVwcmVzZW50IHRoZSBwcm9wb3J0aW9uIG9mIHRoZSBkYXRhc2V0IHRvIGluY2x1ZGUKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgaW4gdGhlIHRlc3Qgc3BsaXQuCiAgICA6cGFyYW0gcmFuZG9tX3N0YXRlOiAgICAgICAgICAgIFJhbmRvbSBzdGF0ZSBmb3IgdHJhaW5fdGVzdF9zcGxpdAoKICAgICIiIgoKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oCiAgICAgICAgZiJMb2FkaW5nIGFuZCBlZGl0aW5nIHtkYXRhc2V0X25hbWV9IGRhdGFzZXQgZnJvbSBIdWdnaW5nIEZhY2UgaHViIgogICAgKQogICAgcmVuYW1lX2NvbHMgPSB7bGFiZWxfbmFtZTogImxhYmVscyJ9CgogICAgIyBMb2FkaW5nIGFuZCBlZGl0aW5nIGRhdGFzZXQ6CiAgICBkYXRhc2V0ID0gbG9hZF9kYXRhc2V0KGRhdGFzZXRfbmFtZSkKCiAgICAjIHRyYWluIHNldAogICAgdHJhaW5fZGF0YXNldCA9IGRhdGFzZXRbInRyYWluIl0KICAgIGlmIG51bV9vZl90cmFpbl9zYW1wbGVzOgogICAgICAgIHRyYWluX2RhdGFzZXQgPSB0cmFpbl9kYXRhc2V0LnNodWZmbGUoc2VlZD1yYW5kb21fc3RhdGUpLnNlbGVjdCgKICAgICAgICAgICAgbGlzdChyYW5nZShudW1fb2ZfdHJhaW5fc2FtcGxlcykpCiAgICAgICAgKQogICAgdHJhaW5fZGF0YXNldCA9IF9lZGl0X2NvbHVtbnModHJhaW5fZGF0YXNldCwgZHJvcF9jb2x1bW5zLCByZW5hbWVfY29scykKCiAgICAjIHRlc3Qgc2V0CiAgICB0ZXN0X2RhdGFzZXQgPSBkYXRhc2V0WyJ0ZXN0Il0KICAgIGlmIHRyYWluX3Rlc3Rfc3BsaXRfc2l6ZSBvciBudW1fb2ZfdHJhaW5fc2FtcGxlczoKICAgICAgICB0cmFpbl90ZXN0X3NwbGl0X3NpemUgPSB0cmFpbl90ZXN0X3NwbGl0X3NpemUgb3IgMC4yCiAgICAgICAgbnVtX29mX3Rlc3Rfc2FtcGxlcyA9IGludCgKICAgICAgICAgICAgKHRyYWluX2RhdGFzZXQubnVtX3Jvd3MgKiB0cmFpbl90ZXN0X3NwbGl0X3NpemUpCiAgICAgICAgICAgIC8vICgxIC0gdHJhaW5fdGVzdF9zcGxpdF9zaXplKQogICAgICAgICkKICAgICAgICB0ZXN0X2RhdGFzZXQgPSB0ZXN0X2RhdGFzZXQuc2h1ZmZsZShzZWVkPXJhbmRvbV9zdGF0ZSkuc2VsZWN0KAogICAgICAgICAgICBsaXN0KHJhbmdlKG51bV9vZl90ZXN0X3NhbXBsZXMpKQogICAgICAgICkKICAgIHRlc3RfZGF0YXNldCA9IF9lZGl0X2NvbHVtbnMo
dGVzdF9kYXRhc2V0LCBkcm9wX2NvbHVtbnMsIHJlbmFtZV9jb2xzKQoKICAgIHJldHVybiB0cmFpbl9kYXRhc2V0LCB0ZXN0X2RhdGFzZXQKCgpkZWYgdHJhaW4oCiAgICBjb250ZXh0OiBNTENsaWVudEN0eCwKICAgIGhmX2RhdGFzZXQ6IHN0ciA9IE5vbmUsCiAgICBkYXRhc2V0OiBEYXRhSXRlbSA9IE5vbmUsCiAgICB0ZXN0X3NldDogRGF0YUl0ZW0gPSBOb25lLAogICAgZHJvcF9jb2x1bW5zOiBPcHRpb25hbFtMaXN0W3N0cl1dID0gTm9uZSwKICAgIHByZXRyYWluZWRfdG9rZW5pemVyOiBzdHIgPSBOb25lLAogICAgcHJldHJhaW5lZF9tb2RlbDogc3RyID0gTm9uZSwKICAgIG1vZGVsX2NsYXNzOiBzdHIgPSBOb25lLAogICAgbW9kZWxfbmFtZTogc3RyID0gImh1Z2dpbmdmYWNlLW1vZGVsIiwKICAgIGxhYmVsX25hbWU6IHN0ciA9ICJsYWJlbHMiLAogICAgdGV4dF9jb2w6IHN0ciA9ICJ0ZXh0IiwKICAgIG51bV9vZl90cmFpbl9zYW1wbGVzOiBpbnQgPSBOb25lLAogICAgdHJhaW5fdGVzdF9zcGxpdF9zaXplOiBmbG9hdCA9IE5vbmUsCiAgICBtZXRyaWNzOiBMaXN0W3N0cl0gPSBOb25lLAogICAgcmFuZG9tX3N0YXRlOiBpbnQgPSBOb25lLAopOgogICAgIiIiCiAgICBUcmFpbmluZyBhbmQgZXZhbHVhdGluZyBhIHByZXRyYWluZWQgbW9kZWwgd2l0aCBhIHByZXRyYWluZWQgdG9rZW5pemVyIG92ZXIgYSBkYXRhc2V0LgogICAgVGhlIGRhdGFzZXQgY2FuIGJlIGVpdGhlciBiZSB0aGUgbmFtZSBvZiB0aGUgZGF0YXNldCB0aGF0IGNvbnRhaW5zIGluIHRoZSBIdWdnaW5nRmFjZSBodWIsCiAgICBvciBhIFVSSSBvciBhIEZlYXR1cmVWZWN0b3IKCiAgICA6cGFyYW0gY29udGV4dDogICAgICAgICAgICAgICAgIE1MUnVuIGNvbnRleHQKICAgIDpwYXJhbSBoZl9kYXRhc2V0OiAgICAgICAgICAgICAgVGhlIG5hbWUgb2YgdGhlIGRhdGFzZXQgdG8gZ2V0IGZyb20gdGhlIEh1Z2dpbmdGYWNlIGh1YgogICAgOnBhcmFtIGRhdGFzZXQ6ICAgICAgICAgICAgICAgICBUaGUgZGF0YXNldCB0byB0cmFpbiB0aGUgbW9kZWwgb24uIENhbiBiZSBlaXRoZXIgYSBVUkkgb3IgYSBGZWF0dXJlVmVjdG9yCiAgICA6cGFyYW0gdGVzdF9zZXQ6ICAgICAgICAgICAgICAgIFRoZSB0ZXN0IHNldCB0byB0cmFpbiB0aGUgbW9kZWwgd2l0aC4KICAgIDpwYXJhbSBkcm9wX2NvbHVtbnM6ICAgICAgICAgICAgVGhlIGNvbHVtbnMgdG8gZHJvcCBmcm9tIHRoZSBkYXRhc2V0LgogICAgOnBhcmFtIHByZXRyYWluZWRfdG9rZW5pemVyOiAgICBUaGUgbmFtZSBvZiB0aGUgcHJldHJhaW5lZCB0b2tlbml6ZXIgZnJvbSB0aGUgSHVnZ2luZ0ZhY2UgaHViLgogICAgOnBhcmFtIHByZXRyYWluZWRfbW9kZWw6ICAgICAgICBUaGUgbmFtZSBvZiB0aGUgcHJldHJhaW5lZCBtb2RlbCBmcm9tIHRoZSBIdWdnaW5nRmFjZSBodWIuCiAgICA6cGFyYW0gbW9kZWxfbmFtZTogICAgICAgICAgICAgIFRoZSBtb2RlbCdzIG5hbWUgdG8gdXNlIGZvciBzdG9yaW5nIHRoZSBt
b2RlbCBhcnRpZmFjdCwgZGVmYXVsdCB0byAnbW9kZWwnCiAgICA6cGFyYW0gbW9kZWxfY2xhc3M6ICAgICAgICAgICAgIFRoZSBjbGFzcyBvZiB0aGUgbW9kZWwsIGUuZy4gYHRyYW5zZm9ybWVycy5BdXRvTW9kZWxGb3JTZXF1ZW5jZUNsYXNzaWZpY2F0aW9uYAogICAgOnBhcmFtIGxhYmVsX25hbWU6ICAgICAgICAgICAgICBUaGUgdGFyZ2V0IGxhYmVsIG9mIHRoZSBjb2x1bW4gaW4gdGhlIGRhdGFzZXQuCiAgICA6cGFyYW0gdGV4dF9jb2w6ICAgICAgICAgICAgICAgIFRoZSBpbnB1dCB0ZXh0IGNvbHVtbiB1biB0aGUgZGF0YXNldC4KICAgIDpwYXJhbSBudW1fb2ZfdHJhaW5fc2FtcGxlczogICAgTWF4IG51bWJlciBvZiB0cmFpbmluZyBzYW1wbGVzLCBmb3IgZGVidWdnaW5nLgogICAgOnBhcmFtIHRyYWluX3Rlc3Rfc3BsaXRfc2l6ZTogICBTaG91bGQgYmUgYmV0d2VlbiAwLjAgYW5kIDEuMCBhbmQgcmVwcmVzZW50IHRoZSBwcm9wb3J0aW9uIG9mIHRoZSBkYXRhc2V0IHRvIGluY2x1ZGUKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgaW4gdGhlIHRlc3Qgc3BsaXQuCiAgICA6cGFyYW0gbWV0cmljczogICAgICAgICAgICAgICAgIExpc3Qgb2YgZGlmZmVyZW50IG1ldHJpY3MgZm9yIGV2YWx1YXRlIHRoZSBtb2RlbCBzdWNoIGFzIGYxLCBhY2N1cmFjeSBldGMuCiAgICA6cGFyYW0gcmFuZG9tX3N0YXRlOiAgICAgICAgICAgIFJhbmRvbSBzdGF0ZSBmb3IgdHJhaW5fdGVzdF9zcGxpdAogICAgIiIiCgogICAgaWYgdHJhaW5fdGVzdF9zcGxpdF9zaXplIGlzIE5vbmUgYW5kIHRlc3Rfc2V0IGlzIE5vbmU6CiAgICAgICAgY29udGV4dC5sb2dnZXIuaW5mbygKICAgICAgICAgICAgIid0cmFpbl90ZXN0X3NwbGl0X3NpemUnIGlzIG5vdCBwcm92aWRlZCwgc2V0dGluZyB0cmFpbl90ZXN0X3NwbGl0X3NpemUgdG8gMC4yIgogICAgICAgICkKICAgICAgICB0cmFpbl90ZXN0X3NwbGl0X3NpemUgPSAwLjIKCiAgICAjIENyZWF0aW5nIHRva2VuaXplcjoKICAgIHRva2VuaXplciA9IEF1dG9Ub2tlbml6ZXIuZnJvbV9wcmV0cmFpbmVkKHByZXRyYWluZWRfdG9rZW5pemVyKQoKICAgIGRlZiBwcmVwcm9jZXNzX2Z1bmN0aW9uKGV4YW1wbGVzKToKICAgICAgICByZXR1cm4gdG9rZW5pemVyKGV4YW1wbGVzW3RleHRfY29sXSwgdHJ1bmNhdGlvbj1UcnVlKQoKICAgICMgcHJlcGFyZSBkYXRhIGZvciB0cmFpbmluZwogICAgaWYgaGZfZGF0YXNldDoKICAgICAgICB0cmFpbl9kYXRhc2V0LCB0ZXN0X2RhdGFzZXQgPSBfcHJlcGFyZV9kYXRhc2V0KAogICAgICAgICAgICBjb250ZXh0LAogICAgICAgICAgICBoZl9kYXRhc2V0LAogICAgICAgICAgICBsYWJlbF9uYW1lLAogICAgICAgICAgICBkcm9wX2NvbHVtbnMsCiAgICAgICAgICAgIG51bV9vZl90cmFpbl9zYW1wbGVzLAogICAgICAgICAgICB0cmFpbl90ZXN0X3NwbGl0X3NpemUsCiAgICAgICAgICAgIHJhbmRvbV9zdGF0ZT1yYW5kb21fc3RhdGUsCiAgICAgICAgKQog
ICAgZWxpZiBkYXRhc2V0OgogICAgICAgICMgR2V0IERhdGFGcmFtZSBieSBVUkwgb3IgYnkgRmVhdHVyZVZlY3RvcjoKICAgICAgICB0cmFpbl9kYXRhc2V0LCBsYWJlbF9uYW1lID0gX2dldF9kYXRhZnJhbWUoCiAgICAgICAgICAgIGNvbnRleHQ9Y29udGV4dCwKICAgICAgICAgICAgZGF0YXNldD1kYXRhc2V0LAogICAgICAgICAgICBsYWJlbF9jb2x1bW5zPWxhYmVsX25hbWUsCiAgICAgICAgICAgIGRyb3BfY29sdW1ucz1kcm9wX2NvbHVtbnMsCiAgICAgICAgKQogICAgICAgIGlmIHRlc3Rfc2V0OgogICAgICAgICAgICB0ZXN0X2RhdGFzZXQsIF8gPSBfZ2V0X2RhdGFmcmFtZSgKICAgICAgICAgICAgICAgIGNvbnRleHQ9Y29udGV4dCwKICAgICAgICAgICAgICAgIGRhdGFzZXQ9dGVzdF9zZXQsCiAgICAgICAgICAgICAgICBsYWJlbF9jb2x1bW5zPWxhYmVsX25hbWUsCiAgICAgICAgICAgICAgICBkcm9wX2NvbHVtbnM9ZHJvcF9jb2x1bW5zLAogICAgICAgICAgICApCiAgICAgICAgZWxzZToKICAgICAgICAgICAgdHJhaW5fZGF0YXNldCwgdGVzdF9kYXRhc2V0ID0gdHJhaW5fdGVzdF9zcGxpdCgKICAgICAgICAgICAgICAgIHRyYWluX2RhdGFzZXQsCiAgICAgICAgICAgICAgICB0ZXN0X3NpemU9dHJhaW5fdGVzdF9zcGxpdF9zaXplLAogICAgICAgICAgICAgICAgcmFuZG9tX3N0YXRlPXJhbmRvbV9zdGF0ZSwKICAgICAgICAgICAgKQogICAgICAgIHRyYWluX2RhdGFzZXQgPSBEYXRhc2V0LmZyb21fcGFuZGFzKHRyYWluX2RhdGFzZXQpCiAgICAgICAgdGVzdF9kYXRhc2V0ID0gRGF0YXNldC5mcm9tX3BhbmRhcyh0ZXN0X2RhdGFzZXQpCiAgICBlbHNlOgogICAgICAgIHJhaXNlIG1scnVuLmVycm9ycy5NTFJ1bkludmFsaWRBcmd1bWVudEVycm9yKAogICAgICAgICAgICAiVHJhaW5pbmcgZGF0YSB3YXMgbm90IHByb3ZpZGVkLiBBIHRyYWluaW5nIGRhdGFzZXQgaXMgbWFuZGF0b3J5IGZvciB0cmFpbmluZy4iCiAgICAgICAgICAgICIgUGxlYXNlIHByb3ZpZGUgYSB0cmFpbmluZyBzZXQgdXNpbmcgb25lIG9mIHRoZSBhcmd1bWVudHMgJ2hmX2RhdGFzZXQnIG9yICdkYXRhc2V0Jy4iCiAgICAgICAgKQoKICAgICMgTWFwcGluZyBkYXRhc2V0cyB3aXRoIHRoZSB0b2tlbml6ZXI6CiAgICB0b2tlbml6ZWRfdHJhaW4gPSB0cmFpbl9kYXRhc2V0Lm1hcChwcmVwcm9jZXNzX2Z1bmN0aW9uLCBiYXRjaGVkPVRydWUpCiAgICB0b2tlbml6ZWRfdGVzdCA9IHRlc3RfZGF0YXNldC5tYXAocHJlcHJvY2Vzc19mdW5jdGlvbiwgYmF0Y2hlZD1UcnVlKQoKICAgICMgQ3JlYXRpbmcgZGF0YSBjb2xsYXRvciBmb3IgYmF0Y2hpbmc6CiAgICBkYXRhX2NvbGxhdG9yID0gRGF0YUNvbGxhdG9yV2l0aFBhZGRpbmcodG9rZW5pemVyPXRva2VuaXplcikKCiAgICAjIFBhcnNpbmcga3dhcmdzOgogICAgdHJhaW5fa3dhcmdzID0gX2dldF9zdWJfZGljdF9ieV9wcmVmaXgoCiAgICAgICAgc3JjPWNvbnRleHQucGFyYW1ldGVycywgcHJlZml4X2tl
eT1LV0FyZ3NQcmVmaXhlcy5UUkFJTgogICAgKQogICAgbW9kZWxfY2xhc3Nfa3dhcmdzID0gX2dldF9zdWJfZGljdF9ieV9wcmVmaXgoCiAgICAgICAgc3JjPWNvbnRleHQucGFyYW1ldGVycywgcHJlZml4X2tleT1LV0FyZ3NQcmVmaXhlcy5NT0RFTF9DTEFTUwogICAgKQoKICAgICMgTG9hZGluZyBvdXIgcHJldHJhaW5lZCBtb2RlbDoKICAgIG1vZGVsX2NsYXNzX2t3YXJnc1sicHJldHJhaW5lZF9tb2RlbF9uYW1lX29yX3BhdGgiXSA9ICgKICAgICAgICBtb2RlbF9jbGFzc19rd2FyZ3MuZ2V0KCJwcmV0cmFpbmVkX21vZGVsX25hbWVfb3JfcGF0aCIpIG9yIHByZXRyYWluZWRfbW9kZWwKICAgICkKICAgIHRyYWluX2t3YXJnc1siaHViX3Rva2VuIl0gPSB0cmFpbl9rd2FyZ3MuZ2V0KCJodWJfdG9rZW4iKSBvciBwcmV0cmFpbmVkX3Rva2VuaXplcgogICAgaWYgbm90IG1vZGVsX2NsYXNzX2t3YXJnc1sicHJldHJhaW5lZF9tb2RlbF9uYW1lX29yX3BhdGgiXToKICAgICAgICByYWlzZSBtbHJ1bi5lcnJvcnMuTUxSdW5SdW50aW1lRXJyb3IoCiAgICAgICAgICAgICJNdXN0IHByb3ZpZGUgcHJldHJhaW5lZF9tb2RlbCBuYW1lIGFzICIKICAgICAgICAgICAgImZ1bmN0aW9uIGFyZ3VtZW50IG9yIGluIGV4dHJhIHBhcmFtcyIKICAgICAgICApCiAgICBtb2RlbCA9IGNyZWF0ZV9jbGFzcyhtb2RlbF9jbGFzcykuZnJvbV9wcmV0cmFpbmVkKCoqbW9kZWxfY2xhc3Nfa3dhcmdzKQoKICAgICMgUHJlcGFyaW5nIHRyYWluaW5nIGFyZ3VtZW50czoKICAgIHRyYWluaW5nX2FyZ3MgPSBUcmFpbmluZ0FyZ3VtZW50cygKICAgICAgICAqKnRyYWluX2t3YXJncywKICAgICkKCiAgICBjb21wdXRlX21ldHJpY3MgPSBfY3JlYXRlX2NvbXB1dGVfbWV0cmljcyhtZXRyaWNzKSBpZiBtZXRyaWNzIGVsc2UgTm9uZQogICAgdHJhaW5lciA9IFRyYWluZXIoCiAgICAgICAgbW9kZWw9bW9kZWwsCiAgICAgICAgYXJncz10cmFpbmluZ19hcmdzLAogICAgICAgIHRyYWluX2RhdGFzZXQ9dG9rZW5pemVkX3RyYWluLAogICAgICAgIGV2YWxfZGF0YXNldD10b2tlbml6ZWRfdGVzdCwKICAgICAgICB0b2tlbml6ZXI9dG9rZW5pemVyLAogICAgICAgIGRhdGFfY29sbGF0b3I9ZGF0YV9jb2xsYXRvciwKICAgICAgICBjb21wdXRlX21ldHJpY3M9Y29tcHV0ZV9tZXRyaWNzLAogICAgKQoKICAgIGFwcGx5X21scnVuKHRyYWluZXIsIG1vZGVsX25hbWU9bW9kZWxfbmFtZSkKCiAgICAjIEFwcGx5IHRyYWluaW5nIHdpdGggZXZhbHVhdGlvbjoKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZiJ0cmFpbmluZyAne21vZGVsX25hbWV9JyIpCiAgICB0cmFpbmVyLnRyYWluKCkKCgpkZWYgX2dldF9tb2RlbF9kaXIobW9kZWxfdXJpOiBzdHIpOgogICAgbW9kZWxfZmlsZSwgXywgXyA9IG1scnVuLmFydGlmYWN0cy5nZXRfbW9kZWwobW9kZWxfdXJpKQogICAgbW9kZWxfZGlyID0gdGVtcGZpbGUuZ2V0dGVtcGRpcigpCiAgICAjIFVuemlwIHRoZSBNb2RlbDoKICAgIHdpdGgg
emlwZmlsZS5aaXBGaWxlKG1vZGVsX2ZpbGUsICJyIikgYXMgemlwX2ZpbGU6CiAgICAgICAgemlwX2ZpbGUuZXh0cmFjdGFsbChtb2RlbF9kaXIpCgogICAgcmV0dXJuIG1vZGVsX2RpcgoKCmRlZiBvcHRpbWl6ZSgKICAgIG1vZGVsX3BhdGg6IHN0ciwKICAgIG1vZGVsX25hbWU6IHN0ciA9ICJvcHRpbWl6ZWRfbW9kZWwiLAogICAgdGFyZ2V0X2Rpcjogc3RyID0gIi4vb3B0aW1pemVkIiwKICAgIG9wdGltaXphdGlvbl9sZXZlbDogaW50ID0gMSwKKToKICAgICIiIgogICAgT3B0aW1pemluZyB0aGUgdHJhbnNmb3JtZXIgbW9kZWwgdXNpbmcgT05OWCBvcHRpbWl6YXRpb24uCgoKICAgIDpwYXJhbSBtb2RlbF9wYXRoOiAgICAgICAgICBUaGUgcGF0aCBvZiB0aGUgbW9kZWwgdG8gb3B0aW1pemUuCiAgICA6cGFyYW0gbW9kZWxfbmFtZTogICAgICAgICAgTmFtZSBvZiB0aGUgb3B0aW1pemVkIG1vZGVsLgogICAgOnBhcmFtIHRhcmdldF9kaXI6ICAgICAgICAgIFRoZSBkaXJlY3RvcnkgdG8gc2F2ZSB0aGUgT05OWCBtb2RlbC4KICAgIDpwYXJhbSBvcHRpbWl6YXRpb25fbGV2ZWw6ICBPcHRpbWl6YXRpb24gbGV2ZWwgcGVyZm9ybWVkIGJ5IE9OTlggUnVudGltZSBvZiB0aGUgbG9hZGVkIGdyYXBoLiAoZGVmYXVsdCBpcyAxKQogICAgIiIiCiAgICAjIFdlIGltcG9ydCB0aGVzZSBpbiB0aGUgZnVuY3Rpb24gc2NvcGUgc28gT05OWCB3b24ndCBiZSBtYW5kYXRvcnkgZm9yIHRoZSBvdGhlciBoYW5kbGVyczoKICAgIGZyb20gb3B0aW11bS5vbm54cnVudGltZSBpbXBvcnQgT1JUTW9kZWxGb3JTZXF1ZW5jZUNsYXNzaWZpY2F0aW9uLCBPUlRPcHRpbWl6ZXIKICAgIGZyb20gb3B0aW11bS5vbm54cnVudGltZS5jb25maWd1cmF0aW9uIGltcG9ydCBPcHRpbWl6YXRpb25Db25maWcKCiAgICBtb2RlbF9kaXIgPSBfZ2V0X21vZGVsX2Rpcihtb2RlbF91cmk9bW9kZWxfcGF0aCkKICAgICMgQ3JlYXRpbmcgY29uZmlndXJhdGlvbiBmb3Igb3B0aW1pemF0aW9uIHN0ZXA6CiAgICBvcHRpbWl6YXRpb25fY29uZmlnID0gT3B0aW1pemF0aW9uQ29uZmlnKG9wdGltaXphdGlvbl9sZXZlbD1vcHRpbWl6YXRpb25fbGV2ZWwpCgogICAgIyBDb252ZXJ0aW5nIG91ciBwcmV0cmFpbmVkIG1vZGVsIHRvIGFuIE9OTlgtUnVudGltZSBtb2RlbDoKICAgIG9ydF9tb2RlbCA9IE9SVE1vZGVsRm9yU2VxdWVuY2VDbGFzc2lmaWNhdGlvbi5mcm9tX3ByZXRyYWluZWQoCiAgICAgICAgbW9kZWxfZGlyLCBmcm9tX3RyYW5zZm9ybWVycz1UcnVlCiAgICApCgogICAgIyBDcmVhdGluZyBhbiBPTk5YLVJ1bnRpbWUgb3B0aW1pemVyIGZyb20gT05OWCBtb2RlbDoKICAgIG9wdGltaXplciA9IE9SVE9wdGltaXplci5mcm9tX3ByZXRyYWluZWQob3J0X21vZGVsKQoKICAgIGFwcGx5X21scnVuKG9wdGltaXplciwgbW9kZWxfbmFtZT1tb2RlbF9uYW1lKQogICAgIyBPcHRpbWl6aW5nIGFuZCBzYXZpbmcgdGhlIE9OTlggbW9kZWw6CiAgICBvcHRpbWl6ZXIub3B0aW1pemUo
c2F2ZV9kaXI9dGFyZ2V0X2Rpciwgb3B0aW1pemF0aW9uX2NvbmZpZz1vcHRpbWl6YXRpb25fY29uZmlnKQo= - base_image: mlrun/mlrun - commands: [] - code_origin: '' - origin_filename: '' - requirements: - - onnx~=1.14.1 - - onnxruntime~=1.16.1 - - optimum~=1.6.4 - - transformers~=4.26.1 - - datasets~=2.10.1 - - scikit-learn~=1.0.2 - entry_points: - add_interface: - name: add_interface - doc: 'Enrich the object with this interface properties, methods and functions, - so it will have this TensorFlow.Keras - - MLRuns features.' - parameters: - - name: cls - - name: obj - type: Trainer - doc: The object to enrich his interface. - - name: restoration - type: MLRunInterfaceRestorationType - doc: Restoration information tuple as returned from 'remove_interface' in - order to add the interface in a certain state. - default: null - outputs: [] - lineno: 146 - has_varargs: false - has_kwargs: false - mlrun_optimize: - name: mlrun_optimize - doc: 'MLRun''s tf.keras.Model.fit wrapper. It will setup the optimizer when - using horovod. The optimizer must be - - passed in a keyword argument and when using horovod, it must be passed as - an Optimizer instance, not a string. - - - raise MLRunInvalidArgumentError: In case the optimizer provided did not follow - the instructions above.' - parameters: - - name: cls - outputs: [] - lineno: 79 - has_varargs: false - has_kwargs: false - wrapper: - name: wrapper - doc: '' - parameters: - - name: self - type: Trainer - outputs: [] - lineno: 173 - has_varargs: true - has_kwargs: true - enable_auto_logging: - name: enable_auto_logging - doc: '' - parameters: - - name: self - - name: context - type: MLClientCtx - - name: model_name - type: str - default: model - - name: tag - type: str - default: '' - - name: labels - type: Dict[str, str] - default: null - - name: extra_data - type: dict - default: null - outputs: [] - lineno: 114 - has_varargs: false - has_kwargs: false - mlrun_train: - name: mlrun_train - doc: 'MLRuns tf.keras.Model.fit wrapper. 
It will setup the optimizer when using - horovod. The optimizer must be - - passed in a keyword argument and when using horovod, it must be passed as - an Optimizer instance, not a string. - - - raise MLRunInvalidArgumentError: In case the optimizer provided did not follow - the instructions above.' - parameters: - - name: cls - outputs: [] - lineno: 164 - has_varargs: false - has_kwargs: false - on_epoch_begin: - name: on_epoch_begin - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 220 - has_varargs: false - has_kwargs: true - on_epoch_end: - name: on_epoch_end - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 229 - has_varargs: false - has_kwargs: true - on_log: - name: on_log - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - - name: logs - type: Dict[str, float] - default: null - outputs: [] - lineno: 238 - has_varargs: false - has_kwargs: true - on_train_begin: - name: on_train_begin - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 262 - has_varargs: false - has_kwargs: true - on_train_end: - name: on_train_end - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - - name: model - type: PreTrainedModel - default: null - - name: tokenizer - type: PreTrainedTokenizer - default: null - outputs: [] - lineno: 271 - has_varargs: false - has_kwargs: true - on_evaluate: - name: on_evaluate - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments 
- - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 322 - has_varargs: false - has_kwargs: true - apply_mlrun: - name: apply_mlrun - doc: Wrap the given model with MLRun's interface providing it with mlrun's additional - features. - parameters: - - name: huggingface_object - doc: The model to wrap. Can be loaded from the model path given as well. - - name: model_name - type: str - doc: 'The model name to use for storing the model artifact. Default: "model".' - default: null - - name: tag - type: str - doc: The model's tag to log with. - default: '' - - name: context - type: MLClientCtx - doc: MLRun context to work with. If no context is given it will be retrieved - via 'mlrun.get_or_create_ctx(None)' - default: null - - name: auto_log - type: bool - doc: 'Whether to enable MLRun''s auto logging. Default: True.' - default: true - - name: labels - type: Dict[str, str] - default: null - - name: extra_data - type: dict - default: null - outputs: [] - lineno: 421 - has_varargs: false - has_kwargs: true - train: - name: train - doc: 'Training and evaluating a pretrained model with a pretrained tokenizer - over a dataset. - - The dataset can be either be the name of the dataset that contains in the - HuggingFace hub, - - or a URI or a FeatureVector' - parameters: - - name: context - type: MLClientCtx - doc: MLRun context - - name: hf_dataset - type: str - doc: The name of the dataset to get from the HuggingFace hub - default: null - - name: dataset - type: DataItem - doc: The dataset to train the model on. Can be either a URI or a FeatureVector - default: null - - name: test_set - type: DataItem - doc: The test set to train the model with. - default: null - - name: drop_columns - type: Optional[List[str]] - doc: The columns to drop from the dataset. - default: null - - name: pretrained_tokenizer - type: str - doc: The name of the pretrained tokenizer from the HuggingFace hub. 
- default: null - - name: pretrained_model - type: str - doc: The name of the pretrained model from the HuggingFace hub. - default: null - - name: model_class - type: str - doc: The class of the model, e.g. `transformers.AutoModelForSequenceClassification` - default: null - - name: model_name - type: str - doc: The model's name to use for storing the model artifact, default to 'model' - default: huggingface-model - - name: label_name - type: str - doc: The target label of the column in the dataset. - default: labels - - name: text_col - type: str - doc: The input text column un the dataset. - default: text - - name: num_of_train_samples - type: int - doc: Max number of training samples, for debugging. - default: null - - name: train_test_split_size - type: float - doc: Should be between 0.0 and 1.0 and represent the proportion of the dataset - to include in the test split. - default: null - - name: metrics - type: List[str] - doc: List of different metrics for evaluate the model such as f1, accuracy - etc. - default: null - - name: random_state - type: int - doc: Random state for train_test_split - default: null - outputs: [] - lineno: 647 - has_varargs: false - has_kwargs: false - preprocess_function: - name: preprocess_function - doc: '' - parameters: - - name: examples - outputs: [] - lineno: 696 - has_varargs: false - has_kwargs: false - optimize: - name: optimize - doc: Optimizing the transformer model using ONNX optimization. - parameters: - - name: model_path - type: str - doc: The path of the model to optimize. - - name: model_name - type: str - doc: Name of the optimized model. - default: optimized_model - - name: target_dir - type: str - doc: The directory to save the ONNX model. - default: ./optimized - - name: optimization_level - type: int - doc: Optimization level performed by ONNX Runtime of the loaded graph. 
(default - is 1) - default: 1 - outputs: [] - lineno: 799 - has_varargs: false - has_kwargs: false - description: Automatic train and optimize functions for HuggingFace framework - default_handler: train - disable_auto_mount: false - clone_target_dir: '' - env: [] - priority_class_name: '' - preemption_mode: prevent - affinity: null - tolerations: null - security_context: {} -verbose: false diff --git a/hugging_face_classifier_trainer/hugging_face_classifier_trainer.ipynb b/hugging_face_classifier_trainer/hugging_face_classifier_trainer.ipynb deleted file mode 100644 index 2768d2dc1..000000000 --- a/hugging_face_classifier_trainer/hugging_face_classifier_trainer.ipynb +++ /dev/null @@ -1,2533 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "\n", - "# MLRun Hugging Face Classifier Trainer Tutorial" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "This notebook shows how to use the handlers of the Hugging Face classifier trainer.\n", - "the following handlers are:\n", - "- `train`\n", - "- `optimize`\n", - "\n", - "All you need is simply **HF model type** and a **HF dataset name**." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": { - "scrolled": true, - "tags": [] - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Requirement already satisfied: onnx~=1.14.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from -r requirements.txt (line 1)) (1.14.1)\n", - "Requirement already satisfied: onnxruntime==1.16.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from -r requirements.txt (line 2)) (1.16.1)\n", - "Requirement already satisfied: optimum~=1.6.4 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from -r requirements.txt (line 3)) (1.6.4)\n", - "Requirement already satisfied: transformers~=4.26.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from -r requirements.txt (line 4)) (4.26.1)\n", - "Requirement already satisfied: datasets~=2.10.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from -r requirements.txt (line 5)) (2.10.1)\n", - "Requirement already satisfied: scikit-learn~=1.0.2 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from -r requirements.txt (line 6)) (1.0.2)\n", - "Requirement already satisfied: coloredlogs in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from onnxruntime==1.16.1->-r requirements.txt (line 2)) (15.0.1)\n", - "Requirement already satisfied: flatbuffers in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from onnxruntime==1.16.1->-r requirements.txt (line 2)) (1.12)\n", - "Requirement already satisfied: numpy>=1.21.6 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from onnxruntime==1.16.1->-r requirements.txt (line 2)) (1.23.5)\n", - "Requirement already satisfied: packaging in /conda/envs/mlrun-base/lib/python3.9/site-packages (from onnxruntime==1.16.1->-r requirements.txt (line 2)) (21.3)\n", - "Requirement already satisfied: protobuf in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from onnxruntime==1.16.1->-r requirements.txt 
(line 2)) (3.20.2)\n", - "Requirement already satisfied: sympy in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from onnxruntime==1.16.1->-r requirements.txt (line 2)) (1.12)\n", - "Requirement already satisfied: typing-extensions>=3.6.2.1 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from onnx~=1.14.1->-r requirements.txt (line 1)) (4.7.1)\n", - "Requirement already satisfied: torch>=1.9 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from optimum~=1.6.4->-r requirements.txt (line 3)) (2.1.2)\n", - "Requirement already satisfied: huggingface-hub>=0.8.0 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from optimum~=1.6.4->-r requirements.txt (line 3)) (0.20.1)\n", - "Requirement already satisfied: filelock in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from transformers~=4.26.1->-r requirements.txt (line 4)) (3.13.1)\n", - "Requirement already satisfied: pyyaml>=5.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from transformers~=4.26.1->-r requirements.txt (line 4)) (5.4.1)\n", - "Requirement already satisfied: regex!=2019.12.17 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from transformers~=4.26.1->-r requirements.txt (line 4)) (2023.12.25)\n", - "Requirement already satisfied: requests in /conda/envs/mlrun-base/lib/python3.9/site-packages (from transformers~=4.26.1->-r requirements.txt (line 4)) (2.31.0)\n", - "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from transformers~=4.26.1->-r requirements.txt (line 4)) (0.13.3)\n", - "Requirement already satisfied: tqdm>=4.27 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from transformers~=4.26.1->-r requirements.txt (line 4)) (4.65.0)\n", - "Requirement already satisfied: pyarrow>=6.0.0 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (11.0.0)\n", - "Requirement already satisfied: 
dill<0.3.7,>=0.3.0 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (0.3.6)\n", - "Requirement already satisfied: pandas in /conda/envs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (1.4.4)\n", - "Requirement already satisfied: xxhash in /conda/envs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (3.3.0)\n", - "Requirement already satisfied: multiprocess in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (0.70.14)\n", - "Requirement already satisfied: fsspec>=2021.11.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from fsspec[http]>=2021.11.1->datasets~=2.10.1->-r requirements.txt (line 5)) (2023.9.2)\n", - "Requirement already satisfied: aiohttp in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (3.9.1)\n", - "Requirement already satisfied: responses<0.19 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from datasets~=2.10.1->-r requirements.txt (line 5)) (0.18.0)\n", - "Requirement already satisfied: scipy>=1.1.0 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from scikit-learn~=1.0.2->-r requirements.txt (line 6)) (1.11.4)\n", - "Requirement already satisfied: joblib>=0.11 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from scikit-learn~=1.0.2->-r requirements.txt (line 6)) (1.3.2)\n", - "Requirement already satisfied: threadpoolctl>=2.0.0 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from scikit-learn~=1.0.2->-r requirements.txt (line 6)) (3.2.0)\n", - "Requirement already satisfied: attrs>=17.3.0 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from aiohttp->datasets~=2.10.1->-r requirements.txt (line 5)) (19.1.0)\n", - "Requirement already satisfied: multidict<7.0,>=4.5 in 
/conda/envs/mlrun-base/lib/python3.9/site-packages (from aiohttp->datasets~=2.10.1->-r requirements.txt (line 5)) (6.0.4)\n", - "Requirement already satisfied: yarl<2.0,>=1.0 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from aiohttp->datasets~=2.10.1->-r requirements.txt (line 5)) (1.9.2)\n", - "Requirement already satisfied: frozenlist>=1.1.1 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from aiohttp->datasets~=2.10.1->-r requirements.txt (line 5)) (1.4.0)\n", - "Requirement already satisfied: aiosignal>=1.1.2 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from aiohttp->datasets~=2.10.1->-r requirements.txt (line 5)) (1.3.1)\n", - "Requirement already satisfied: async-timeout<5.0,>=4.0 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from aiohttp->datasets~=2.10.1->-r requirements.txt (line 5)) (4.0.3)\n", - "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from packaging->onnxruntime==1.16.1->-r requirements.txt (line 2)) (3.1.1)\n", - "Requirement already satisfied: charset-normalizer<4,>=2 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from requests->transformers~=4.26.1->-r requirements.txt (line 4)) (2.1.1)\n", - "Requirement already satisfied: idna<4,>=2.5 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from requests->transformers~=4.26.1->-r requirements.txt (line 4)) (3.4)\n", - "Requirement already satisfied: urllib3<3,>=1.21.1 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from requests->transformers~=4.26.1->-r requirements.txt (line 4)) (1.26.16)\n", - "Requirement already satisfied: certifi>=2017.4.17 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from requests->transformers~=4.26.1->-r requirements.txt (line 4)) (2023.7.22)\n", - "Requirement already satisfied: networkx in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (3.2.1)\n", - "Requirement 
already satisfied: jinja2 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (3.1.3)\n", - "Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (12.1.105)\n", - "Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (12.1.105)\n", - "Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (12.1.105)\n", - "Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (8.9.2.26)\n", - "Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (12.1.3.1)\n", - "Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (11.0.2.54)\n", - "Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (10.3.2.106)\n", - "Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (11.4.5.107)\n", - "Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) 
(12.1.0.106)\n", - "Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (2.18.1)\n", - "Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (12.1.105)\n", - "Requirement already satisfied: triton==2.1.0 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (2.1.0)\n", - "Requirement already satisfied: nvidia-nvjitlink-cu12 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (12.3.101)\n", - "Requirement already satisfied: sentencepiece!=0.1.92,>=0.1.91 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from transformers[sentencepiece]>=4.26.0->optimum~=1.6.4->-r requirements.txt (line 3)) (0.2.0)\n", - "Requirement already satisfied: humanfriendly>=9.1 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from coloredlogs->onnxruntime==1.16.1->-r requirements.txt (line 2)) (9.2)\n", - "Requirement already satisfied: python-dateutil>=2.8.1 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from pandas->datasets~=2.10.1->-r requirements.txt (line 5)) (2.8.2)\n", - "Requirement already satisfied: pytz>=2020.1 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from pandas->datasets~=2.10.1->-r requirements.txt (line 5)) (2023.3.post1)\n", - "Requirement already satisfied: mpmath>=0.19 in /User/.pythonlibs/mlrun-base/lib/python3.9/site-packages (from sympy->onnxruntime==1.16.1->-r requirements.txt (line 2)) (1.3.0)\n", - "Requirement already satisfied: six>=1.5 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from python-dateutil>=2.8.1->pandas->datasets~=2.10.1->-r requirements.txt (line 5)) (1.16.0)\n", - "Requirement 
already satisfied: MarkupSafe>=2.0 in /conda/envs/mlrun-base/lib/python3.9/site-packages (from jinja2->torch>=1.9->optimum~=1.6.4->-r requirements.txt (line 3)) (2.1.3)\n", - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], - "source": [ - "%pip install -r requirements.txt" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [], - "source": [ - "import mlrun" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:10:17,091 [info] Project loaded successfully: {'project_name': 'hugging-face-trainer'}\n" - ] - } - ], - "source": [ - "project = mlrun.get_or_create_project('hugging-face-trainer', context=\"./\", user_project=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "### **Importing the hugging_face_classifier_trainer function from the Marketplace**" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [], - "source": [ - "hugging_face_classifier_trainer = mlrun.import_function(\"hub://hugging_face_classifier_trainer\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "### **Training a model**\n", - "\n", - "Choosing the `train` handler" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "#### Define task parameters¶\n", - "* Class parameters should contain the prefix `CLASS_`\n", - "* Train parameters should contain the prefix `TRAIN_`" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [], - "source": [ - "model_class = 
\"transformers.AutoModelForSequenceClassification\"\n", - "additional_parameters = {\n", - " \"TRAIN_output_dir\": \"finetuning-sentiment-model-3000-samples\",\n", - " \"TRAIN_learning_rate\": 2e-5,\n", - " \"TRAIN_per_device_train_batch_size\": 16,\n", - " \"TRAIN_per_device_eval_batch_size\": 16,\n", - " \"TRAIN_num_train_epochs\": 3,\n", - " \"TRAIN_weight_decay\": 0.01,\n", - " \"TRAIN_push_to_hub\": False,\n", - " \"TRAIN_evaluation_strategy\": \"epoch\",\n", - " \"TRAIN_eval_steps\": 1,\n", - " \"TRAIN_logging_steps\": 1,\n", - " \"CLASS_num_labels\": 2\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "#### Running the Training job with the \"train\" handler" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": { - "pycharm": { - "name": "#%%\n" - }, - "scrolled": true, - "tags": [] - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:10:21,025 [info] Storing function: {'name': 'hugging-face-classifier-trainer-train', 'uid': '514d8d5530c842238b1cc81983cd943e', 'db': 'http://mlrun-api:8080'}\n", - "> 2024-03-24 17:11:03,727 [info] 'train_test_split_size' is not provided, setting train_test_split_size to 0.2\n", - "> 2024-03-24 17:11:03,882 [info] Loading and editing Shayanvsf/US_Airline_Sentiment dataset from Hugging Face hub\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Found cached dataset parquet (/igz/.cache/huggingface/datasets/Shayanvsf___parquet/Shayanvsf--US_Airline_Sentiment-1319c42f87c44b2f/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "f43b1388d0b344888323bec590baadee", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - " 0%| | 0/3 [00:00 2024-03-24 17:11:08,938 [info] training 'huggingface-model'\n" - ] - }, - { - "name": "stderr", - 
"output_type": "stream", - "text": [ - "The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.\n", - "This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n", - "***** Running training *****\n", - " Num examples = 100\n", - " Num Epochs = 3\n", - " Instantaneous batch size per device = 16\n", - " Total train batch size (w. parallel, distributed & accumulation) = 16\n", - " Gradient Accumulation steps = 1\n", - " Total optimization steps = 21\n", - " Number of trainable parameters = 66955010\n", - "You're using a DistilBertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - " \n", - " \n", - " [21/21 00:15, Epoch 3/3]\n", - "
\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
Epoch | Training Loss | Validation Loss | Accuracy | F1
1 | 0.738900 | 0.515311 | 0.791667 | 0.000000
2 | 0.525900 | 0.481563 | 0.791667 | 0.000000
3 | 0.490800 | 0.471675 | 0.791667 | 0.000000

" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.\n", - "***** Running Evaluation *****\n", - " Num examples = 24\n", - " Batch size = 16\n", - "/tmp/tmp0c1aawrq.py:561: FutureWarning:\n", - "\n", - "load_metric is deprecated and will be removed in the next major version of datasets. Use 'evaluate.load' instead, from the new library 🤗 Evaluate: https://huggingface.co/docs/evaluate\n", - "\n", - "The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.\n", - "***** Running Evaluation *****\n", - " Num examples = 24\n", - " Batch size = 16\n", - "The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.\n", - "***** Running Evaluation *****\n", - " Num examples = 24\n", - " Batch size = 16\n", - "\n", - "\n", - "Training completed. Do not forget to share your model on huggingface.co/models =)\n", - "\n", - "\n", - "tokenizer config file saved in /tmp/tokenizer/tokenizer_config.json\n", - "Special tokens file saved in /tmp/tokenizer/special_tokens_map.json\n", - "Configuration saved in /tmp/model/config.json\n", - "Model weights saved in /tmp/model/pytorch_model.bin\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "

\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
hugging-face-trainer-avia | | 0 | Mar 24 17:10:21 | completed | hugging-face-classifier-trainer-train
v3io_user=avia
kind=local
owner=avia
host=jupyter-avia-6454bdd4c5-xz8cg
hf_dataset=Shayanvsf/US_Airline_Sentiment
drop_columns=['airline_sentiment_confidence', 'negativereason_confidence']
pretrained_tokenizer=distilbert-base-uncased
pretrained_model=distilbert-base-uncased
model_class=transformers.AutoModelForSequenceClassification
label_name=airline_sentiment
num_of_train_samples=100
metrics=['accuracy', 'f1']
random_state=42
TRAIN_output_dir=finetuning-sentiment-model-3000-samples
TRAIN_learning_rate=2e-05
TRAIN_per_device_train_batch_size=16
TRAIN_per_device_eval_batch_size=16
TRAIN_num_train_epochs=3
TRAIN_weight_decay=0.01
TRAIN_push_to_hub=False
TRAIN_evaluation_strategy=epoch
TRAIN_eval_steps=1
TRAIN_logging_steps=1
CLASS_num_labels=2
loss=0.4908
learning_rate=0.0
eval_loss=0.47167453169822693
eval_accuracy=0.7916666666666666
eval_f1=0.0
eval_runtime=0.5186
eval_samples_per_second=46.276
eval_steps_per_second=3.856
train_runtime=17.6054
train_samples_per_second=17.04
train_steps_per_second=1.193
total_flos=3327208489680.0
loss_plot
learning_rate_plot
eval_loss_plot
eval_accuracy_plot
eval_f1_plot
eval_runtime_plot
eval_samples_per_second_plot
eval_steps_per_second_plot
tokenizer
model
\n", - "
\n", - "
\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:12:01,880 [info] Run execution finished: {'status': 'completed', 'name': 'hugging-face-classifier-trainer-train'}\n" - ] - } - ], - "source": [ - "train_run = hugging_face_classifier_trainer.run(params={\n", - " \"hf_dataset\": \"Shayanvsf/US_Airline_Sentiment\",\n", - " \"drop_columns\": [\n", - " \"airline_sentiment_confidence\",\n", - " \"negativereason_confidence\",\n", - " ],\n", - " \"pretrained_tokenizer\": \"distilbert-base-uncased\",\n", - " \"pretrained_model\": \"distilbert-base-uncased\",\n", - " \"model_class\": \"transformers.AutoModelForSequenceClassification\",\n", - " \"label_name\": \"airline_sentiment\",\n", - " \"num_of_train_samples\": 100,\n", - " \"metrics\": [\"accuracy\", \"f1\"],\n", - " \"random_state\": 42,\n", - " **additional_parameters\n", - " },\n", - " handler=\"train\",\n", - " local=True,\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "#### The result of the train run" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "{'loss': 0.4908,\n", - " 'learning_rate': 0.0,\n", - " 'eval_loss': 0.47167453169822693,\n", - " 'eval_accuracy': 0.7916666666666666,\n", - " 'eval_f1': 0.0,\n", - " 'eval_runtime': 0.5186,\n", - " 'eval_samples_per_second': 46.276,\n", - " 'eval_steps_per_second': 3.856,\n", - " 'train_runtime': 17.6054,\n", - " 'train_samples_per_second': 17.04,\n", - " 
'train_steps_per_second': 1.193,\n", - " 'total_flos': 3327208489680.0,\n", - " 'loss_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/loss_plot.html',\n", - " 'learning_rate_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/learning_rate_plot.html',\n", - " 'eval_loss_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/eval_loss_plot.html',\n", - " 'eval_accuracy_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/eval_accuracy_plot.html',\n", - " 'eval_f1_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/eval_f1_plot.html',\n", - " 'eval_runtime_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/eval_runtime_plot.html',\n", - " 'eval_samples_per_second_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/eval_samples_per_second_plot.html',\n", - " 'eval_steps_per_second_plot': 'v3io:///projects/hugging-face-trainer-avia/artifacts/hugging-face-classifier-trainer-train/0/eval_steps_per_second_plot.html',\n", - " 'tokenizer': 'store://artifacts/hugging-face-trainer-avia/hugging-face-classifier-trainer-train_tokenizer@514d8d5530c842238b1cc81983cd943e',\n", - " 'model': 'store://artifacts/hugging-face-trainer-avia/huggingface-model@514d8d5530c842238b1cc81983cd943e'}" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "train_run.outputs" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "\n", - "\n", - "\n", - "
\n", - "
\n", - "\n", - "" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "train_run.artifact('loss_plot').show()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "#### Getting the model for evaluating and predicting" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": { - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [], - "source": [ - "model_path = train_run.outputs['model']" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Optimize the model**\n", - "\n", - "Choosing the `optimize` handler\n", - "\n", - "The result of using this handled is an onnx optimized model." - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": { - "scrolled": true, - "tags": [] - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:12:02,020 [info] Storing function: {'name': 'hugging-face-classifier-trainer-optimize', 'uid': 'fbee1ead18444824a4b5c0308a677bf4', 'db': 'http://mlrun-api:8080'}\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/User/.pythonlibs/mlrun-base/lib/python3.9/site-packages/optimum/onnxruntime/configuration.py:726: FutureWarning:\n", - "\n", - "disable_embed_layer_norm will be deprecated soon, use disable_embed_layer_norm_fusion instead, disable_embed_layer_norm_fusion is set to True.\n", - "\n", - "loading configuration file /tmp/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/config.json\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " 
\"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "loading configuration file /tmp/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "loading weights file /tmp/pytorch_model.bin\n", - "All model checkpoint weights were used when initializing DistilBertForSequenceClassification.\n", - "\n", - "All the weights of DistilBertForSequenceClassification were initialized from the model checkpoint at /tmp.\n", - "If your task is similar to the task the model of the checkpoint was trained on, you can already use DistilBertForSequenceClassification for predictions without further training.\n", - "/User/.pythonlibs/mlrun-base/lib/python3.9/site-packages/transformers/models/distilbert/modeling_distilbert.py:218: TracerWarning:\n", - "\n", - "torch.tensor results are registered as constants in the trace. 
You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.\n", - "\n", - "Configuration saved in /tmp/tmp79wjp8m8/config.json\n", - "Could not locate the tokenizer configuration file, will try to use the model config instead.\n", - "loading configuration file /tmp/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "loading configuration file /tmp/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " 
\"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "Could not locate the tokenizer configuration file, will try to use the model config instead.\n", - "loading configuration file /tmp/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "Could not locate the tokenizer configuration file, will try to use the model config instead.\n", - "loading configuration file /tmp/tmp79wjp8m8/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/tmp79wjp8m8\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 
0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "loading configuration file /tmp/tmp79wjp8m8/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/tmp79wjp8m8\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "Could not locate the tokenizer configuration file, will try to use the model config instead.\n", - "loading configuration file /tmp/tmp79wjp8m8/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/tmp79wjp8m8\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " 
\"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "Configuration saved in optimized/config.json\n", - "Could not locate the tokenizer configuration file, will try to use the model config instead.\n", - "loading configuration file /tmp/tmp79wjp8m8/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/tmp79wjp8m8\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "loading configuration file /tmp/tmp79wjp8m8/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/tmp79wjp8m8\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": 
true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "Could not locate the tokenizer configuration file, will try to use the model config instead.\n", - "loading configuration file /tmp/tmp79wjp8m8/config.json\n", - "Model config DistilBertConfig {\n", - " \"_name_or_path\": \"/tmp/tmp79wjp8m8\",\n", - " \"activation\": \"gelu\",\n", - " \"architectures\": [\n", - " \"DistilBertForSequenceClassification\"\n", - " ],\n", - " \"attention_dropout\": 0.1,\n", - " \"dim\": 768,\n", - " \"dropout\": 0.1,\n", - " \"hidden_dim\": 3072,\n", - " \"initializer_range\": 0.02,\n", - " \"max_position_embeddings\": 512,\n", - " \"model_type\": \"distilbert\",\n", - " \"n_heads\": 12,\n", - " \"n_layers\": 6,\n", - " \"pad_token_id\": 0,\n", - " \"problem_type\": \"single_label_classification\",\n", - " \"qa_dropout\": 0.1,\n", - " \"seq_classif_dropout\": 0.2,\n", - " \"sinusoidal_pos_embds\": false,\n", - " \"tie_weights_\": true,\n", - " \"torch_dtype\": \"float32\",\n", - " \"transformers_version\": \"4.26.1\",\n", - " \"vocab_size\": 30522\n", - "}\n", - "\n", - "Failed to remove node input: \"/distilbert/transformer/layer.0/attention/Transpose_output_0\"\n", - "input: \"/distilbert/transformer/layer.0/attention/Constant_11_output_0\"\n", - "output: \"/distilbert/transformer/layer.0/attention/Div_output_0\"\n", - "name: \"/distilbert/transformer/layer.0/attention/Div\"\n", - "op_type: \"Div\"\n", - "\n", - "Failed to remove node input: \"/distilbert/transformer/layer.1/attention/Transpose_output_0\"\n", - "input: \"/distilbert/transformer/layer.1/attention/Constant_11_output_0\"\n", - "output: \"/distilbert/transformer/layer.1/attention/Div_output_0\"\n", - "name: \"/distilbert/transformer/layer.1/attention/Div\"\n", - "op_type: \"Div\"\n", - "\n", - "Failed to remove node input: \"/distilbert/transformer/layer.2/attention/Transpose_output_0\"\n", - "input: 
\"/distilbert/transformer/layer.2/attention/Constant_11_output_0\"\n", - "output: \"/distilbert/transformer/layer.2/attention/Div_output_0\"\n", - "name: \"/distilbert/transformer/layer.2/attention/Div\"\n", - "op_type: \"Div\"\n", - "\n", - "Failed to remove node input: \"/distilbert/transformer/layer.3/attention/Transpose_output_0\"\n", - "input: \"/distilbert/transformer/layer.3/attention/Constant_11_output_0\"\n", - "output: \"/distilbert/transformer/layer.3/attention/Div_output_0\"\n", - "name: \"/distilbert/transformer/layer.3/attention/Div\"\n", - "op_type: \"Div\"\n", - "\n", - "Failed to remove node input: \"/distilbert/transformer/layer.4/attention/Transpose_output_0\"\n", - "input: \"/distilbert/transformer/layer.4/attention/Constant_11_output_0\"\n", - "output: \"/distilbert/transformer/layer.4/attention/Div_output_0\"\n", - "name: \"/distilbert/transformer/layer.4/attention/Div\"\n", - "op_type: \"Div\"\n", - "\n", - "Failed to remove node input: \"/distilbert/transformer/layer.5/attention/Transpose_output_0\"\n", - "input: \"/distilbert/transformer/layer.5/attention/Constant_11_output_0\"\n", - "output: \"/distilbert/transformer/layer.5/attention/Div_output_0\"\n", - "name: \"/distilbert/transformer/layer.5/attention/Div\"\n", - "op_type: \"Div\"\n", - "\n", - "Configuration saved in optimized/config.json\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
hugging-face-trainer-avia | | 0 | Mar 24 17:12:02 | completed | hugging-face-classifier-trainer-optimize
v3io_user=avia
kind=local
owner=avia
host=jupyter-avia-6454bdd4c5-xz8cg
model_path=store://artifacts/hugging-face-trainer-avia/huggingface-model@514d8d5530c842238b1cc81983cd943e
model
\n", - "
\n", - "
\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:12:22,721 [info] Run execution finished: {'status': 'completed', 'name': 'hugging-face-classifier-trainer-optimize'}\n" - ] - } - ], - "source": [ - "optimize_run = hugging_face_classifier_trainer.run(params={\n", - " \"model_path\": str(model_path)\n", - " },\n", - " handler=\"optimize\",\n", - " local=True,\n", - " )" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'model': 'store://artifacts/hugging-face-trainer-avia/optimized_model@fbee1ead18444824a4b5c0308a677bf4'}" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "optimize_run.outputs" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Running the training remotely**\n" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": { - "scrolled": true, - "tags": [] - }, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/User/.pythonlibs/mlrun-base/lib/python3.9/site-packages/mlrun/projects/operations.py:276: OverwriteBuildParamsWarning:\n", - "\n", - "The `overwrite_build_params` parameter default will change from 'False' to 'True' in 1.8.0.\n", - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:14:22,792 [info] Started building image: .mlrun/func-hugging-face-trainer-avia-hugging-face-classifier-trainer:latest\n", - "\u001b[36mINFO\u001b[0m[0000] Retrieving image manifest 
mlrun/mlrun:1.6.1 \n", - "\u001b[36mINFO\u001b[0m[0000] Retrieving image mlrun/mlrun:1.6.1 from registry index.docker.io \n", - "\u001b[36mINFO\u001b[0m[0000] Built cross stage deps: map[] \n", - "\u001b[36mINFO\u001b[0m[0000] Retrieving image manifest mlrun/mlrun:1.6.1 \n", - "\u001b[36mINFO\u001b[0m[0000] Returning cached image manifest \n", - "\u001b[36mINFO\u001b[0m[0000] Executing 0 build triggers \n", - "\u001b[36mINFO\u001b[0m[0000] Building stage 'mlrun/mlrun:1.6.1' [idx: '0', base-idx: '-1'] \n", - "\u001b[36mINFO\u001b[0m[0000] Unpacking rootfs as cmd RUN echo 'Installing /empty/requirements.txt...'; cat /empty/requirements.txt requires it. \n", - "\u001b[36mINFO\u001b[0m[0047] RUN echo 'Installing /empty/requirements.txt...'; cat /empty/requirements.txt \n", - "\u001b[36mINFO\u001b[0m[0047] Initializing snapshotter ... \n", - "\u001b[36mINFO\u001b[0m[0047] Taking snapshot of full filesystem... \n", - "\u001b[36mINFO\u001b[0m[0074] Cmd: /bin/sh \n", - "\u001b[36mINFO\u001b[0m[0074] Args: [-c echo 'Installing /empty/requirements.txt...'; cat /empty/requirements.txt] \n", - "\u001b[36mINFO\u001b[0m[0074] Running: [/bin/sh -c echo 'Installing /empty/requirements.txt...'; cat /empty/requirements.txt] \n", - "Installing /empty/requirements.txt...\n", - "mlrun[complete]==1.6.1\n", - "onnx~=1.14.1\n", - "onnxruntime~=1.16.1\n", - "optimum~=1.6.4\n", - "transformers~=4.26.1\n", - "datasets~=2.10.1\n", - "scikit-learn~=1.0.2\n", - "\u001b[36mINFO\u001b[0m[0074] Taking snapshot of full filesystem... \n", - "\u001b[36mINFO\u001b[0m[0078] No files were changed, appending empty layer to config. No layer added to image. 
\n", - "\u001b[36mINFO\u001b[0m[0078] RUN python -m pip install -r /empty/requirements.txt \n", - "\u001b[36mINFO\u001b[0m[0078] Cmd: /bin/sh \n", - "\u001b[36mINFO\u001b[0m[0078] Args: [-c python -m pip install -r /empty/requirements.txt] \n", - "\u001b[36mINFO\u001b[0m[0078] Running: [/bin/sh -c python -m pip install -r /empty/requirements.txt] \n", - "Requirement already satisfied: mlrun[complete]==1.6.1 in /opt/conda/lib/python3.9/site-packages (from -r /empty/requirements.txt (line 1)) (1.6.1)\n", - "Collecting onnx~=1.14.1 (from -r /empty/requirements.txt (line 2))\n", - " Obtaining dependency information for onnx~=1.14.1 from https://files.pythonhosted.org/packages/ff/24/0e522fdcadf0e15fc304145a5b6e5d7246d7f2c507fd9bfe6e1fafb2aa95/onnx-1.14.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading onnx-1.14.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (15 kB)\n", - "Collecting onnxruntime~=1.16.1 (from -r /empty/requirements.txt (line 3))\n", - " Obtaining dependency information for onnxruntime~=1.16.1 from https://files.pythonhosted.org/packages/de/ab/ed3ae0d649cee41e870f8b1653cf4a1c1fc321e0ded4e3e1a3d4a25c0131/onnxruntime-1.16.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading onnxruntime-1.16.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.3 kB)\n", - "Collecting optimum~=1.6.4 (from -r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for optimum~=1.6.4 from https://files.pythonhosted.org/packages/31/72/a7e3b2c57d6368c5f4bb6fba54a85cbf07d25c385a2db3f1a638f3c0ddb2/optimum-1.6.4-py3-none-any.whl.metadata\n", - " Downloading optimum-1.6.4-py3-none-any.whl.metadata (17 kB)\n", - "Collecting transformers~=4.26.1 (from -r /empty/requirements.txt (line 5))\n", - " Obtaining dependency information for transformers~=4.26.1 from 
https://files.pythonhosted.org/packages/1e/e2/60c3f4691b16d126ee9cfe28f598b13c424b60350ab339aba81aef054b8f/transformers-4.26.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.26.1-py3-none-any.whl.metadata (100 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.3/100.3 kB 6.2 MB/s eta 0:00:00\n", - "Collecting datasets~=2.10.1 (from -r /empty/requirements.txt (line 6))\n", - " Obtaining dependency information for datasets~=2.10.1 from https://files.pythonhosted.org/packages/fe/17/5825fdf034ff1a315becdbb9b6fe5a2bd9d8e724464535f18809593bf9c2/datasets-2.10.1-py3-none-any.whl.metadata\n", - " Downloading datasets-2.10.1-py3-none-any.whl.metadata (20 kB)\n", - "Collecting scikit-learn~=1.0.2 (from -r /empty/requirements.txt (line 7))\n", - " Obtaining dependency information for scikit-learn~=1.0.2 from https://files.pythonhosted.org/packages/57/aa/483fbe6b5314bce2d49801e6cec1f2139a9c220d0d51494788fff47233b3/scikit_learn-1.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading scikit_learn-1.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (10 kB)\n", - "Requirement already satisfied: urllib3<1.27,>=1.26.9 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.26.18)\n", - "Requirement already satisfied: GitPython>=3.1.41,~=3.1 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.1.42)\n", - "Requirement already satisfied: aiohttp~=3.9 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.9.3)\n", - "Requirement already satisfied: aiohttp-retry~=2.8 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.8.3)\n", - "Requirement already satisfied: click~=8.1 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) 
(8.1.7)\n", - "Requirement already satisfied: kfp~=1.8 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.8.22)\n", - "Requirement already satisfied: nest-asyncio~=1.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.6.0)\n", - "Requirement already satisfied: ipython~=8.10 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (8.18.1)\n", - "Requirement already satisfied: nuclio-jupyter~=0.9.15 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.9.16)\n", - "Requirement already satisfied: numpy<1.27.0,>=1.16.5 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.26.4)\n", - "Requirement already satisfied: pandas<2.2,>=1.2 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.1.4)\n", - "Requirement already satisfied: pyarrow<15,>=10.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (14.0.2)\n", - "Requirement already satisfied: pyyaml~=5.1 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.4.1)\n", - "Requirement already satisfied: requests~=2.31 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.31.0)\n", - "Requirement already satisfied: tabulate~=0.8.6 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.8.10)\n", - "Requirement already satisfied: v3io~=0.5.21 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.23)\n", - "Requirement already satisfied: pydantic>=1.10.8,~=1.10 in 
/opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.10.14)\n", - "Requirement already satisfied: mergedeep~=1.3 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.4)\n", - "Requirement already satisfied: v3io-frames~=0.10.12 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.10.13)\n", - "Requirement already satisfied: semver~=3.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.2)\n", - "Requirement already satisfied: dependency-injector~=4.41 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.41.0)\n", - "Requirement already satisfied: fsspec==2023.9.2 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.9.2)\n", - "Requirement already satisfied: v3iofs~=0.1.17 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.1.18)\n", - "Requirement already satisfied: storey~=1.6.18 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.6.18)\n", - "Requirement already satisfied: inflection~=0.5.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.1)\n", - "Requirement already satisfied: python-dotenv~=0.17.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.17.1)\n", - "Requirement already satisfied: setuptools~=68.2 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (68.2.2)\n", - "Requirement already satisfied: deprecated~=1.2 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r 
/empty/requirements.txt (line 1)) (1.2.14)\n", - "Requirement already satisfied: jinja2>=3.1.3,~=3.1 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.1.3)\n", - "Requirement already satisfied: anyio~=3.7 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.7.1)\n", - "Requirement already satisfied: orjson~=3.9 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.9.15)\n", - "Requirement already satisfied: adlfs==2023.9.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.9.0)\n", - "Requirement already satisfied: aiobotocore<2.8,>=2.5.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.5.4)\n", - "Requirement already satisfied: avro~=1.11 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.11.3)\n", - "Requirement already satisfied: azure-core~=1.24 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.30.0)\n", - "Requirement already satisfied: azure-identity~=1.5 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.15.0)\n", - "Requirement already satisfied: azure-keyvault-secrets~=4.2 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.8.0)\n", - "Requirement already satisfied: boto3<1.29.0,>=1.28.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.28.17)\n", - "Requirement already satisfied: dask~=2023.9.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.9.3)\n", - "Requirement already satisfied: 
databricks-sdk~=0.13.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.13.0)\n", - "Requirement already satisfied: distributed~=2023.9.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.9.3)\n", - "Requirement already satisfied: gcsfs==2023.9.2 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.9.2)\n", - "Requirement already satisfied: google-cloud-bigquery[bqstorage,pandas]==3.14.1 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.14.1)\n", - "Requirement already satisfied: graphviz~=0.20.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.20.1)\n", - "Requirement already satisfied: kafka-python~=2.0 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.0.2)\n", - "Requirement already satisfied: mlflow~=2.8 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.10.2)\n", - "Requirement already satisfied: msrest~=0.6.21 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.6.21)\n", - "Requirement already satisfied: plotly<5.12.0,~=5.4 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.11.0)\n", - "Requirement already satisfied: pyopenssl>=23 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (24.0.0)\n", - "Requirement already satisfied: redis~=4.3 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.6.0)\n", - "Requirement already satisfied: s3fs==2023.9.2 in /opt/conda/lib/python3.9/site-packages (from 
mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.9.2)\n", - "Requirement already satisfied: sqlalchemy~=1.4 in /opt/conda/lib/python3.9/site-packages (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.4.51)\n", - "Requirement already satisfied: azure-datalake-store<0.1,>=0.0.46 in /opt/conda/lib/python3.9/site-packages (from adlfs==2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.0.53)\n", - "Requirement already satisfied: azure-storage-blob>=12.12.0 in /opt/conda/lib/python3.9/site-packages (from adlfs==2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (12.19.0)\n", - "Requirement already satisfied: decorator>4.1.2 in /opt/conda/lib/python3.9/site-packages (from gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.1.1)\n", - "Requirement already satisfied: google-auth>=1.2 in /opt/conda/lib/python3.9/site-packages (from gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.28.1)\n", - "Requirement already satisfied: google-auth-oauthlib in /opt/conda/lib/python3.9/site-packages (from gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.2.0)\n", - "Requirement already satisfied: google-cloud-storage in /opt/conda/lib/python3.9/site-packages (from gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.14.0)\n", - "Requirement already satisfied: google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.17.1)\n", - "Requirement already satisfied: google-cloud-core<3.0.0dev,>=1.6.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.4.1)\n", - "Requirement already satisfied: 
google-resumable-media<3.0dev,>=0.6.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.7.0)\n", - "Requirement already satisfied: packaging>=20.0.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (23.1)\n", - "Requirement already satisfied: python-dateutil<3.0dev,>=2.7.2 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.8.2)\n", - "Requirement already satisfied: db-dtypes<2.0.0dev,>=0.3.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.2.0)\n", - "Requirement already satisfied: google-cloud-bigquery-storage<3.0.0dev,>=2.6.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.24.0)\n", - "Requirement already satisfied: grpcio<2.0dev,>=1.47.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.62.0)\n", - "Requirement already satisfied: protobuf>=3.20.2 in /opt/conda/lib/python3.9/site-packages (from onnx~=1.14.1->-r /empty/requirements.txt (line 2)) (3.20.3)\n", - "Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/conda/lib/python3.9/site-packages (from onnx~=1.14.1->-r /empty/requirements.txt (line 2)) (4.10.0)\n", - "Collecting coloredlogs (from onnxruntime~=1.16.1->-r /empty/requirements.txt (line 3))\n", - " Obtaining dependency information for coloredlogs from 
https://files.pythonhosted.org/packages/a7/06/3d6badcf13db419e25b07041d9c7b4a2c331d3f4e7134445ec5df57714cd/coloredlogs-15.0.1-py2.py3-none-any.whl.metadata\n", - " Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)\n", - "Collecting flatbuffers (from onnxruntime~=1.16.1->-r /empty/requirements.txt (line 3))\n", - " Obtaining dependency information for flatbuffers from https://files.pythonhosted.org/packages/bf/45/c961e3cb6ddad76b325c163d730562bb6deb1ace5acbed0306f5fbefb90e/flatbuffers-24.3.7-py2.py3-none-any.whl.metadata\n", - " Downloading flatbuffers-24.3.7-py2.py3-none-any.whl.metadata (849 bytes)\n", - "Collecting sympy (from onnxruntime~=1.16.1->-r /empty/requirements.txt (line 3))\n", - " Obtaining dependency information for sympy from https://files.pythonhosted.org/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl.metadata\n", - " Downloading sympy-1.12-py3-none-any.whl.metadata (12 kB)\n", - "Collecting transformers[sentencepiece]>=4.26.0 (from optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/0a/fd/280f4385e76f3c1890efc15fa93f7206134fefad6351397e1bfab6d0d0de/transformers-4.39.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.39.1-py3-none-any.whl.metadata (134 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.8/134.8 kB 40.1 MB/s eta 0:00:00\n", - "Collecting torch>=1.9 (from optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for torch>=1.9 from https://files.pythonhosted.org/packages/98/04/95a12556d068786d6505c609daf2805bed91c9210c5185499a7c121eba47/torch-2.2.1-cp39-cp39-manylinux1_x86_64.whl.metadata\n", - " Downloading torch-2.2.1-cp39-cp39-manylinux1_x86_64.whl.metadata (25 kB)\n", - "Collecting numpy<1.27.0,>=1.16.5 (from mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1))\n", - " 
Obtaining dependency information for numpy<1.27.0,>=1.16.5 from https://files.pythonhosted.org/packages/4c/b9/038abd6fbd67b05b03cb1af590cfc02b7f1e5a37af7ac6a868f5093c29f5/numpy-1.23.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading numpy-1.23.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.3 kB)\n", - "Collecting huggingface-hub>=0.8.0 (from optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for huggingface-hub>=0.8.0 from https://files.pythonhosted.org/packages/ab/28/d4b691840d73126d4c9845f8a22dad033ac872509b6d3a0d93b456eef424/huggingface_hub-0.21.4-py3-none-any.whl.metadata\n", - " Downloading huggingface_hub-0.21.4-py3-none-any.whl.metadata (13 kB)\n", - "Collecting filelock (from transformers~=4.26.1->-r /empty/requirements.txt (line 5))\n", - " Obtaining dependency information for filelock from https://files.pythonhosted.org/packages/81/54/84d42a0bee35edba99dee7b59a8d4970eccdd44b99fe728ed912106fc781/filelock-3.13.1-py3-none-any.whl.metadata\n", - " Downloading filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)\n", - "Collecting regex!=2019.12.17 (from transformers~=4.26.1->-r /empty/requirements.txt (line 5))\n", - " Obtaining dependency information for regex!=2019.12.17 from https://files.pythonhosted.org/packages/05/9e/80c20f1151432a6025690c9c2037053039b028a7b236fa81d7e7ac9dec60/regex-2023.12.25-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading regex-2023.12.25-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.9/40.9 kB 217.5 MB/s eta 0:00:00\n", - "Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers~=4.26.1->-r /empty/requirements.txt (line 5))\n", - " Obtaining dependency information for tokenizers!=0.11.3,<0.14,>=0.11.1 from 
https://files.pythonhosted.org/packages/d6/27/07a337087dd507170a1b20fed3bbf8da81401185a7130a6e74e440c52040/tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)\n", - "Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.9/site-packages (from transformers~=4.26.1->-r /empty/requirements.txt (line 5)) (4.65.0)\n", - "Collecting dill<0.3.7,>=0.3.0 (from datasets~=2.10.1->-r /empty/requirements.txt (line 6))\n", - " Obtaining dependency information for dill<0.3.7,>=0.3.0 from https://files.pythonhosted.org/packages/be/e3/a84bf2e561beed15813080d693b4b27573262433fced9c1d1fea59e60553/dill-0.3.6-py3-none-any.whl.metadata\n", - " Downloading dill-0.3.6-py3-none-any.whl.metadata (9.8 kB)\n", - "Requirement already satisfied: xxhash in /opt/conda/lib/python3.9/site-packages (from datasets~=2.10.1->-r /empty/requirements.txt (line 6)) (3.4.1)\n", - "Collecting multiprocess (from datasets~=2.10.1->-r /empty/requirements.txt (line 6))\n", - " Obtaining dependency information for multiprocess from https://files.pythonhosted.org/packages/da/d9/f7f9379981e39b8c2511c9e0326d212accacb82f12fbfdc1aa2ce2a7b2b6/multiprocess-0.70.16-py39-none-any.whl.metadata\n", - " Downloading multiprocess-0.70.16-py39-none-any.whl.metadata (7.2 kB)\n", - "Collecting responses<0.19 (from datasets~=2.10.1->-r /empty/requirements.txt (line 6))\n", - " Obtaining dependency information for responses<0.19 from https://files.pythonhosted.org/packages/79/f3/2b3a6dc5986303b3dd1bbbcf482022acb2583c428cd23f0b6d37b1a1a519/responses-0.18.0-py3-none-any.whl.metadata\n", - " Downloading responses-0.18.0-py3-none-any.whl.metadata (29 kB)\n", - "Requirement already satisfied: scipy>=1.1.0 in /opt/conda/lib/python3.9/site-packages (from scikit-learn~=1.0.2->-r /empty/requirements.txt (line 7)) (1.12.0)\n", - "Requirement already satisfied: joblib>=0.11 in 
/opt/conda/lib/python3.9/site-packages (from scikit-learn~=1.0.2->-r /empty/requirements.txt (line 7)) (1.3.2)\n", - "Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.9/site-packages (from scikit-learn~=1.0.2->-r /empty/requirements.txt (line 7)) (3.3.0)\n", - "Requirement already satisfied: botocore<1.31.18,>=1.31.17 in /opt/conda/lib/python3.9/site-packages (from aiobotocore<2.8,>=2.5.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.31.17)\n", - "Requirement already satisfied: wrapt<2.0.0,>=1.10.10 in /opt/conda/lib/python3.9/site-packages (from aiobotocore<2.8,>=2.5.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.16.0)\n", - "Requirement already satisfied: aioitertools<1.0.0,>=0.5.1 in /opt/conda/lib/python3.9/site-packages (from aiobotocore<2.8,>=2.5.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.11.0)\n", - "Requirement already satisfied: aiosignal>=1.1.2 in /opt/conda/lib/python3.9/site-packages (from aiohttp~=3.9->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.1)\n", - "Requirement already satisfied: attrs>=17.3.0 in /opt/conda/lib/python3.9/site-packages (from aiohttp~=3.9->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (23.2.0)\n", - "Requirement already satisfied: frozenlist>=1.1.1 in /opt/conda/lib/python3.9/site-packages (from aiohttp~=3.9->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.4.1)\n", - "Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.9/site-packages (from aiohttp~=3.9->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (6.0.5)\n", - "Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.9/site-packages (from aiohttp~=3.9->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.9.4)\n", - "Requirement already satisfied: async-timeout<5.0,>=4.0 in /opt/conda/lib/python3.9/site-packages (from aiohttp~=3.9->mlrun[complete]==1.6.1->-r 
/empty/requirements.txt (line 1)) (4.0.3)\n", - "Requirement already satisfied: idna>=2.8 in /opt/conda/lib/python3.9/site-packages (from anyio~=3.7->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.4)\n", - "Requirement already satisfied: sniffio>=1.1 in /opt/conda/lib/python3.9/site-packages (from anyio~=3.7->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.1)\n", - "Requirement already satisfied: exceptiongroup in /opt/conda/lib/python3.9/site-packages (from anyio~=3.7->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.2.0)\n", - "Requirement already satisfied: six>=1.11.0 in /opt/conda/lib/python3.9/site-packages (from azure-core~=1.24->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.16.0)\n", - "Requirement already satisfied: cryptography>=2.5 in /opt/conda/lib/python3.9/site-packages (from azure-identity~=1.5->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (42.0.2)\n", - "Requirement already satisfied: msal<2.0.0,>=1.24.0 in /opt/conda/lib/python3.9/site-packages (from azure-identity~=1.5->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.27.0)\n", - "Requirement already satisfied: msal-extensions<2.0.0,>=0.3.0 in /opt/conda/lib/python3.9/site-packages (from azure-identity~=1.5->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.1.0)\n", - "Requirement already satisfied: isodate>=0.6.1 in /opt/conda/lib/python3.9/site-packages (from azure-keyvault-secrets~=4.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.6.1)\n", - "Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /opt/conda/lib/python3.9/site-packages (from boto3<1.29.0,>=1.28.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.0.1)\n", - "Requirement already satisfied: s3transfer<0.7.0,>=0.6.0 in /opt/conda/lib/python3.9/site-packages (from boto3<1.29.0,>=1.28.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.6.2)\n", - "Requirement already 
satisfied: cloudpickle>=1.5.0 in /opt/conda/lib/python3.9/site-packages (from dask~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.2.1)\n", - "Requirement already satisfied: partd>=1.2.0 in /opt/conda/lib/python3.9/site-packages (from dask~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.4.1)\n", - "Requirement already satisfied: toolz>=0.10.0 in /opt/conda/lib/python3.9/site-packages (from dask~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.12.0)\n", - "Requirement already satisfied: importlib-metadata>=4.13.0 in /opt/conda/lib/python3.9/site-packages (from dask~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (7.0.1)\n", - "Requirement already satisfied: locket>=1.0.0 in /opt/conda/lib/python3.9/site-packages (from distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.0.0)\n", - "Requirement already satisfied: msgpack>=1.0.0 in /opt/conda/lib/python3.9/site-packages (from distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.0.7)\n", - "Requirement already satisfied: psutil>=5.7.2 in /opt/conda/lib/python3.9/site-packages (from distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.9.8)\n", - "Requirement already satisfied: sortedcontainers>=2.0.5 in /opt/conda/lib/python3.9/site-packages (from distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.4.0)\n", - "Requirement already satisfied: tblib>=1.6.0 in /opt/conda/lib/python3.9/site-packages (from distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.0)\n", - "Requirement already satisfied: tornado>=6.0.4 in /opt/conda/lib/python3.9/site-packages (from distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (6.4)\n", - "Requirement already satisfied: zict>=3.0.0 in /opt/conda/lib/python3.9/site-packages (from 
distributed~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.0)\n", - "Requirement already satisfied: gitdb<5,>=4.0.1 in /opt/conda/lib/python3.9/site-packages (from GitPython>=3.1.41,~=3.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.0.11)\n", - "Requirement already satisfied: jedi>=0.16 in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.19.1)\n", - "Requirement already satisfied: matplotlib-inline in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.1.6)\n", - "Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.43)\n", - "Requirement already satisfied: pygments>=2.4.0 in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.17.2)\n", - "Requirement already satisfied: stack-data in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.6.3)\n", - "Requirement already satisfied: traitlets>=5 in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.14.1)\n", - "Requirement already satisfied: pexpect>4.3 in /opt/conda/lib/python3.9/site-packages (from ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.9.0)\n", - "Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.9/site-packages (from jinja2>=3.1.3,~=3.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.1.5)\n", - "Requirement already satisfied: absl-py<2,>=0.9 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.4.0)\n", - "Requirement already 
satisfied: kubernetes<26,>=8.0.0 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (25.3.0)\n", - "Requirement already satisfied: google-api-python-client<2,>=1.7.8 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.12.11)\n", - "Requirement already satisfied: requests-toolbelt<1,>=0.8.0 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.10.1)\n", - "Requirement already satisfied: kfp-server-api<2.0.0,>=1.1.2 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.8.5)\n", - "Requirement already satisfied: jsonschema<5,>=3.0.1 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.21.1)\n", - "Requirement already satisfied: strip-hints<1,>=0.1.8 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.1.10)\n", - "Requirement already satisfied: docstring-parser<1,>=0.7.3 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.15)\n", - "Requirement already satisfied: kfp-pipeline-spec<0.2.0,>=0.1.16 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.1.16)\n", - "Requirement already satisfied: fire<1,>=0.3.1 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.0)\n", - "Requirement already satisfied: uritemplate<4,>=3.0.1 in /opt/conda/lib/python3.9/site-packages (from kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.1)\n", - "Requirement already satisfied: typer<1.0,>=0.3.2 in /opt/conda/lib/python3.9/site-packages (from 
kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.9.0)\n", - "Requirement already satisfied: entrypoints<1 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.4)\n", - "Requirement already satisfied: pytz<2024 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.4)\n", - "Requirement already satisfied: sqlparse<1,>=0.4.0 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.4.4)\n", - "Requirement already satisfied: alembic!=1.10.0,<2 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.13.1)\n", - "Requirement already satisfied: docker<8,>=4.0.0 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (7.0.0)\n", - "Requirement already satisfied: Flask<4 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.2)\n", - "Requirement already satisfied: querystring-parser<2 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.2.4)\n", - "Requirement already satisfied: markdown<4,>=3.3 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.5.2)\n", - "Requirement already satisfied: matplotlib<4 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.8.3)\n", - "Requirement already satisfied: gunicorn<22 in /opt/conda/lib/python3.9/site-packages (from mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (21.2.0)\n", - "Requirement already satisfied: requests-oauthlib>=0.5.0 in 
/opt/conda/lib/python3.9/site-packages (from msrest~=0.6.21->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.1)\n", - "Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.9/site-packages (from msrest~=0.6.21->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2024.2.2)\n", - "Requirement already satisfied: nbconvert>=6.4.5 in /opt/conda/lib/python3.9/site-packages (from nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (7.16.1)\n", - "Requirement already satisfied: notebook<7.0.0,>=6.4 in /opt/conda/lib/python3.9/site-packages (from nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (6.5.6)\n", - "Requirement already satisfied: tzdata>=2022.1 in /opt/conda/lib/python3.9/site-packages (from pandas<2.2,>=1.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2024.1)\n", - "Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/lib/python3.9/site-packages (from plotly<5.12.0,~=5.4->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (8.2.3)\n", - "Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.9/site-packages (from requests~=2.31->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.0.4)\n", - "Requirement already satisfied: greenlet!=0.4.17 in /opt/conda/lib/python3.9/site-packages (from sqlalchemy~=1.4->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.3)\n", - "Requirement already satisfied: nuclio-sdk>=0.5.3 in /opt/conda/lib/python3.9/site-packages (from storey~=1.6.18->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.9)\n", - "Collecting networkx (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for networkx from https://files.pythonhosted.org/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl.metadata\n", - " Downloading 
networkx-3.2.1-py3-none-any.whl.metadata (5.2 kB)\n", - "Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cuda-nvrtc-cu12==12.1.105 from https://files.pythonhosted.org/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)\n", - "Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cuda-runtime-cu12==12.1.105 from https://files.pythonhosted.org/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)\n", - "Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cuda-cupti-cu12==12.1.105 from https://files.pythonhosted.org/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)\n", - "Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cudnn-cu12==8.9.2.26 from https://files.pythonhosted.org/packages/ff/74/a2e2be7fb83aaedec84f391f082cf765dfb635e7caa9b49065f73e4835d8/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)\n", - "Collecting 
nvidia-cublas-cu12==12.1.3.1 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cublas-cu12==12.1.3.1 from https://files.pythonhosted.org/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)\n", - "Collecting nvidia-cufft-cu12==11.0.2.54 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cufft-cu12==11.0.2.54 from https://files.pythonhosted.org/packages/86/94/eb540db023ce1d162e7bea9f8f5aa781d57c65aed513c33ee9a5123ead4d/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)\n", - "Collecting nvidia-curand-cu12==10.3.2.106 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-curand-cu12==10.3.2.106 from https://files.pythonhosted.org/packages/44/31/4890b1c9abc496303412947fc7dcea3d14861720642b49e8ceed89636705/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)\n", - "Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-cusolver-cu12==11.4.5.107 from https://files.pythonhosted.org/packages/bc/1d/8de1e5c67099015c834315e333911273a8c6aaba78923dd1d1e25fc5f217/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)\n", - "Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 
4))\n", - " Obtaining dependency information for nvidia-cusparse-cu12==12.1.0.106 from https://files.pythonhosted.org/packages/65/5b/cfaeebf25cd9fdec14338ccb16f6b2c4c7fa9163aefcf057d86b9cc248bb/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)\n", - "Collecting nvidia-nccl-cu12==2.19.3 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-nccl-cu12==2.19.3 from https://files.pythonhosted.org/packages/38/00/d0d4e48aef772ad5aebcf70b73028f88db6e5640b36c38e90445b7a57c45/nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl.metadata (1.8 kB)\n", - "Collecting nvidia-nvtx-cu12==12.1.105 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-nvtx-cu12==12.1.105 from https://files.pythonhosted.org/packages/da/d3/8057f0587683ed2fcd4dbfbdfdfa807b9160b809976099d36b8f60d08f03/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata\n", - " Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.7 kB)\n", - "Collecting triton==2.2.0 (from torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for triton==2.2.0 from https://files.pythonhosted.org/packages/6a/5c/01d9f062f719581cf6e60053e1a005d666ec67dcb59630fffaa3a3e5c9d8/triton-2.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading triton-2.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.4 kB)\n", - "Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.9->optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for nvidia-nvjitlink-cu12 from 
https://files.pythonhosted.org/packages/58/d1/d1c80553f9d5d07b6072bc132607d75a0ef3600e28e1890e11c0f55d7346/nvidia_nvjitlink_cu12-12.4.99-py3-none-manylinux2014_x86_64.whl.metadata\n", - " Downloading nvidia_nvjitlink_cu12-12.4.99-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", - "INFO: pip is looking at multiple versions of transformers[sentencepiece] to determine which version is compatible with other requirements. This could take a while.\n", - "Collecting transformers[sentencepiece]>=4.26.0 (from optimum~=1.6.4->-r /empty/requirements.txt (line 4))\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/a4/73/f620d76193954e16db3d5c53a07d956d7b9c800e570758d3bff91906d4a4/transformers-4.39.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.39.0-py3-none-any.whl.metadata (134 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.8/134.8 kB 115.9 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/b6/4d/fbe6d89fde59d8107f0a02816c4ac4542a8f9a85559fdf33c68282affcc1/transformers-4.38.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.38.2-py3-none-any.whl.metadata (130 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 130.7/130.7 kB 126.3 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/3e/6b/1b589f7b69aaea8193cf5bc91cf97410284aecd97b6312cdb08baedbdffe/transformers-4.38.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.38.1-py3-none-any.whl.metadata (131 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 131.1/131.1 kB 138.2 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from 
https://files.pythonhosted.org/packages/91/89/5416dc364c7ef0711c564fd61a69b03d1e40eeb5c506c38e53ba8a969e79/transformers-4.38.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.38.0-py3-none-any.whl.metadata (131 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 131.1/131.1 kB 186.3 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/85/f6/c5065913119c41ecad148c34e3a861f719e16b89a522287213698da911fc/transformers-4.37.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.37.2-py3-none-any.whl.metadata (129 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.4/129.4 kB 236.8 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/ad/67/b4d6a51dcaf988cb45b31e26c6e33fb169fe34ba5fb168b086309bd7c028/transformers-4.37.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.37.1-py3-none-any.whl.metadata (129 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.4/129.4 kB 156.4 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/3c/45/52133ce6bce49a099cc865599803bf1fad93de887276f728e56848d77a70/transformers-4.37.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.37.0-py3-none-any.whl.metadata (129 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.4/129.4 kB 102.0 MB/s eta 0:00:00\n", - "INFO: pip is still looking at multiple versions of transformers[sentencepiece] to determine which version is compatible with other requirements. 
This could take a while.\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/20/0a/739426a81f7635b422fbe6cb8d1d99d1235579a6ac8024c13d743efa6847/transformers-4.36.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.36.2-py3-none-any.whl.metadata (126 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.8/126.8 kB 108.8 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/fc/04/0aad491cd98b09236c54ab849863ee85421eeda5138bbf9d33ecc594652b/transformers-4.36.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.36.1-py3-none-any.whl.metadata (126 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.8/126.8 kB 140.6 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/0f/12/d8e27a190ca67811f81deea3183b528d9169f10b74d827e0b9211520ecfa/transformers-4.36.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.36.0-py3-none-any.whl.metadata (126 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.8/126.8 kB 267.8 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/12/dd/f17b11a93a9ca27728e12512d167eb1281c151c4c6881d3ab59eb58f4127/transformers-4.35.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.35.2-py3-none-any.whl.metadata (123 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.5/123.5 kB 130.2 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/92/ba/cfff7e01f7070d9fca3964bf42b2257b86964c3e6763b8d5435436cc1d77/transformers-4.35.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.35.1-py3-none-any.whl.metadata (123 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.1/123.1 kB 183.6 
MB/s eta 0:00:00\n", - "INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/9a/06/e4ec2a321e57c03b7e9345d709d554a52c33760e5015fdff0919d9459af0/transformers-4.35.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.35.0-py3-none-any.whl.metadata (123 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.1/123.1 kB 177.3 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/c1/bd/f64d67df4d3b05a460f281defe830ffab6d7940b7ca98ec085e94e024781/transformers-4.34.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.34.1-py3-none-any.whl.metadata (121 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.5/121.5 kB 270.5 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/1a/d1/3bba59606141ae808017f6fde91453882f931957f125009417b87a281067/transformers-4.34.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.34.0-py3-none-any.whl.metadata (121 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.5/121.5 kB 133.4 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/98/46/f6a79f944d5c7763a9bc13b2aa6ac72daf43a6551f5fb03bccf0a9c2fec1/transformers-4.33.3-py3-none-any.whl.metadata\n", - " Downloading transformers-4.33.3-py3-none-any.whl.metadata (119 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119.9/119.9 kB 163.1 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from 
https://files.pythonhosted.org/packages/1a/06/3817f9bb923437ead9a794f0ac0d03b8b5e0478ab112db4c413dd37c09da/transformers-4.33.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.33.2-py3-none-any.whl.metadata (119 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119.9/119.9 kB 274.9 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/13/30/54b59e73400df3de506ad8630284e9fd63f4b94f735423d55fc342181037/transformers-4.33.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.33.1-py3-none-any.whl.metadata (119 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119.9/119.9 kB 274.2 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/e1/9d/4d9fe5c3b820db10773392ac5f4a0c8dab668f70b245ce2ce09785166128/transformers-4.33.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.33.0-py3-none-any.whl.metadata (119 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119.9/119.9 kB 185.9 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/83/8d/f65f8138365462ace54458a9e164f4b28ce1141361970190eef36bdef986/transformers-4.32.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.32.1-py3-none-any.whl.metadata (118 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.5/118.5 kB 144.4 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/ae/95/283a1c004430bd2a9425d6937fc545dd49a4e4592feb76be0299a14e2378/transformers-4.32.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.32.0-py3-none-any.whl.metadata (118 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.5/118.5 kB 150.3 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from 
https://files.pythonhosted.org/packages/21/02/ae8e595f45b6c8edee07913892b3b41f5f5f273962ad98851dc6a564bbb9/transformers-4.31.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.31.0-py3-none-any.whl.metadata (116 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.9/116.9 kB 156.7 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/5b/0b/e45d26ccd28568013523e04f325432ea88a442b4e3020b757cf4361f0120/transformers-4.30.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.30.2-py3-none-any.whl.metadata (113 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 113.6/113.6 kB 263.7 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/b8/df/b01b5e67cde3883757c9212455cbb9169385dcab5858b7172199126b756d/transformers-4.30.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.30.1-py3-none-any.whl.metadata (113 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 113.6/113.6 kB 263.8 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/e2/72/1af3d38e98fdcceb3876de4567ac395a66c26976e259fe2d46266e052d61/transformers-4.30.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.30.0-py3-none-any.whl.metadata (113 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 113.6/113.6 kB 266.5 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/17/aa/a89864288afe45abe1ab79f002140a20348140e86836d96096d8f8a3bac0/transformers-4.29.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.29.2-py3-none-any.whl.metadata (112 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 112.3/112.3 kB 272.7 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from 
https://files.pythonhosted.org/packages/e8/b5/ddb16f9de207e6571ab7cc5db0cc538fa2d6d91cf024565496462af4c1ce/transformers-4.29.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.29.1-py3-none-any.whl.metadata (112 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 112.3/112.3 kB 262.3 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/45/e4/4914b11df70954d95a7c36b74bf9010c8594fcec960471479449b0deb4f7/transformers-4.29.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.29.0-py3-none-any.whl.metadata (111 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 111.9/111.9 kB 269.5 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/d8/a7/a6ff727fd5d96d6625f4658944a2ae230f0c75743a9a117fbda013b03d3d/transformers-4.28.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.28.1-py3-none-any.whl.metadata (109 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 110.0/110.0 kB 245.6 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/8b/13/1ce598763b3669d43f192a7911bf2bf730a328012ab8801b93187a4f70d0/transformers-4.28.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.28.0-py3-none-any.whl.metadata (109 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 110.0/110.0 kB 256.3 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/87/f0/2a152ed10ab8601431e87a606d397f7473c5fa4f8162f4ec5bda6ddb2df4/transformers-4.27.4-py3-none-any.whl.metadata\n", - " Downloading transformers-4.27.4-py3-none-any.whl.metadata (106 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.7/106.7 kB 254.4 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from 
https://files.pythonhosted.org/packages/52/ac/9dc5a17ba60bc354d99250d9d1629f99d76f6729cee438fa91c8cc74bc5d/transformers-4.27.3-py3-none-any.whl.metadata\n", - " Downloading transformers-4.27.3-py3-none-any.whl.metadata (106 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.7/106.7 kB 251.5 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/73/f0/4a795505387a3e7cd7f0c2a2a87f876658f9a07947a38fb67bffceff9246/transformers-4.27.2-py3-none-any.whl.metadata\n", - " Downloading transformers-4.27.2-py3-none-any.whl.metadata (106 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.7/106.7 kB 246.1 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/6d/9b/2f536f9e73390209e0b27b74691355dac494b7ec8154f3012fdc6debbae7/transformers-4.27.1-py3-none-any.whl.metadata\n", - " Downloading transformers-4.27.1-py3-none-any.whl.metadata (106 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.7/106.7 kB 114.0 MB/s eta 0:00:00\n", - " Obtaining dependency information for transformers[sentencepiece]>=4.26.0 from https://files.pythonhosted.org/packages/4d/3e/1378ed266cf991f5ab5fcb29e953d97d793c7f9242ea5dc52f856415ea3a/transformers-4.27.0-py3-none-any.whl.metadata\n", - " Downloading transformers-4.27.0-py3-none-any.whl.metadata (106 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.7/106.7 kB 247.2 MB/s eta 0:00:00\n", - "Collecting sentencepiece!=0.1.92,>=0.1.91 (from transformers~=4.26.1->-r /empty/requirements.txt (line 5))\n", - " Obtaining dependency information for sentencepiece!=0.1.92,>=0.1.91 from https://files.pythonhosted.org/packages/5f/01/c95e42eb86282b2c79305d3e0b0ca5a743f85a61262bb7130999c70b9374/sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata\n", - " Downloading sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata 
(7.7 kB)\n", - "Collecting protobuf>=3.20.2 (from onnx~=1.14.1->-r /empty/requirements.txt (line 2))\n", - " Obtaining dependency information for protobuf>=3.20.2 from https://files.pythonhosted.org/packages/38/b1/d9b615dceb67ac38e13cbd7680c27182b40154996022cbb244ba1ac7d30f/protobuf-3.20.2-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata\n", - " Downloading protobuf-3.20.2-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (679 bytes)\n", - "Requirement already satisfied: future>=0.18.2 in /opt/conda/lib/python3.9/site-packages (from v3io~=0.5.21->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.0.0)\n", - "Requirement already satisfied: ujson>=3 in /opt/conda/lib/python3.9/site-packages (from v3io~=0.5.21->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.9.0)\n", - "Requirement already satisfied: googleapis-common-protos>=1.5.3 in /opt/conda/lib/python3.9/site-packages (from v3io-frames~=0.10.12->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.62.0)\n", - "Requirement already satisfied: grpcio-tools!=1.34.0,<1.49,>=1.30 in /opt/conda/lib/python3.9/site-packages (from v3io-frames~=0.10.12->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.48.2)\n", - "Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime~=1.16.1->-r /empty/requirements.txt (line 3))\n", - " Obtaining dependency information for humanfriendly>=9.1 from https://files.pythonhosted.org/packages/f0/0f/310fb31e39e2d734ccaa2c0fb981ee41f7bd5056ce9bc29b2248bd569169/humanfriendly-10.0-py2.py3-none-any.whl.metadata\n", - " Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)\n", - "INFO: pip is looking at multiple versions of multiprocess to determine which version is compatible with other requirements. 
This could take a while.\n", - "Collecting multiprocess (from datasets~=2.10.1->-r /empty/requirements.txt (line 6))\n", - " Obtaining dependency information for multiprocess from https://files.pythonhosted.org/packages/c6/c9/820b5ab056f4ada76fbe05bd481a948f287957d6cbfd59e2dd2618b408c1/multiprocess-0.70.15-py39-none-any.whl.metadata\n", - " Downloading multiprocess-0.70.15-py39-none-any.whl.metadata (7.2 kB)\n", - " Obtaining dependency information for multiprocess from https://files.pythonhosted.org/packages/6a/f4/fbeb03ef7abdda54db4a6a75c971b88ab73d724ff09e3275cc1e99f1c946/multiprocess-0.70.14-py39-none-any.whl.metadata\n", - " Downloading multiprocess-0.70.14-py39-none-any.whl.metadata (6.6 kB)\n", - "Collecting mpmath>=0.19 (from sympy->onnxruntime~=1.16.1->-r /empty/requirements.txt (line 3))\n", - " Obtaining dependency information for mpmath>=0.19 from https://files.pythonhosted.org/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl.metadata\n", - " Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)\n", - "Requirement already satisfied: Mako in /opt/conda/lib/python3.9/site-packages (from alembic!=1.10.0,<2->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.2)\n", - "Requirement already satisfied: cffi in /opt/conda/lib/python3.9/site-packages (from azure-datalake-store<0.1,>=0.0.46->adlfs==2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.16.0)\n", - "Requirement already satisfied: termcolor in /opt/conda/lib/python3.9/site-packages (from fire<1,>=0.3.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.4.0)\n", - "Requirement already satisfied: Werkzeug>=3.0.0 in /opt/conda/lib/python3.9/site-packages (from Flask<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.1)\n", - "Requirement already satisfied: itsdangerous>=2.1.2 in /opt/conda/lib/python3.9/site-packages (from 
Flask<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.1.2)\n", - "Requirement already satisfied: blinker>=1.6.2 in /opt/conda/lib/python3.9/site-packages (from Flask<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.7.0)\n", - "Requirement already satisfied: smmap<6,>=3.0.1 in /opt/conda/lib/python3.9/site-packages (from gitdb<5,>=4.0.1->GitPython>=3.1.41,~=3.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.0.1)\n", - "Requirement already satisfied: httplib2<1dev,>=0.15.0 in /opt/conda/lib/python3.9/site-packages (from google-api-python-client<2,>=1.7.8->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.22.0)\n", - "Requirement already satisfied: google-auth-httplib2>=0.0.3 in /opt/conda/lib/python3.9/site-packages (from google-api-python-client<2,>=1.7.8->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.2.0)\n", - "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.9/site-packages (from google-auth>=1.2->gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.3.3)\n", - "Requirement already satisfied: pyasn1-modules>=0.2.1 in /opt/conda/lib/python3.9/site-packages (from google-auth>=1.2->gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.3.0)\n", - "Requirement already satisfied: rsa<5,>=3.1.4 in /opt/conda/lib/python3.9/site-packages (from google-auth>=1.2->gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.9)\n", - "Requirement already satisfied: proto-plus<2.0.0dev,>=1.22.0 in /opt/conda/lib/python3.9/site-packages (from google-cloud-bigquery-storage<3.0.0dev,>=2.6.0->google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.23.0)\n", - "Requirement already satisfied: google-crc32c<2.0dev,>=1.0 in /opt/conda/lib/python3.9/site-packages (from 
google-cloud-storage->gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.5.0)\n", - "Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.9/site-packages (from importlib-metadata>=4.13.0->dask~=2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.17.0)\n", - "Requirement already satisfied: parso<0.9.0,>=0.8.3 in /opt/conda/lib/python3.9/site-packages (from jedi>=0.16->ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.8.3)\n", - "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2023.12.1)\n", - "Requirement already satisfied: referencing>=0.28.4 in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.33.0)\n", - "Requirement already satisfied: rpds-py>=0.7.1 in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.18.0)\n", - "Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in /opt/conda/lib/python3.9/site-packages (from kubernetes<26,>=8.0.0->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.7.0)\n", - "Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.2.0)\n", - "Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.12.1)\n", - "Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.49.0)\n", 
- "Requirement already satisfied: kiwisolver>=1.3.1 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.4.5)\n", - "Requirement already satisfied: pillow>=8 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (10.2.0)\n", - "Requirement already satisfied: pyparsing>=2.3.1 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.1.1)\n", - "Requirement already satisfied: importlib-resources>=3.2.0 in /opt/conda/lib/python3.9/site-packages (from matplotlib<4->mlflow~=2.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (6.1.2)\n", - "Requirement already satisfied: PyJWT[crypto]<3,>=1.0.0 in /opt/conda/lib/python3.9/site-packages (from msal<2.0.0,>=1.24.0->azure-identity~=1.5->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.8.0)\n", - "Requirement already satisfied: portalocker<3,>=1.0 in /opt/conda/lib/python3.9/site-packages (from msal-extensions<2.0.0,>=0.3.0->azure-identity~=1.5->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.8.2)\n", - "Requirement already satisfied: beautifulsoup4 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (4.12.3)\n", - "Requirement already satisfied: bleach!=5.0.0 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (6.1.0)\n", - "Requirement already satisfied: defusedxml in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.7.1)\n", - "Requirement already satisfied: jupyter-core>=4.7 in /opt/conda/lib/python3.9/site-packages (from 
nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.7.1)\n", - "Requirement already satisfied: jupyterlab-pygments in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.3.0)\n", - "Requirement already satisfied: mistune<4,>=2.0.3 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.0.2)\n", - "Requirement already satisfied: nbclient>=0.5.0 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.9.0)\n", - "Requirement already satisfied: nbformat>=5.7 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (5.9.2)\n", - "Requirement already satisfied: pandocfilters>=1.4.1 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.5.1)\n", - "Requirement already satisfied: tinycss2 in /opt/conda/lib/python3.9/site-packages (from nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.2.1)\n", - "Requirement already satisfied: pyzmq<25,>=17 in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (24.0.1)\n", - "Requirement already satisfied: argon2-cffi in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (23.1.0)\n", - "Requirement already satisfied: jupyter-client<8,>=5.3.4 in /opt/conda/lib/python3.9/site-packages (from 
notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (7.4.9)\n", - "Requirement already satisfied: ipython-genutils in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.2.0)\n", - "Requirement already satisfied: ipykernel in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (6.29.3)\n", - "Requirement already satisfied: Send2Trash>=1.8.0 in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.8.2)\n", - "Requirement already satisfied: terminado>=0.8.3 in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.18.0)\n", - "Requirement already satisfied: prometheus-client in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.20.0)\n", - "Requirement already satisfied: nbclassic>=0.4.7 in /opt/conda/lib/python3.9/site-packages (from notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.0.0)\n", - "Requirement already satisfied: ptyprocess>=0.5 in /opt/conda/lib/python3.9/site-packages (from pexpect>4.3->ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.7.0)\n", - "Requirement already satisfied: wcwidth in /opt/conda/lib/python3.9/site-packages (from prompt-toolkit<3.1.0,>=3.0.41->ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.2.13)\n", - "Requirement already satisfied: oauthlib>=3.0.0 in /opt/conda/lib/python3.9/site-packages (from 
requests-oauthlib>=0.5.0->msrest~=0.6.21->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.2.2)\n", - "Requirement already satisfied: wheel in /opt/conda/lib/python3.9/site-packages (from strip-hints<1,>=0.1.8->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.41.2)\n", - "Requirement already satisfied: executing>=1.2.0 in /opt/conda/lib/python3.9/site-packages (from stack-data->ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.0.1)\n", - "Requirement already satisfied: asttokens>=2.1.0 in /opt/conda/lib/python3.9/site-packages (from stack-data->ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.4.1)\n", - "Requirement already satisfied: pure-eval in /opt/conda/lib/python3.9/site-packages (from stack-data->ipython~=8.10->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.2.2)\n", - "Requirement already satisfied: webencodings in /opt/conda/lib/python3.9/site-packages (from bleach!=5.0.0->nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.1)\n", - "Requirement already satisfied: pycparser in /opt/conda/lib/python3.9/site-packages (from cffi->azure-datalake-store<0.1,>=0.0.46->adlfs==2023.9.0->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.21)\n", - "Requirement already satisfied: grpcio-status<2.0.dev0,>=1.33.2 in /opt/conda/lib/python3.9/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-cloud-bigquery[bqstorage,pandas]==3.14.1->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.48.2)\n", - "Requirement already satisfied: platformdirs>=2.5 in /opt/conda/lib/python3.9/site-packages (from jupyter-core>=4.7->nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (3.10.0)\n", - "Requirement already satisfied: jupyter-server>=1.8 in /opt/conda/lib/python3.9/site-packages (from 
nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.12.5)\n", - "Requirement already satisfied: notebook-shim>=0.2.3 in /opt/conda/lib/python3.9/site-packages (from nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.2.4)\n", - "Requirement already satisfied: fastjsonschema in /opt/conda/lib/python3.9/site-packages (from nbformat>=5.7->nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.19.1)\n", - "Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /opt/conda/lib/python3.9/site-packages (from pyasn1-modules>=0.2.1->google-auth>=1.2->gcsfs==2023.9.2->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.1)\n", - "Requirement already satisfied: argon2-cffi-bindings in /opt/conda/lib/python3.9/site-packages (from argon2-cffi->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (21.2.0)\n", - "Requirement already satisfied: soupsieve>1.2 in /opt/conda/lib/python3.9/site-packages (from beautifulsoup4->nbconvert>=6.4.5->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.5)\n", - "Requirement already satisfied: comm>=0.1.1 in /opt/conda/lib/python3.9/site-packages (from ipykernel->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.2.1)\n", - "Requirement already satisfied: debugpy>=1.6.5 in /opt/conda/lib/python3.9/site-packages (from ipykernel->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.8.1)\n", - "Requirement already satisfied: jupyter-events>=0.9.0 in /opt/conda/lib/python3.9/site-packages (from jupyter-server>=1.8->nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 
1)) (0.9.0)\n", - "Requirement already satisfied: jupyter-server-terminals in /opt/conda/lib/python3.9/site-packages (from jupyter-server>=1.8->nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.5.2)\n", - "Requirement already satisfied: overrides in /opt/conda/lib/python3.9/site-packages (from jupyter-server>=1.8->nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (7.7.0)\n", - "Requirement already satisfied: python-json-logger>=2.0.4 in /opt/conda/lib/python3.9/site-packages (from jupyter-events>=0.9.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.0.7)\n", - "Requirement already satisfied: rfc3339-validator in /opt/conda/lib/python3.9/site-packages (from jupyter-events>=0.9.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.1.4)\n", - "Requirement already satisfied: rfc3986-validator>=0.1.1 in /opt/conda/lib/python3.9/site-packages (from jupyter-events>=0.9.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook<7.0.0,>=6.4->nuclio-jupyter~=0.9.15->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (0.1.1)\n", - "Requirement already satisfied: fqdn in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.5.1)\n", - "Requirement already satisfied: isoduration in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (20.11.0)\n", - "Requirement already satisfied: jsonpointer>1.13 in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.1)\n", - "Requirement 
already satisfied: uri-template in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.0)\n", - "Requirement already satisfied: webcolors>=1.11 in /opt/conda/lib/python3.9/site-packages (from jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.13)\n", - "Requirement already satisfied: arrow>=0.15.0 in /opt/conda/lib/python3.9/site-packages (from isoduration->jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (1.3.0)\n", - "Requirement already satisfied: types-python-dateutil>=2.8.10 in /opt/conda/lib/python3.9/site-packages (from arrow>=0.15.0->isoduration->jsonschema<5,>=3.0.1->kfp~=1.8->mlrun[complete]==1.6.1->-r /empty/requirements.txt (line 1)) (2.8.19.20240106)\n", - "Downloading onnx-1.14.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (14.6 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.6/14.6 MB 274.2 MB/s eta 0:00:00\n", - "Downloading onnxruntime-1.16.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.4 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.4/6.4 MB 277.9 MB/s eta 0:00:00\n", - "Downloading optimum-1.6.4-py3-none-any.whl (227 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 227.8/227.8 kB 291.3 MB/s eta 0:00:00\n", - "Downloading transformers-4.26.1-py3-none-any.whl (6.3 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 242.4 MB/s eta 0:00:00\n", - "Downloading datasets-2.10.1-py3-none-any.whl (469 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 469.0/469.0 kB 185.9 MB/s eta 0:00:00\n", - "Downloading scikit_learn-1.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.4 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.4/26.4 MB 275.9 MB/s eta 0:00:00\n", - "Downloading dill-0.3.6-py3-none-any.whl (110 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 110.5/110.5 kB 282.3 MB/s eta 0:00:00\n", - 
"Downloading huggingface_hub-0.21.4-py3-none-any.whl (346 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 346.4/346.4 kB 311.7 MB/s eta 0:00:00\n", - "Downloading numpy-1.23.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 269.6 MB/s eta 0:00:00\n", - "Downloading regex-2023.12.25-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 773.4/773.4 kB 311.9 MB/s eta 0:00:00\n", - "Downloading responses-0.18.0-py3-none-any.whl (38 kB)\n", - "Downloading tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 264.1 MB/s eta 0:00:00\n", - "Downloading torch-2.2.1-cp39-cp39-manylinux1_x86_64.whl (755.5 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 755.5/755.5 MB 204.0 MB/s eta 0:00:00\n", - "Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 40.3 MB/s eta 0:00:00\n", - "Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 43.0 MB/s eta 0:00:00\n", - "Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 46.9 MB/s eta 0:00:00\n", - "Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 51.0 MB/s eta 0:00:00\n", - "Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 58.2 MB/s eta 0:00:00\n", - "Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 69.0 MB/s eta 0:00:00\n", - 
"Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 36.0 MB/s eta 0:00:00\n", - "Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 52.8 MB/s eta 0:00:00\n", - "Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 45.9 MB/s eta 0:00:00\n", - "Downloading nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 19.6 MB/s eta 0:00:00\n", - "Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 27.7 MB/s eta 0:00:00\n", - "Downloading triton-2.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (167.9 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.9/167.9 MB 41.3 MB/s eta 0:00:00\n", - "Downloading protobuf-3.20.2-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 42.8 MB/s eta 0:00:00\n", - "Downloading coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.0/46.0 kB 192.0 MB/s eta 0:00:00\n", - "Downloading filelock-3.13.1-py3-none-any.whl (11 kB)\n", - "Downloading flatbuffers-24.3.7-py2.py3-none-any.whl (26 kB)\n", - "Downloading multiprocess-0.70.14-py39-none-any.whl (132 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 132.9/132.9 kB 100.7 MB/s eta 0:00:00\n", - "Downloading sympy-1.12-py3-none-any.whl (5.7 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 41.4 MB/s eta 0:00:00\n", - "Downloading humanfriendly-10.0-py2.py3-none-any.whl (86 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.8/86.8 kB 253.7 MB/s eta 0:00:00\n", - "Downloading 
mpmath-1.3.0-py3-none-any.whl (536 kB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 45.4 MB/s eta 0:00:00\n", - "Downloading sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 46.1 MB/s eta 0:00:00\n", - "Downloading networkx-3.2.1-py3-none-any.whl (1.6 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 43.7 MB/s eta 0:00:00\n", - "Downloading nvidia_nvjitlink_cu12-12.4.99-py3-none-manylinux2014_x86_64.whl (21.1 MB)\n", - " ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 43.8 MB/s eta 0:00:00\n", - "Installing collected packages: tokenizers, sentencepiece, mpmath, flatbuffers, sympy, regex, protobuf, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, humanfriendly, filelock, dill, triton, responses, onnx, nvidia-cusparse-cu12, nvidia-cudnn-cu12, multiprocess, huggingface-hub, coloredlogs, transformers, scikit-learn, onnxruntime, nvidia-cusolver-cu12, torch, datasets, optimum\n", - " Attempting uninstall: protobuf\n", - " Found existing installation: protobuf 3.20.3\n", - " Uninstalling protobuf-3.20.3:\n", - " Successfully uninstalled protobuf-3.20.3\n", - " Attempting uninstall: numpy\n", - " Found existing installation: numpy 1.26.4\n", - " Uninstalling numpy-1.26.4:\n", - " Successfully uninstalled numpy-1.26.4\n", - " Attempting uninstall: scikit-learn\n", - " Found existing installation: scikit-learn 1.4.1.post1\n", - " Uninstalling scikit-learn-1.4.1.post1:\n", - " Successfully uninstalled scikit-learn-1.4.1.post1\n", - "Successfully installed coloredlogs-15.0.1 datasets-2.10.1 dill-0.3.6 filelock-3.13.1 flatbuffers-24.3.7 huggingface-hub-0.21.4 humanfriendly-10.0 mpmath-1.3.0 multiprocess-0.70.14 networkx-3.2.1 numpy-1.23.5 nvidia-cublas-cu12-12.1.3.1 
nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.4.99 nvidia-nvtx-cu12-12.1.105 onnx-1.14.1 onnxruntime-1.16.3 optimum-1.6.4 protobuf-3.20.2 regex-2023.12.25 responses-0.18.0 scikit-learn-1.0.2 sentencepiece-0.2.0 sympy-1.12 tokenizers-0.13.3 torch-2.2.1 transformers-4.26.1 triton-2.2.0\n", - "WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\n", - "\u001b[36mINFO\u001b[0m[0238] Taking snapshot of full filesystem... \n", - "\u001b[36mINFO\u001b[0m[0463] Pushing image to docker-registry.default-tenant.app.app-lab-2-b688.iguazio-cd2.com/mlrun/func-hugging-face-trainer-avia-hugging-face-classifier-trainer:latest \n", - "\u001b[36mINFO\u001b[0m[0493] Pushed docker-registry.default-tenant.app.app-lab-2-b688.iguazio-cd2.com/mlrun/func-hugging-face-trainer-avia-hugging-face-classifier-trainer@sha256:691d0bb3c23487b4b5d2f84ab323c24735626ee81681475f53a4158b72d4cfee \n" - ] - }, - { - "data": { - "text/plain": [ - "BuildStatus(ready=True, outputs={'image': '.mlrun/func-hugging-face-trainer-avia-hugging-face-classifier-trainer:latest'})" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "project.build_function(\"hugging-face-classifier-trainer\",with_mlrun=True)" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": { - "scrolled": true, - "tags": [] - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:22:42,252 [info] Storing function: {'name': 'hugging-face-classifier-trainer-train', 'uid': '53252ce7aacb4b1aacf86bf3b862daa2', 'db': 
'http://mlrun-api:8080'}\n", - "> 2024-03-24 17:22:42,536 [info] Job is running in the background, pod: hugging-face-classifier-trainer-train-dqqfr\n", - "> 2024-03-24 17:24:43,288 [info] 'train_test_split_size' is not provided, setting train_test_split_size to 0.2\n", - "> 2024-03-24 17:24:43,847 [info] Loading and editing Shayanvsf/US_Airline_Sentiment dataset from Hugging Face hub\n", - "Downloading metadata: 100%|██████████| 1.03k/1.03k [00:00<00:00, 6.77MB/s]\n", - "Downloading and preparing dataset None/None (download: 265.13 KiB, generated: 1.50 MiB, post-processed: Unknown size, total: 1.76 MiB) to /root/.cache/huggingface/datasets/Shayanvsf___parquet/Shayanvsf--US_Airline_Sentiment-1319c42f87c44b2f/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec...\n", - "Downloading data files: 0%| | 0/3 [00:00 2024-03-24 17:24:47,076 [info] training 'huggingface-model'\n", - "The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.\n", - "This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n", - "***** Running training *****\n", - " Num examples = 100\n", - " Num Epochs = 3\n", - " Instantaneous batch size per device = 16\n", - " Total train batch size (w. parallel, distributed & accumulation) = 16\n", - " Gradient Accumulation steps = 1\n", - " Total optimization steps = 21\n", - " Number of trainable parameters = 66955010\n", - "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. 
Disabling parallelism to avoid deadlocks...\n", - "To disable this warning, you can either:\n", - "\t- Avoid using `tokenizers` before the fork if possible\n", - "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n", - " 0%| | 0/21 [00:00 2024-03-24 17:26:00,230 [info] To track results use the CLI: {'info_cmd': 'mlrun get run 53252ce7aacb4b1aacf86bf3b862daa2 -p hugging-face-trainer-avia', 'logs_cmd': 'mlrun logs 53252ce7aacb4b1aacf86bf3b862daa2 -p hugging-face-trainer-avia'}\n", - "> 2024-03-24 17:26:00,231 [info] Or click for UI: {'ui_url': 'https://dashboard.default-tenant.app.app-lab-2-b688.iguazio-cd2.com/mlprojects/hugging-face-trainer-avia/jobs/monitor/53252ce7aacb4b1aacf86bf3b862daa2/overview'}\n", - "> 2024-03-24 17:26:00,231 [info] Run execution finished: {'status': 'completed', 'name': 'hugging-face-classifier-trainer-train'}\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
project: hugging-face-trainer-avia
uid:
iter: 0
start: Mar 24 17:24:39
state: completed
name: hugging-face-classifier-trainer-train
labels:
  v3io_user=avia
  kind=job
  owner=avia
  mlrun/client_version=1.6.1
  mlrun/client_python_version=3.9.16
  host=hugging-face-classifier-trainer-train-dqqfr
inputs:
parameters:
  hf_dataset=Shayanvsf/US_Airline_Sentiment
  drop_columns=['airline_sentiment_confidence', 'negativereason_confidence']
  pretrained_tokenizer=distilbert-base-uncased
  pretrained_model=distilbert-base-uncased
  model_class=transformers.AutoModelForSequenceClassification
  label_name=airline_sentiment
  num_of_train_samples=100
  metrics=['accuracy', 'f1']
  random_state=42
  TRAIN_output_dir=finetuning-sentiment-model-3000-samples
  TRAIN_learning_rate=2e-05
  TRAIN_per_device_train_batch_size=16
  TRAIN_per_device_eval_batch_size=16
  TRAIN_num_train_epochs=3
  TRAIN_weight_decay=0.01
  TRAIN_push_to_hub=False
  TRAIN_evaluation_strategy=epoch
  TRAIN_eval_steps=1
  TRAIN_logging_steps=1
  CLASS_num_labels=2
results:
  loss=0.5215
  learning_rate=0.0
  eval_loss=0.4750453531742096
  eval_accuracy=0.7916666666666666
  eval_f1=0.0
  eval_runtime=1.0524
  eval_samples_per_second=22.806
  eval_steps_per_second=1.9
  train_runtime=55.1543
  train_samples_per_second=5.439
  train_steps_per_second=0.381
  total_flos=3327208489680.0
artifacts:
  loss_plot
  learning_rate_plot
  eval_loss_plot
  eval_accuracy_plot
  eval_f1_plot
  eval_runtime_plot
  eval_samples_per_second_plot
  eval_steps_per_second_plot
  tokenizer
  model
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2024-03-24 17:26:09,792 [info] Run execution finished: {'status': 'completed', 'name': 'hugging-face-classifier-trainer-train'}\n" - ] - } - ], - "source": [ - "train_run = hugging_face_classifier_trainer.run(params={\n", - " \"hf_dataset\": \"Shayanvsf/US_Airline_Sentiment\",\n", - " \"drop_columns\": [\n", - " \"airline_sentiment_confidence\",\n", - " \"negativereason_confidence\",\n", - " ],\n", - " \"pretrained_tokenizer\": \"distilbert-base-uncased\",\n", - " \"pretrained_model\": \"distilbert-base-uncased\",\n", - " \"model_class\": \"transformers.AutoModelForSequenceClassification\",\n", - " \"label_name\": \"airline_sentiment\",\n", - " \"num_of_train_samples\": 100,\n", - " \"metrics\": [\"accuracy\", \"f1\"],\n", - " \"random_state\": 42,\n", - " **additional_parameters\n", - " },\n", - " handler=\"train\", \n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "pycharm": { - "name": "#%% md\n" - } - }, - "source": [ - "[Back to the top](#top)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "mlrun-base", - "language": "python", - "name": "conda-env-mlrun-base-py" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.16" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff 
--git a/hugging_face_classifier_trainer/hugging_face_classifier_trainer.py b/hugging_face_classifier_trainer/hugging_face_classifier_trainer.py deleted file mode 100755 index 29d070395..000000000 --- a/hugging_face_classifier_trainer/hugging_face_classifier_trainer.py +++ /dev/null @@ -1,832 +0,0 @@ -import os -import shutil -import tempfile -import zipfile -from abc import ABC -from typing import Any, Callable, Dict, List, Optional, Tuple, Union - -import mlrun -import mlrun.datastore -import mlrun.utils -import numpy as np -import pandas as pd -import transformers -from datasets import Dataset, load_dataset, load_metric -from mlrun import MLClientCtx -from mlrun import feature_store as fs -from mlrun.artifacts import Artifact, PlotlyArtifact -from mlrun.datastore import DataItem -from mlrun.frameworks._common import CommonTypes, MLRunInterface -from mlrun.utils import create_class -from plotly import graph_objects as go -from sklearn.model_selection import train_test_split -from transformers import ( - AutoTokenizer, - DataCollatorWithPadding, - EvalPrediction, - PreTrainedModel, - PreTrainedTokenizer, - Trainer, - TrainerCallback, - TrainerControl, - TrainerState, - TrainingArguments, -) - - -# ----------------------from MLRUN-------------------------------- -class HFORTOptimizerMLRunInterface(MLRunInterface, ABC): - """ - Interface for adding MLRun features for tensorflow keras API. - """ - - # MLRun's context default name: - DEFAULT_CONTEXT_NAME = "mlrun-huggingface" - - # Attributes to be inserted so the MLRun interface will be fully enabled. - _PROPERTIES = { - "_auto_log": False, - "_context": None, - "_model_name": "model", - "_tag": "", - "_labels": None, - "_extra_data": None, - } - _METHODS = ["enable_auto_logging"] - # Attributes to replace so the MLRun interface will be fully enabled. 
- _REPLACED_METHODS = [ - "optimize", - ] - - @classmethod - def add_interface( - cls, - obj, - restoration: CommonTypes.MLRunInterfaceRestorationType = None, - ): - """ - Enrich the object with this interface properties, methods and functions, so it will have this TensorFlow.Keras - MLRun's features. - :param obj: The object to enrich his interface. - :param restoration: Restoration information tuple as returned from 'remove_interface' in order to - add the interface in a certain state. - """ - super(HFORTOptimizerMLRunInterface, cls).add_interface( - obj=obj, restoration=restoration - ) - - @classmethod - def mlrun_optimize(cls): - """ - MLRun's tf.keras.Model.fit wrapper. It will setup the optimizer when using horovod. The optimizer must be - passed in a keyword argument and when using horovod, it must be passed as an Optimizer instance, not a string. - - raise MLRunInvalidArgumentError: In case the optimizer provided did not follow the instructions above. - """ - - def wrapper(self, *args, **kwargs): - save_dir = cls._get_function_argument( - self.optimize, - argument_name="save_dir", - passed_args=args, - passed_kwargs=kwargs, - )[0] - - # Call the original optimize method: - result = self.original_optimize(*args, **kwargs) - - if self._auto_log: - # Log the onnx model: - self._context.log_model( - key="model", - db_key=self._model_name, - model_file=f"{save_dir}/model_optimized.onnx", - tag=self._tag, - framework="ONNX", - labels=self._labels, - extra_data=self._extra_data, - ) - - return result - - return wrapper - - def enable_auto_logging( - self, - context: mlrun.MLClientCtx, - model_name: str = "model", - tag: str = "", - labels: Dict[str, str] = None, - extra_data: dict = None, - ): - self._auto_log = True - - self._context = context - self._model_name = model_name - self._tag = tag - self._labels = labels - self._extra_data = extra_data - - -class HFTrainerMLRunInterface(MLRunInterface, ABC): - """ - Interface for adding MLRun features for tensorflow 
keras API. - """ - - # MLRuns context default name: - DEFAULT_CONTEXT_NAME = "mlrun-huggingface" - - # Attributes to replace so the MLRun interface will be fully enabled. - _REPLACED_METHODS = [ - "train", - # "evaluate" - ] - - @classmethod - def add_interface( - cls, - obj: Trainer, - restoration: CommonTypes.MLRunInterfaceRestorationType = None, - ): - """ - Enrich the object with this interface properties, methods and functions, so it will have this TensorFlow.Keras - MLRuns features. - :param obj: The object to enrich his interface. - :param restoration: Restoration information tuple as returned from 'remove_interface' in order to - add the interface in a certain state. - """ - - super(HFTrainerMLRunInterface, cls).add_interface( - obj=obj, restoration=restoration - ) - - @classmethod - def mlrun_train(cls): - - """ - MLRuns tf.keras.Model.fit wrapper. It will setup the optimizer when using horovod. The optimizer must be - passed in a keyword argument and when using horovod, it must be passed as an Optimizer instance, not a string. - - raise MLRunInvalidArgumentError: In case the optimizer provided did not follow the instructions above. - """ - - def wrapper(self: Trainer, *args, **kwargs): - # Restore the evaluation method as `train` will use it: - # cls._restore_attribute(obj=self, attribute_name="evaluate") - - # Call the original fit method: - result = self.original_train(*args, **kwargs) - - # Replace the evaluation method again: - # cls._replace_function(obj=self, function_name="evaluate") - - return result - - return wrapper - - -class MLRunCallback(TrainerCallback): - """ - Callback for collecting logs during training / evaluation of the `Trainer` API. 
- """ - - def __init__( - self, - context: mlrun.MLClientCtx = None, - model_name: str = "model", - tag: str = "", - labels: Dict[str, str] = None, - extra_data: dict = None, - ): - super().__init__() - - # Store the configurations: - self._context = ( - context - if context is not None - else mlrun.get_or_create_ctx("./mlrun-huggingface") - ) - self._model_name = model_name - self._tag = tag - self._labels = labels - self._extra_data = extra_data if extra_data is not None else {} - - # Set up the logging mode: - self._is_training = False - self._steps: List[List[int]] = [] - self._metric_scores: Dict[str, List[float]] = {} - self._artifacts: Dict[str, Artifact] = {} - - def on_epoch_begin( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - self._steps.append([]) - - def on_epoch_end( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - self._log_metrics() - - def on_log( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - logs: Dict[str, float] = None, - **kwargs, - ): - recent_logs = state.log_history[-1].copy() - - recent_logs.pop("epoch") - current_step = int(recent_logs.pop("step")) - if current_step not in self._steps[-1]: - self._steps[-1].append(current_step) - - for metric_name, metric_score in recent_logs.items(): - if metric_name.startswith("train_"): - if metric_name.split("train_")[1] not in self._metric_scores: - self._metric_scores[metric_name] = [metric_score] - continue - if metric_name not in self._metric_scores: - self._metric_scores[metric_name] = [] - self._metric_scores[metric_name].append(metric_score) - - def on_train_begin( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - self._is_training = True - - def on_train_end( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - model: PreTrainedModel = None, - tokenizer: 
PreTrainedTokenizer = None, - **kwargs, - ): - self._log_metrics() - - temp_directory = tempfile.gettempdir() - - # Save and log the tokenizer: - if tokenizer is not None: - # Save tokenizer: - tokenizer_dir = os.path.join(temp_directory, "tokenizer") - tokenizer.save_pretrained(save_directory=tokenizer_dir) - # Zip the tokenizer directory: - tokenizer_zip = shutil.make_archive( - base_name="tokenizer", - format="zip", - root_dir=tokenizer_dir, - ) - # Log the zip file: - self._artifacts["tokenizer"] = self._context.log_artifact( - item="tokenizer", local_path=tokenizer_zip - ) - - # Save the model: - model_dir = os.path.join(temp_directory, "model") - model.save_pretrained(save_directory=model_dir) - - # Zip the model directory: - shutil.make_archive( - base_name="model", - format="zip", - root_dir=model_dir, - ) - - # Log the model: - self._context.log_model( - key="model", - db_key=self._model_name, - model_file="model.zip", - tag=self._tag, - framework="Hugging Face", - labels=self._labels, - extra_data={**self._artifacts, **self._extra_data}, - ) - - def on_evaluate( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - self._log_metrics() - - if self._is_training: - return - - # TODO: Update the model object - - def _log_metrics(self): - for metric_name, metric_scores in self._metric_scores.items(): - self._context.log_result(key=metric_name, value=metric_scores[-1]) - if len(metric_scores) > 1: - self._log_metric_plot(name=metric_name, scores=metric_scores) - self._context.commit(completed=False) - - def _log_metric_plot(self, name: str, scores: List[float]): - # Initialize a plotly figure: - metric_figure = go.Figure() - - # Add titles: - metric_figure.update_layout( - title=name.capitalize().replace("_", " "), - xaxis_title="Samples", - yaxis_title="Scores", - ) - - # Draw: - metric_figure.add_trace( - go.Scatter(x=np.arange(len(scores)), y=scores, mode="lines") - ) - - # Create the plotly artifact: - 
artifact_name = f"{name}_plot" - artifact = PlotlyArtifact(key=artifact_name, figure=metric_figure) - self._artifacts[artifact_name] = self._context.log_artifact(artifact) - - -def _apply_mlrun_on_trainer( - trainer: transformers.Trainer, - model_name: str = None, - tag: str = "", - context: mlrun.MLClientCtx = None, - auto_log: bool = True, - labels: Dict[str, str] = None, - extra_data: dict = None, - **kwargs, -): - # Get parameters defaults: - if context is None: - context = mlrun.get_or_create_ctx(HFTrainerMLRunInterface.DEFAULT_CONTEXT_NAME) - - HFTrainerMLRunInterface.add_interface(obj=trainer) - - if auto_log: - trainer.add_callback( - MLRunCallback( - context=context, - model_name=model_name, - tag=tag, - labels=labels, - extra_data=extra_data, - ) - ) - - -def _apply_mlrun_on_optimizer( - optimizer, - model_name: str = None, - tag: str = "", - context: mlrun.MLClientCtx = None, - auto_log: bool = True, - labels: Dict[str, str] = None, - extra_data: dict = None, - **kwargs, -): - # Get parameters defaults: - if context is None: - context = mlrun.get_or_create_ctx( - HFORTOptimizerMLRunInterface.DEFAULT_CONTEXT_NAME - ) - - HFORTOptimizerMLRunInterface.add_interface(obj=optimizer) - - if auto_log: - optimizer.enable_auto_logging( - context=context, - model_name=model_name, - tag=tag, - labels=labels, - extra_data=extra_data, - ) - - -def apply_mlrun( - huggingface_object, - model_name: str = None, - tag: str = "", - context: mlrun.MLClientCtx = None, - auto_log: bool = True, - labels: Dict[str, str] = None, - extra_data: dict = None, - **kwargs, -): - """ - Wrap the given model with MLRun's interface providing it with mlrun's additional features. - :param huggingface_object: The model to wrap. Can be loaded from the model path given as well. - :param model_name: The model name to use for storing the model artifact. Default: "model". - :param tag: The model's tag to log with. - :param context: MLRun context to work with. 
If no context is given it will be retrieved via - 'mlrun.get_or_create_ctx(None)' - :param auto_log: Whether to enable MLRun's auto logging. Default: True. - """ - - if isinstance(huggingface_object, transformers.Trainer): - return _apply_mlrun_on_trainer( - trainer=huggingface_object, - model_name=model_name, - tag=tag, - context=context, - auto_log=auto_log, - labels=labels, - extra_data=extra_data, - ) - import optimum.onnxruntime as optimum_ort - - if isinstance(huggingface_object, optimum_ort.ORTOptimizer): - return _apply_mlrun_on_optimizer( - optimizer=huggingface_object, - model_name=model_name, - tag=tag, - context=context, - auto_log=auto_log, - labels=labels, - extra_data=extra_data, - ) - raise mlrun.errors.MLRunInvalidArgumentError - - -# ---------------------- from auto_trainer-------------------------------- -class KWArgsPrefixes: - MODEL_CLASS = "CLASS_" - FIT = "FIT_" - TRAIN = "TRAIN_" - PREDICT = "PREDICT_" - - -def _get_sub_dict_by_prefix(src: Dict, prefix_key: str) -> Dict[str, Any]: - """ - Collect all the keys from the given dict that starts with the given prefix and creates a new dictionary with these - keys. - - :param src: The source dict to extract the values from. - :param prefix_key: Only keys with this prefix will be returned. The keys in the result dict will be without this - prefix. - """ - return { - key.replace(prefix_key, ""): val - for key, val in src.items() - if key.startswith(prefix_key) - } - - -def _get_dataframe( - context: MLClientCtx, - dataset: DataItem, - label_columns: Optional[Union[str, List[str]]] = None, - drop_columns: Union[str, List[str], int, List[int]] = None, -) -> Tuple[pd.DataFrame, Optional[Union[str, List[str]]]]: - """ - Getting the DataFrame of the dataset and drop the columns accordingly. - - :param context: MLRun context. - :param dataset: The dataset to train the model on. - Can be either a list of lists, dict, URI or a FeatureVector. 
- :param label_columns: The target label(s) of the column(s) in the dataset. for Regression or - Classification tasks. - :param drop_columns: str/int or a list of strings/ints that represent the column names/indices to drop. - """ - if isinstance(dataset, (list, dict)): - dataset = pd.DataFrame(dataset) - # Checking if drop_columns provided by integer type: - if drop_columns: - if isinstance(drop_columns, str) or ( - isinstance(drop_columns, list) - and any(isinstance(col, str) for col in drop_columns) - ): - context.logger.error( - "drop_columns must be an integer/list of integers if not provided with a URI/FeatureVector dataset" - ) - raise ValueError - dataset.drop(drop_columns, axis=1, inplace=True) - - return dataset, label_columns - - store_uri_prefix, _ = mlrun.datastore.parse_store_uri(dataset.artifact_url) - if mlrun.utils.StorePrefix.FeatureVector == store_uri_prefix: - # feature-vector case: - label_columns = label_columns or dataset.meta.status.label_column - dataset = fs.get_offline_features( - dataset.meta.uri, drop_columns=drop_columns - ).to_dataframe() - - context.logger.info(f"label columns: {label_columns}") - else: - # simple URL case: - dataset = dataset.as_df() - if drop_columns: - if all(col in dataset for col in drop_columns): - dataset = dataset.drop(drop_columns, axis=1) - else: - context.logger.info( - "not all of the columns to drop in the dataset, drop columns process skipped" - ) - return dataset, label_columns - - -# ---------------------- Hugging Face Trainer -------------------------------- - - -def _create_compute_metrics(metrics: List[str]) -> Callable[[EvalPrediction], Dict]: - """ - This function create and returns a function that will be used to compute metrics at evaluation. - :param metrics: List of different metrics for evaluate the model such as f1, accuracy etc. - - :returns: Function that will be used to compute metrics at evaluation. - Must take a [`EvalPrediction`] and return a dictionary string to metric values. 
- """ - - def _compute_metrics(eval_pred): - logits, labels = eval_pred - predictions = np.argmax(logits, axis=-1) - metric_dict_results = {} - for metric in metrics: - load_met = load_metric(metric) - metric_res = load_met.compute(predictions=predictions, references=labels)[ - metric - ] - metric_dict_results[metric] = metric_res - - return metric_dict_results - - return _compute_metrics - - -def _edit_columns( - dataset: Dataset, - drop_columns: List[str] = None, - rename_columns: [str, str] = None, -) -> Dataset: - """ - Drop and renames that columns of the given dataset - :param dataset: Dataset to process - :param drop_columns: The columns to drop from the dataset. - :param rename_columns: Dict of columns ro rename : {: , ...} - - :returns: The dataset after the desired process - """ - if drop_columns: - dataset = dataset.remove_columns(drop_columns) - if rename_columns: - dataset = dataset.rename_columns(rename_columns) - return dataset - - -def _prepare_dataset( - context: MLClientCtx, - dataset_name: str, - label_name: str = None, - drop_columns: Optional[List[str]] = None, - num_of_train_samples: int = None, - train_test_split_size: float = None, - random_state: int = None, -) -> Tuple[Dataset, Dataset]: - """ - Loading the dataset and editing the columns - - :param context: MLRun contex - :param dataset_name: The name of the dataset to get from the HuggingFace hub - :param label_name: The target label of the column in the dataset. - :param drop_columns: The columns to drop from the dataset. - :param num_of_train_samples: Max number of training samples, for debugging. - :param train_test_split_size: Should be between 0.0 and 1.0 and represent the proportion of the dataset to include - in the test split. 
- :param random_state: Random state for train_test_split - - """ - - context.logger.info( - f"Loading and editing {dataset_name} dataset from Hugging Face hub" - ) - rename_cols = {label_name: "labels"} - - # Loading and editing dataset: - dataset = load_dataset(dataset_name) - - # train set - train_dataset = dataset["train"] - if num_of_train_samples: - train_dataset = train_dataset.shuffle(seed=random_state).select( - list(range(num_of_train_samples)) - ) - train_dataset = _edit_columns(train_dataset, drop_columns, rename_cols) - - # test set - test_dataset = dataset["test"] - if train_test_split_size or num_of_train_samples: - train_test_split_size = train_test_split_size or 0.2 - num_of_test_samples = int( - (train_dataset.num_rows * train_test_split_size) - // (1 - train_test_split_size) - ) - test_dataset = test_dataset.shuffle(seed=random_state).select( - list(range(num_of_test_samples)) - ) - test_dataset = _edit_columns(test_dataset, drop_columns, rename_cols) - - return train_dataset, test_dataset - - -def train( - context: MLClientCtx, - hf_dataset: str = None, - dataset: DataItem = None, - test_set: DataItem = None, - drop_columns: Optional[List[str]] = None, - pretrained_tokenizer: str = None, - pretrained_model: str = None, - model_class: str = None, - model_name: str = "huggingface-model", - label_name: str = "labels", - text_col: str = "text", - num_of_train_samples: int = None, - train_test_split_size: float = None, - metrics: List[str] = None, - random_state: int = None, -): - """ - Training and evaluating a pretrained model with a pretrained tokenizer over a dataset. - The dataset can be either be the name of the dataset that contains in the HuggingFace hub, - or a URI or a FeatureVector - - :param context: MLRun context - :param hf_dataset: The name of the dataset to get from the HuggingFace hub - :param dataset: The dataset to train the model on. Can be either a URI or a FeatureVector - :param test_set: The test set to train the model with. 
- :param drop_columns: The columns to drop from the dataset. - :param pretrained_tokenizer: The name of the pretrained tokenizer from the HuggingFace hub. - :param pretrained_model: The name of the pretrained model from the HuggingFace hub. - :param model_name: The model's name to use for storing the model artifact, default to 'model' - :param model_class: The class of the model, e.g. `transformers.AutoModelForSequenceClassification` - :param label_name: The target label of the column in the dataset. - :param text_col: The input text column un the dataset. - :param num_of_train_samples: Max number of training samples, for debugging. - :param train_test_split_size: Should be between 0.0 and 1.0 and represent the proportion of the dataset to include - in the test split. - :param metrics: List of different metrics for evaluate the model such as f1, accuracy etc. - :param random_state: Random state for train_test_split - """ - - if train_test_split_size is None and test_set is None: - context.logger.info( - "'train_test_split_size' is not provided, setting train_test_split_size to 0.2" - ) - train_test_split_size = 0.2 - - # Creating tokenizer: - tokenizer = AutoTokenizer.from_pretrained(pretrained_tokenizer) - - def preprocess_function(examples): - return tokenizer(examples[text_col], truncation=True) - - # prepare data for training - if hf_dataset: - train_dataset, test_dataset = _prepare_dataset( - context, - hf_dataset, - label_name, - drop_columns, - num_of_train_samples, - train_test_split_size, - random_state=random_state, - ) - elif dataset: - # Get DataFrame by URL or by FeatureVector: - train_dataset, label_name = _get_dataframe( - context=context, - dataset=dataset, - label_columns=label_name, - drop_columns=drop_columns, - ) - if test_set: - test_dataset, _ = _get_dataframe( - context=context, - dataset=test_set, - label_columns=label_name, - drop_columns=drop_columns, - ) - else: - train_dataset, test_dataset = train_test_split( - train_dataset, - 
test_size=train_test_split_size, - random_state=random_state, - ) - train_dataset = Dataset.from_pandas(train_dataset) - test_dataset = Dataset.from_pandas(test_dataset) - else: - raise mlrun.errors.MLRunInvalidArgumentError( - "Training data was not provided. A training dataset is mandatory for training." - " Please provide a training set using one of the arguments 'hf_dataset' or 'dataset'." - ) - - # Mapping datasets with the tokenizer: - tokenized_train = train_dataset.map(preprocess_function, batched=True) - tokenized_test = test_dataset.map(preprocess_function, batched=True) - - # Creating data collator for batching: - data_collator = DataCollatorWithPadding(tokenizer=tokenizer) - - # Parsing kwargs: - train_kwargs = _get_sub_dict_by_prefix( - src=context.parameters, prefix_key=KWArgsPrefixes.TRAIN - ) - model_class_kwargs = _get_sub_dict_by_prefix( - src=context.parameters, prefix_key=KWArgsPrefixes.MODEL_CLASS - ) - - # Loading our pretrained model: - model_class_kwargs["pretrained_model_name_or_path"] = ( - model_class_kwargs.get("pretrained_model_name_or_path") or pretrained_model - ) - train_kwargs["hub_token"] = train_kwargs.get("hub_token") or pretrained_tokenizer - if not model_class_kwargs["pretrained_model_name_or_path"]: - raise mlrun.errors.MLRunRuntimeError( - "Must provide pretrained_model name as " - "function argument or in extra params" - ) - model = create_class(model_class).from_pretrained(**model_class_kwargs) - - # Preparing training arguments: - training_args = TrainingArguments( - **train_kwargs, - ) - - compute_metrics = _create_compute_metrics(metrics) if metrics else None - trainer = Trainer( - model=model, - args=training_args, - train_dataset=tokenized_train, - eval_dataset=tokenized_test, - tokenizer=tokenizer, - data_collator=data_collator, - compute_metrics=compute_metrics, - ) - - apply_mlrun(trainer, model_name=model_name) - - # Apply training with evaluation: - context.logger.info(f"training '{model_name}'") - trainer.train() 
- - -def _get_model_dir(model_uri: str): - model_file, _, _ = mlrun.artifacts.get_model(model_uri) - model_dir = tempfile.gettempdir() - # Unzip the Model: - with zipfile.ZipFile(model_file, "r") as zip_file: - zip_file.extractall(model_dir) - - return model_dir - - -def optimize( - model_path: str, - model_name: str = "optimized_model", - target_dir: str = "./optimized", - optimization_level: int = 1, -): - """ - Optimizing the transformer model using ONNX optimization. - - - :param model_path: The path of the model to optimize. - :param model_name: Name of the optimized model. - :param target_dir: The directory to save the ONNX model. - :param optimization_level: Optimization level performed by ONNX Runtime of the loaded graph. (default is 1) - """ - # We import these in the function scope so ONNX won't be mandatory for the other handlers: - from optimum.onnxruntime import ORTModelForSequenceClassification, ORTOptimizer - from optimum.onnxruntime.configuration import OptimizationConfig - - model_dir = _get_model_dir(model_uri=model_path) - # Creating configuration for optimization step: - optimization_config = OptimizationConfig(optimization_level=optimization_level) - - # Converting our pretrained model to an ONNX-Runtime model: - ort_model = ORTModelForSequenceClassification.from_pretrained( - model_dir, from_transformers=True - ) - - # Creating an ONNX-Runtime optimizer from ONNX model: - optimizer = ORTOptimizer.from_pretrained(ort_model) - - apply_mlrun(optimizer, model_name=model_name) - # Optimizing and saving the ONNX model: - optimizer.optimize(save_dir=target_dir, optimization_config=optimization_config) diff --git a/hugging_face_classifier_trainer/item.yaml b/hugging_face_classifier_trainer/item.yaml deleted file mode 100755 index 332902b3e..000000000 --- a/hugging_face_classifier_trainer/item.yaml +++ /dev/null @@ -1,33 +0,0 @@ -apiVersion: v1 -categories: -- deep-learning -- huggingface -- machine-learning -- model-training -description: Automatic 
train and optimize functions for HuggingFace framework -doc: '' -example: hugging_face_classifier_trainer.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: davids -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.6.1 -name: hugging_face_classifier_trainer -platformVersion: 3.5.5 -spec: - filename: hugging_face_classifier_trainer.py - handler: train - image: mlrun/mlrun - kind: job - requirements: - - onnx~=1.14.1 - - onnxruntime~=1.16.1 - - optimum~=1.6.4 - - transformers~=4.26.1 - - datasets~=2.10.1 - - scikit-learn~=1.0.2 -url: '' -version: 0.3.0 diff --git a/hugging_face_classifier_trainer/requirements.txt b/hugging_face_classifier_trainer/requirements.txt deleted file mode 100644 index 9d0db7b43..000000000 --- a/hugging_face_classifier_trainer/requirements.txt +++ /dev/null @@ -1,6 +0,0 @@ -onnx~=1.14.1 -onnxruntime~=1.16.1 -optimum~=1.6.4 -transformers~=4.26.1 -datasets~=2.10.1 -scikit-learn~=1.0.2 \ No newline at end of file diff --git a/hugging_face_classifier_trainer/test_hugging_face_classifier_trainer.py b/hugging_face_classifier_trainer/test_hugging_face_classifier_trainer.py deleted file mode 100644 index a5e0fee9b..000000000 --- a/hugging_face_classifier_trainer/test_hugging_face_classifier_trainer.py +++ /dev/null @@ -1,145 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import os - -import mlrun -import pytest -from mlrun import import_function - -REQUIRED_ENV_VARS = [ - "MLRUN_DBPATH", - "MLRUN_ARTIFACT_PATH", - "V3IO_USERNAME", - "V3IO_API", - "V3IO_ACCESS_KEY", -] - -ADDITIONAL_PARAM_FOR_TRAIN = { - "TRAIN_output_dir": "finetuning-sentiment-model-3000-samples", - "TRAIN_learning_rate": 2e-5, - "TRAIN_per_device_train_batch_size": 16, - "TRAIN_per_device_eval_batch_size": 16, - "TRAIN_num_train_epochs": 2, - "TRAIN_weight_decay": 0.01, - "TRAIN_push_to_hub": False, - "TRAIN_evaluation_strategy": "epoch", - "TRAIN_eval_steps": 1, - "TRAIN_logging_steps": 1, - "CLASS_num_labels": 2, -} - - -def _validate_environment_variables() -> bool: - """ - Checks that all required Environment variables are set. - """ - environment_keys = os.environ.keys() - return all(key in environment_keys for key in REQUIRED_ENV_VARS) - - -def _set_environment(env_file=None): - if env_file: - mlrun.set_env_from_file(env_file) - mlrun.get_or_create_project( - "hugging-face-classifier-trainer-test", context="./", user_project=True - ) - - -@pytest.mark.skipif( - condition=not _validate_environment_variables(), - reason="Project's environment variables are not set", -) -def test_train_sequence_classification(): - _set_environment() - - # Importing function: - fn = import_function("function.yaml") - - train_run = None - - try: - train_run = fn.run( - params={ - "hf_dataset": "Shayanvsf/US_Airline_Sentiment", - "drop_columns": [ - "airline_sentiment_confidence", - "negativereason_confidence", - ], - "pretrained_tokenizer": "distilbert-base-uncased", - "pretrained_model": "distilbert-base-uncased", - "model_class": "transformers.AutoModelForSequenceClassification", - "label_name": "airline_sentiment", - "num_of_train_samples": 100, - "metrics": ["accuracy", "f1"], - "random_state": 42, - **ADDITIONAL_PARAM_FOR_TRAIN, - }, - handler="train", - local=True, - ) - except Exception as exception: - print(f"- The test failed - raised the following error:\n- 
{exception}") - assert train_run and all( - key in train_run.outputs for key in ["model", "loss"] - ), "outputs should include more data" - - -@pytest.mark.skipif( - condition=not _validate_environment_variables(), - reason="Project's environment variables are not set", -) -def test_train_and_optimize_sequence_classification(): - _set_environment() - - # Importing function: - fn = import_function("function.yaml") - - train_run = None - optimize_run = None - - try: - train_run = fn.run( - params={ - "hf_dataset": "Shayanvsf/US_Airline_Sentiment", - "drop_columns": [ - "airline_sentiment_confidence", - "negativereason_confidence", - ], - "pretrained_tokenizer": "distilbert-base-uncased", - "pretrained_model": "distilbert-base-uncased", - "model_class": "transformers.AutoModelForSequenceClassification", - "label_name": "airline_sentiment", - "num_of_train_samples": 100, - "metrics": ["accuracy", "f1"], - "random_state": 42, - **ADDITIONAL_PARAM_FOR_TRAIN, - }, - handler="train", - local=True, - ) - - optimize_run = fn.run( - params={"model_path": train_run.outputs["model"]}, - handler="optimize", - local=True, - ) - except Exception as exception: - print(f"- The test failed - raised the following error:\n- {exception}") - assert train_run and all( - key in train_run.outputs for key in ["model", "loss"] - ), "outputs should include more data" - assert optimize_run and all( - key in optimize_run.outputs for key in ["model"] - ), "outputs should include more data" diff --git a/huggingface_auto_trainer/function.yaml b/huggingface_auto_trainer/function.yaml deleted file mode 100644 index 702a84016..000000000 --- a/huggingface_auto_trainer/function.yaml +++ /dev/null @@ -1,327 +0,0 @@ -kind: job -metadata: - name: huggingface-auto-trainer - tag: '' - hash: 55c9aa4a822780f7388819ccf633dfe26b31f02e - project: '' - labels: - author: Zeevr - categories: - - huggingface - - genai - - machine-learning - - model-training -spec: - command: '' - args: [] - image: mlrun/mlrun - 
build: - functionSourceCode:  - commands: [] - code_origin: '' - origin_filename: '' - requirements: [] - entry_points: - add_interface: - name: add_interface - doc: '' - parameters: - - name: cls - - name: obj - type: Trainer - - name: restoration - type: MLRunInterfaceRestorationType - default: null - outputs: [] - lineno: 70 - has_varargs: false - has_kwargs: false - mlrun_train: - name: mlrun_train - doc: '' - parameters: - - name: cls - outputs: [] - lineno: 80 - has_varargs: false - has_kwargs: false - wrapper: - name: wrapper - doc: '' - parameters: - - name: self - type: Trainer - outputs: [] - lineno: 81 - has_varargs: true - has_kwargs: true - on_epoch_begin: - name: on_epoch_begin - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 129 - has_varargs: false - has_kwargs: true - on_epoch_end: - name: on_epoch_end - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 140 - has_varargs: false - has_kwargs: true - on_log: - name: on_log - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - - name: logs - type: Dict[str, float] - default: null - outputs: [] - lineno: 151 - has_varargs: false - has_kwargs: true - on_train_begin: - name: on_train_begin - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 177 - has_varargs: false - has_kwargs: true - on_train_end: - name: on_train_end - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - - name: model - type: PreTrainedModel 
- default: null - - name: tokenizer - type: PreTrainedTokenizer - default: null - outputs: [] - lineno: 188 - has_varargs: false - has_kwargs: true - on_evaluate: - name: on_evaluate - doc: '' - parameters: - - name: self - - name: args - type: TrainingArguments - - name: state - type: TrainerState - - name: control - type: TrainerControl - outputs: [] - lineno: 201 - has_varargs: false - has_kwargs: true - log_metrics: - name: log_metrics - doc: '' - parameters: - - name: self - outputs: [] - lineno: 215 - has_varargs: false - has_kwargs: false - log_metric_plot: - name: log_metric_plot - doc: '' - parameters: - - name: self - - name: name - type: str - - name: scores - type: List[float] - outputs: [] - lineno: 222 - has_varargs: false - has_kwargs: false - apply_mlrun: - name: apply_mlrun - doc: This is temporary and will be built in mlrun 1.5.0 - parameters: - - name: trainer - type: Trainer - - name: model_name - type: str - default: null - - name: tag - type: str - default: '' - - name: context - type: MLClientCtx - default: null - - name: auto_log - type: bool - default: true - - name: labels - type: Dict[str, str] - default: null - - name: extra_data - type: dict - default: null - outputs: [] - lineno: 244 - has_varargs: false - has_kwargs: true - finetune_llm: - name: finetune_llm - doc: "Fine-tunes a Language Model (LLM) on a specific task using the provided\ - \ dataset.\n The function takes various configuration parameters to customize\ - \ the training process\n and adapt the model to specific tasks using a provided\ - \ dataset." - parameters: - - name: context - type: MLClientCtx - doc: mlrun context in order to log trained model - - name: train_dataset - type: Union[str, mlrun.datastore.DataItem] - doc: The train dataset used for fine-tuning the language model. - - name: eval_dataset - type: str - doc: The eval dataset used for evaluate the language model during training. 
- default: null - - name: train_load_dataset_kwargs - type: dict - doc: kwargs for dataset loading - default: {} - - name: eval_load_dataset_kwargs - type: dict - doc: kwargs for dataset loading - default: {} - - name: dataset_columns_to_train - type: Union[str, list] - doc: which columns to pass to the model as inputs - default: text - - name: model - type: Union[str, List[str]] - doc: a tuple containing model name and class, or str with model name or path - default: huggingface-model - - name: tokenizer - type: Union[str, List[str]] - doc: a tuple containing tokenizer name and class, or str with tokenizer name - or path - default: null - - name: deepspeed_config - type: Union[dict, bool] - doc: Configuration options for DeepSpeed (optional). - default: false - - name: quantization_config - type: Union[dict, bool] - doc: Configuration options for model quantization (optional). - default: false - - name: lora_config - type: Union[dict, bool] - doc: Configuration options for Low-Rank Approximation (LoRA) (optional). - default: false - - name: training_config - type: dict - doc: Configuration options specific to the fine-tuning training process (optional). - default: {} - - name: model_pretrained_config - type: dict - doc: config to load the pretrained model - default: {} - - name: tokenizer_pretrained_config - type: dict - doc: config to load the pretrained tokenizer - default: {} - - name: data_collator_config - type: dict - doc: Configuration options for data collation during training (optional). - default: {} - - name: task - type: str - doc: A description of the specific task the model is being fine-tuned for. 
- default: text-generation - - name: use_cuda - type: bool - doc: use gpu or not - default: true - - name: framework - type: str - doc: pt ot tf - default: pt - - name: device_map - type: str - default: auto - outputs: [] - lineno: 630 - has_varargs: false - has_kwargs: true - evaluate: - name: evaluate - doc: 'Evaluating the model using perplexity, for more information visit: - - https://huggingface.co/docs/transformers/perplexity' - parameters: - - name: context - doc: mlrun context - - name: model_path - doc: path to the model directory - - name: data - type: DataFrame - doc: the data to evaluate the model - - name: model_name - type: str - doc: name of base model - default: null - - name: tokenizer_name - type: str - doc: name of base tokenizer - default: null - outputs: [] - lineno: 784 - has_varargs: false - has_kwargs: false - description: fine-tune llm model with ease - default_handler: finetune_llm - disable_auto_mount: false - clone_target_dir: '' - env: [] - priority_class_name: '' - preemption_mode: prevent - affinity: null - tolerations: null - security_context: {} -verbose: false diff --git a/huggingface_auto_trainer/huggingface_auto_trainer.ipynb b/huggingface_auto_trainer/huggingface_auto_trainer.ipynb deleted file mode 100644 index 847fa98d6..000000000 --- a/huggingface_auto_trainer/huggingface_auto_trainer.ipynb +++ /dev/null @@ -1,195 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a2c5dc6d-33d0-4e74-a875-6eab556e3b2d", - "metadata": {}, - "source": [ - "# Llm auto trainer" - ] - }, - { - "cell_type": "markdown", - "id": "cc7aa261-17b2-4362-bf6a-34af79b0230b", - "metadata": {}, - "source": [ - "## Notebook Introduction: Fine-Tuning a Large Language Model with Ease\n", - "\n", - "Welcome to this example notebook that demonstrates a simplified yet powerful approach to fine-tuning a Large Language Model (LLM) effortlessly. 
Fine-tuning is a crucial technique that allows you to adapt pre-trained language models to specific tasks, making them more contextually relevant and useful.\n", - "\n", - "In this notebook, we will walk you through a step-by-step process of fine-tuning a state-of-the-art language model using a user-friendly and efficient method. You don't need to be an expert in machine learning or natural language processing to follow along – our approach focuses on simplicity and effectiveness." - ] - }, - { - "cell_type": "markdown", - "id": "425249e9-f43f-45e6-aa25-9f53099049cd", - "metadata": {}, - "source": [ - "### First, we will select the model we wish to fine-tune and take the matching tokenizer and appropriate config" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3410e9c2-0557-4961-995e-0ef0cc07bf82", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig\n", - "from transformers import logging\n", - "\n", - "logging.set_verbosity(\"CRITICAL\")\n", - "\n", - "model_name = \"tiiuae/falcon-7b\"\n", - "tokenizer = model_name\n", - "generation_config = GenerationConfig.from_pretrained(model_name)" - ] - }, - { - "cell_type": "markdown", - "id": "f33f3c35-cf61-4b0f-8da9-1c30d3b53230", - "metadata": {}, - "source": [ - "### Then, in order to use with mlrun, we will create an mlrun project and create an mlrun function" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a8ee7c35-adf7-4ed8-9e7e-e659b9461cd5", - "metadata": {}, - "outputs": [], - "source": [ - "import mlrun\n", - "\n", - "project = mlrun.get_or_create_project(\n", - " name=\"auto-trainer-test\",\n", - " context=\"./\",\n", - " user_project=True,\n", - " parameters={\n", - " \"default_image\": \"yonishelach/mlrun-llm\",\n", - " },\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d56b834f-adf6-4736-8de7-3348e050f561", - "metadata": {}, - 
"outputs": [], - "source": [ - "project.set_function(\n", - " \"auto-trainer.py\",\n", - " name=\"auto-trainer\",\n", - " kind=\"job\",\n", - " image=\"yonishelach/mlrun-llm\",\n", - " handler=\"finetune_llm\",\n", - ")\n", - "project.save()" - ] - }, - { - "cell_type": "markdown", - "id": "f42315db-6ddd-4dc1-89f3-c732f92d0d47", - "metadata": {}, - "source": [ - "### we can set the every config or parameter we want, including training arguments, hyper parameters and more, and pass to the function" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8e62e577-15fb-477d-9c56-fa9fb4c2669b", - "metadata": {}, - "outputs": [], - "source": [ - "import transformers\n", - "\n", - "training_arguments = {\n", - " \"per_device_train_batch_size\": 4,\n", - " \"gradient_accumulation_steps\": 1,\n", - " \"warmup_steps\": 2,\n", - " \"max_steps\": 10,\n", - " \"learning_rate\": 2e-4,\n", - " \"fp16\": True,\n", - " \"logging_steps\": 1,\n", - " \"optim\": \"paged_adamw_8bit\",\n", - "}" - ] - }, - { - "cell_type": "markdown", - "id": "284a5772-f88d-46c9-87bc-fc14e434c1b4", - "metadata": {}, - "source": [ - "### Now we simply run the function" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "11ab5888-5870-4bf8-9657-db930adecd77", - "metadata": {}, - "outputs": [], - "source": [ - "training_run = mlrun.run_function(\n", - " function=\"auto-trainer\",\n", - " name=\"auto-trainer\",\n", - " local=True,\n", - " params={\n", - " \"model\": (model_name, \"transformers.AutoModelForCausalLM\"),\n", - " \"tokenizer\": tokenizer,\n", - " \"train_dataset\": \"Abirate/english_quotes\",\n", - " \"training_config\": training_arguments,\n", - " \"quantization_config\": True,\n", - " \"lora_config\": True,\n", - " \"dataset_columns_to_train\": \"quote\",\n", - " \"lora_target_modules\": [\"query_key_value\"],\n", - " \"model_pretrained_config\": {\"trust_remote_code\": True, \"use_cache\": False},\n", - " },\n", - " handler=\"finetune_llm\",\n", - " 
outputs=[\"model\"],\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e674d25-5f1f-4ea8-af02-7d22c2fb6760", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7a4dfe9b-407a-43c0-9c5e-56de106477ac", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "mlrun-base", - "language": "python", - "name": "conda-env-mlrun-base-py" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.16" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/huggingface_auto_trainer/huggingface_auto_trainer.py b/huggingface_auto_trainer/huggingface_auto_trainer.py deleted file mode 100644 index d1166318c..000000000 --- a/huggingface_auto_trainer/huggingface_auto_trainer.py +++ /dev/null @@ -1,855 +0,0 @@ -import importlib -import os -import shutil -import tempfile -import zipfile -from abc import ABC -from typing import Dict, List, Tuple, Union - -import mlrun -import numpy as np -import pandas as pd -import peft -import torch -import transformers -from datasets import Dataset, load_dataset -from mlrun.artifacts.manager import Artifact, PlotlyArtifact -from mlrun.datastore import is_store_uri -from mlrun.frameworks._common import CommonTypes, MLRunInterface -from mlrun.utils import logger -from peft import (LoraConfig, PeftModel, get_peft_model, - prepare_model_for_kbit_training) -from plotly import graph_objects as go -from transformers import (AutoModelForCausalLM, AutoTokenizer, - BitsAndBytesConfig, DataCollatorForLanguageModeling, - PreTrainedModel, PreTrainedTokenizer, Trainer, - TrainerCallback, TrainerControl, TrainerState, - TrainingArguments) - -supported_tasks = [ - "question-answering", - "summarization", - 
"table-question-answering", - "text2text-generation", - "text-classification", - "sentiment-analysis", - "text-generation", - "token-classification", - "translation", - "translation_xx_to_yy", -] - - -class ConfigKeys: - deepspeed = "deepspeed" - quantization = "quantization" - lora = "lora" - training = "training" - tokenizer_pretrained = "tokenizer_pretrained" - model_pretrained = "model_pretrained" - data_collator = "data_collator" - - -# ----------------------from MLRUN-------------------------------- -class HFTrainerMLRunInterface(MLRunInterface, ABC): - """ - This is temporary and will be built in mlrun 1.5.0 - Interface for adding MLRun features for tensorflow keras API. - """ - - # MLRuns context default name: - DEFAULT_CONTEXT_NAME = "mlrun-huggingface" - - # Attributes to replace so the MLRun interface will be fully enabled. - _REPLACED_METHODS = [ - "train", - # "evaluate" - ] - - @classmethod - def add_interface( - cls, - obj: Trainer, - restoration: CommonTypes.MLRunInterfaceRestorationType = None, - ): - super(HFTrainerMLRunInterface, cls).add_interface( - obj=obj, restoration=restoration - ) - - @classmethod - def mlrun_train(cls): - def wrapper(self: Trainer, *args, **kwargs): - # Restore the evaluation method as `train` will use it: - # cls._restore_attribute(obj=self, attribute_name="evaluate") - - # Call the original fit method: - result = self.original_train(*args, **kwargs) - - # Replace the evaluation method again: - # cls._replace_function(obj=self, function_name="evaluate") - - return result - - return wrapper - - -class MLRunCallback(TrainerCallback): - """ - This is temporary and will be built in mlrun 1.5.0 - Callback for collecting logs during training / evaluation of the `Trainer` API. 
- """ - - def __init__( - self, - context: mlrun.MLClientCtx = None, - model_name: str = "model", - tag: str = "", - labels: Dict[str, str] = None, - extra_data: dict = None, - ): - super().__init__() - - # Store the configurations: - self._context = ( - context - if context is not None - else mlrun.get_or_create_ctx("./mlrun-huggingface") - ) - self._model_name = model_name - self._tag = tag - self._labels = labels - self._extra_data = extra_data if extra_data is not None else {} - - # Set up the logging mode: - self._is_training = False - self._steps: List[List[int]] = [] - self._metric_scores: Dict[str, List[float]] = {} - self._artifacts: Dict[str, Artifact] = {} - - def on_epoch_begin( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - if not state.is_world_process_zero: - return - self._steps.append([]) - - def on_epoch_end( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - if not state.is_world_process_zero: - return - self.log_metrics() - - def on_log( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - logs: Dict[str, float] = None, - **kwargs, - ): - if not state.is_world_process_zero: - return - recent_logs = state.log_history[-1].copy() - - recent_logs.pop("epoch") - current_step = int(recent_logs.pop("step")) - if current_step not in self._steps[-1]: - self._steps[-1].append(current_step) - - for metric_name, metric_score in recent_logs.items(): - if metric_name.startswith("train_"): - if metric_name.split("train_")[1] not in self._metric_scores: - self._metric_scores[metric_name] = [metric_score] - continue - if metric_name not in self._metric_scores: - self._metric_scores[metric_name] = [] - self._metric_scores[metric_name].append(metric_score) - - def on_train_begin( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - if not state.is_world_process_zero: - 
return - self._is_training = True - - def on_train_end( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - model: PreTrainedModel = None, - tokenizer: PreTrainedTokenizer = None, - **kwargs, - ): - if not state.is_world_process_zero: - return - self.log_metrics() - - def on_evaluate( - self, - args: TrainingArguments, - state: TrainerState, - control: TrainerControl, - **kwargs, - ): - if not state.is_world_process_zero: - return - self.log_metrics() - - if self._is_training: - return - - def log_metrics(self): - for metric_name, metric_scores in self._metric_scores.items(): - self._context.log_result(key=metric_name, value=metric_scores[-1]) - if len(metric_scores) > 1: - self.log_metric_plot(name=metric_name, scores=metric_scores) - self._context.commit(completed=False) - - def log_metric_plot(self, name: str, scores: List[float]): - # Initialize a plotly figure: - metric_figure = go.Figure() - - # Add titles: - metric_figure.update_layout( - title=name.capitalize().replace("_", " "), - xaxis_title="Samples", - yaxis_title="Scores", - ) - - # Draw: - metric_figure.add_trace( - go.Scatter(x=np.arange(len(scores)), y=scores, mode="lines") - ) - - # Create the plotly artifact: - artifact_name = f"{name}_plot" - artifact = PlotlyArtifact(key=artifact_name, figure=metric_figure) - self._artifacts[artifact_name] = self._context.log_artifact(artifact) - - -def apply_mlrun( - trainer: transformers.Trainer, - model_name: str = None, - tag: str = "", - context: mlrun.MLClientCtx = None, - auto_log: bool = True, - labels: Dict[str, str] = None, - extra_data: dict = None, - **kwargs, -): - """ - This is temporary and will be built in mlrun 1.5.0 - """ - # Get parameters defaults: - if context is None: - context = mlrun.get_or_create_ctx(HFTrainerMLRunInterface.DEFAULT_CONTEXT_NAME) - - HFTrainerMLRunInterface.add_interface(obj=trainer) - - if auto_log: - trainer.add_callback( - MLRunCallback( - context=context, - model_name=model_name, - 
tag=tag, - labels=labels, - extra_data=extra_data, - ) - ) - - -# ----------------------end from MLRUN-------------------------------- - - -def _print_trainable_parameters(model): - """ - Prints the number of trainable parameters in the model. - """ - trainable_params = 0 - all_param = 0 - for _, param in model.named_parameters(): - all_param += param.numel() - if param.requires_grad: - trainable_params += param.numel() - print( - f"trainable params: {trainable_params} || all params: {all_param} || trainable%:" - f" {100 * trainable_params / all_param}" - ) - - -# default configs -# will be used if user provides "True" with config name as input -QUANTIZATION_CONFIG = transformers.BitsAndBytesConfig( - load_in_4bit=True, - bnb_4bit_use_double_quant=True, - bnb_4bit_quant_type="nf4", - bnb_4bit_compute_dtype=torch.bfloat16, -) - -LORA_CONFIG = peft.LoraConfig( - r=8, - lora_alpha=32, - target_modules=["query_key_value"], - lora_dropout=0.05, - bias="none", - task_type="CAUSAL_LM", -) - -DEEPSPEED_CONFIG = { - "train_micro_batch_size_per_gpu": "auto", - "fp16": {"enabled": True}, - "autotuning": { - "enabled": True, - "arg_mappings": { - "train_micro_batch_size_per_gpu": "--per_device_train_batch_size", - "gradient_accumulation_steps ": "--gradient_accumulation_steps", - }, - }, - "zero_optimization": { - "stage": 2, - }, -} - - -def _update_config(src: dict, dst: dict): - """ - update configs according to user, this way the user can add/modify values in default configs for e.g. - - goes over all configs and corresponding prefixes, collect all the keys from the given dict that start - with the prefix and add them to appropriate config - - :param src: dict of all candidate values to update dict. - :param dst: dict containing all configs to update. 
- """ - - for config_name, config in dst.items(): - - # If given True we use default dict - # Can also be False or a config dict given from user, so we check specifically fo True - if config is True and config_name == "quantization": - config = QUANTIZATION_CONFIG - - if config is True and config_name == "lora": - config = LORA_CONFIG - - if config is True and config_name == "deepspeed": - config = DEEPSPEED_CONFIG - - # in some cases we can get a boolean value, in that case no need to look for args - if isinstance(config, bool): - config = None - - elif isinstance(config, dict): - for key, val in src.items(): - if key.startswith(config_name): - config[key.replace(f"{config_name}_", "")] = val - - # update by config name - else: - for key, val in src.items(): - if key.startswith(config_name): - setattr(config, key.replace(f"{config_name}_", ""), val) - - dst.update({config_name: config}) - - -def _get_class_object(class_path: str) -> type: - """ - given a full class name, this function returns the correct class - - :param class_path: a full class name (ex. 
'transformers.AutoModelForCausalLM') - - :return the wanted class object - """ - module_path, class_name = class_path.rsplit(".", 1) - module = importlib.import_module(module_path) - return getattr(module, class_name) - - -def _set_model_and_tokenizer( - model: Union[str, List[str]], - tokenizer: Union[str, List[str]], - task: str, - framework: str, - lora_config: dict, - quantization_config: dict, - use_cuda: bool, - tokenizer_pretrained_config, - model_pretrained_config, - device_map: str, -): - """ - get the correct model and tokenizer according to given user inputs - - :param model: a tuple containing model name and class, or str with model name or path - :param tokenizer: a tuple containing tokenizer name and class, or str with tokenizer name or path - :param task: a supported nlp task, used to choose model if not provided - :param framework: pt or tf - :param lora_config: lora config or None, to load model in appropriate way - :param quantization_config: quantization config or None, to load model in appropriate way - :param use_cuda: use gpu or not - :param tokenizer_pretrained_config: config to load the pretrained tokenizer - :param model_pretrained_config: config to load the pretrained model - :param device_map: a device map for model training if using number of gpu's - - :returns: model and tokenizer - """ - # if task is not supported and no model was given we can't choose one - if task and task not in supported_tasks and not model: - logger.error("unsupported task option chosen") - raise - - # load model from store - if isinstance(model, str) and is_store_uri(model): - pass - # TODO: load both model and tokenizer and return, need guy's help - - # if it's a tuple them we assume it contains of both name and class - if isinstance(model, list): - model_name, model_class = model - model_class = _get_class_object(model_class) - - # in the case we don't get the model class we need the task in order to choose the correct model - else: - if task is None: - 
logger.error("task must be chosen in order to determine the correct model") - raise Exception( - "this function requires either a supported task or a model and model class to be chosen" - ) - - _, available_classes, task_options = transformers.pipelines.check_task(task) - - if isinstance(model, str): - model_name = model - - # if model is not given, we take the default model for the given task - else: - model_name, _ = transformers.pipelines.get_default_model_and_revision( - available_classes, framework, task_options - ) - if not available_classes.get(framework, tuple()): - logger.error( - "given task's default model is not supported in specified framework" - ) - raise Exception( - "this function requires either a supported task or a model and model class to be chosen" - ) - - model_class = available_classes[framework][0] - - # load the pretrained model - if use_cuda: - device_map = device_map - else: - device_map = None - - model = model_class.from_pretrained( - model_name, - quantization_config=quantization_config, - device_map=device_map, - **model_pretrained_config, - ) - - # If quantization config is given we will load a quantized model, if not a regular one - if quantization_config: - model.gradient_checkpointing_enable() - model = peft.prepare_model_for_kbit_training(model) - - # If lora config was given we want to do lora fine tune, we update model here - if lora_config: - model = peft.get_peft_model(model, lora_config) - - # if not specified we choose the default tokenizer that corresponding to the model - if tokenizer is None: - tokenizer = transformers.AutoTokenizer.from_pretrained(model_name) - return model_name, model, tokenizer - - if isinstance(tokenizer, str): - tokenizer_name = tokenizer - tokenizer_class = transformers.AutoTokenizer - - # if it's not a str then it's a tuple of both name and class - else: - tokenizer_name, tokenizer_class = tokenizer - tokenizer_class = _get_class_object(tokenizer_class) - - tokenizer = 
tokenizer_class.from_pretrained( - tokenizer_name, **tokenizer_pretrained_config - ) - - tokenizer.pad_token = tokenizer.eos_token - - return model_name, model, tokenizer - - -def _dataset_loader(dataset: str, is_train: bool = True, **kwargs) -> Dataset: - """ - loads the specific dataset provided by the user - - :param dataset: name or path of dataset to load - :param is_train: bool that indicates the purpose of the dataset - :param kwargs: other kwargs for loading the dataset - - :returns: loaded dataset - """ - # if split in kwargs then the user decides how to split the dataset - if "split" in kwargs: - return load_dataset(dataset, **kwargs) - - # if it's a dataset for train we split with train - if is_train: - return load_dataset(dataset, split="train", **kwargs) - - # if it's eval dataset, then a lot of names are acceptable for the set and we check all of them - dataset = load_dataset(dataset, **kwargs) - if "test" in dataset: - return dataset.get("test") - elif "eval" in dataset: - return dataset.get("eval") - elif "validation" in dataset: - return dataset.get("validation") - - -def _prepare_dataset( - train_dataset: str, - eval_dataset: str, - train_load_dataset_kwargs, - eval_load_dataset_kwargs, - tokenizer, - dataset_columns_to_train: Union[str, list], -) -> (Dataset, Union[Dataset, None]): - """ - Loads the train and eval datasets (if provided) passes them through the tokenizer and - returns them ready to use in training - - :param train_dataset: the name or path to the train dataset - :param eval_dataset: the name or path to the eval dataset - :param dataset_columns_to_train: which columns to pass to the model as inputs - (need to pass through the tokenizer first) - :param train_load_dataset_kwargs: kwargs for dataset loading - :param eval_load_dataset_kwargs: kwargs for dataset loading - :param tokenizer: the tokenizer to pass the data through - - :returns: tokenized datasets - """ - if not tokenizer.pad_token: - tokenizer.pad_token = 
tokenizer.eos_token - - # we take col name/s in a list for easy generalization - if isinstance(dataset_columns_to_train, str): - dataset_columns_to_train = [dataset_columns_to_train] - - if isinstance(train_dataset, mlrun.datastore.DataItem): - train_dataset = Dataset.from_pandas(train_dataset.as_df()) - return ( - train_dataset.map( - lambda examples: tokenizer( - *[examples[col] for col in dataset_columns_to_train], - truncation=True, - padding=True, - ), - batched=True, - ), - None, - ) - - # Load datasets - # if provided two paths/names we load each separately using designated func - if eval_dataset: - train_dataset = _dataset_loader( - dataset=train_dataset, is_train=True, **train_load_dataset_kwargs - ) - eval_dataset = _dataset_loader( - dataset=eval_dataset, is_train=False, **eval_load_dataset_kwargs - ) - - # if only on path is given then we must check if it contains both dataset or if only one should be used - else: - dataset = load_dataset(train_dataset, **train_load_dataset_kwargs) - if "train" in dataset: - train_dataset = dataset.get("train") - if "test" in dataset: - eval_dataset = dataset.get("test") - elif "eval" in dataset: - eval_dataset = dataset.get("eval") - elif "validation" in dataset: - eval_dataset = dataset.get("validation") - else: - # only train dataset given, tokenize and return it - return ( - train_dataset.map( - lambda examples: tokenizer( - *[examples[col] for col in dataset_columns_to_train], - truncation=True, - padding=True, - ), - batched=True, - ), - None, - ) - else: - logger.error("train dataset is mandatory") - raise KeyError("no train dataset found in given dataset") - - # Tokenize the data so the model can understand it - tokenized_train_dataset = train_dataset.map( - lambda examples: tokenizer( - *[examples[col] for col in dataset_columns_to_train], - truncation=True, - padding=True, - ), - batched=True, - ) - - tokenized_eval_dataset = eval_dataset.map( - lambda examples: tokenizer( - *[examples[col] for col in 
dataset_columns_to_train], - truncation=True, - padding=True, - ), - batched=True, - ) - - return tokenized_train_dataset, tokenized_eval_dataset - - -def finetune_llm( - context: mlrun.MLClientCtx, - train_dataset: Union[str, mlrun.datastore.DataItem], - eval_dataset: str = None, - train_load_dataset_kwargs: dict = {}, - eval_load_dataset_kwargs: dict = {}, - dataset_columns_to_train: Union[str, list] = "text", - model: Union[str, List[str]] = "huggingface-model", - tokenizer: Union[str, List[str]] = None, - deepspeed_config: Union[dict, bool] = False, - quantization_config: Union[dict, bool] = False, - lora_config: Union[dict, bool] = False, - training_config: dict = {}, - model_pretrained_config: dict = {}, - tokenizer_pretrained_config: dict = {}, - data_collator_config: dict = {}, - task: str = "text-generation", - use_cuda: bool = True, - framework: str = "pt", - device_map: str = "auto", - **kwargs, -): - """ - Fine-tunes a Language Model (LLM) on a specific task using the provided dataset. - The function takes various configuration parameters to customize the training process - and adapt the model to specific tasks using a provided dataset. - - :param context: mlrun context in order to log trained model - :param dataset_columns_to_train: which columns to pass to the model as inputs - :param eval_load_dataset_kwargs: kwargs for dataset loading - :param train_load_dataset_kwargs: kwargs for dataset loading - :param framework: pt ot tf - :param use_cuda: use gpu or not - :param tokenizer_pretrained_config: config to load the pretrained tokenizer - :param model_pretrained_config: config to load the pretrained model - :param tokenizer: a tuple containing tokenizer name and class, or str with tokenizer name or path - :param model: a tuple containing model name and class, or str with model name or path - :param train_dataset: The train dataset used for fine-tuning the language model. 
- :param eval_dataset: The eval dataset used for evaluate the language model during training. - :param deepspeed_config: Configuration options for DeepSpeed (optional). - :param quantization_config: Configuration options for model quantization (optional). - :param lora_config: Configuration options for Low-Rank Approximation (LoRA) (optional). - :param training_config: Configuration options specific to the fine-tuning training process (optional). - :param data_collator_config: Configuration options for data collation during training (optional). - :param task: A description of the specific task the model is being fine-tuned for. - :param kwargs: Additional keyword arguments. - """ - - # TODO: match forward.keyword to dataset.keyword - check if relevant in new design - # TODO: add warning for label, and add option to modify dataset col names - check if relevant in new design - - # Look for updates to configs given in kwargs - configs = { - ConfigKeys.deepspeed: deepspeed_config, - ConfigKeys.quantization: quantization_config, - ConfigKeys.lora: lora_config, - ConfigKeys.training: training_config, - ConfigKeys.model_pretrained: model_pretrained_config, - ConfigKeys.tokenizer_pretrained: tokenizer_pretrained_config, - ConfigKeys.data_collator: data_collator_config, - } - _update_config(dst=configs, src=kwargs) - - # check gpu permission and availability - if use_cuda: - if torch.cuda.is_available(): - # Clean gpu cache - torch.cuda.empty_cache() - else: - logger.warning("'use_cuda' is set to True, but no cuda device is available") - - # get model and tokenizer - model_name, model, tokenizer = _set_model_and_tokenizer( - model=model, - tokenizer=tokenizer, - task=task, - framework=framework, - lora_config=configs[ConfigKeys.lora], - quantization_config=configs[ConfigKeys.quantization], - use_cuda=use_cuda, - tokenizer_pretrained_config=tokenizer_pretrained_config, - model_pretrained_config=configs[ConfigKeys.model_pretrained], - device_map=device_map, - ) - - # Load 
datasets - tokenized_train, tokenized_eval = _prepare_dataset( - train_dataset=train_dataset, - eval_dataset=eval_dataset, - train_load_dataset_kwargs=train_load_dataset_kwargs, - eval_load_dataset_kwargs=eval_load_dataset_kwargs, - tokenizer=tokenizer, - dataset_columns_to_train=dataset_columns_to_train, - ) - - # Initialize the data collator for the trainer to use in order to create batches of data - data_collator = transformers.DataCollatorForLanguageModeling( - tokenizer=tokenizer, mlm=False, **data_collator_config - ) - - # Initialize training kwargs from user kwargs: - train_kwargs = configs[ConfigKeys.training] - - # If deepspeed config given we add it to training kwargs - if configs[ConfigKeys.deepspeed]: - train_kwargs["deepspeed"] = configs[ConfigKeys.deepspeed] - - # Take a look at the trainable parameters in the model - _print_trainable_parameters(model) - - # Preparing training arguments: - training_args = transformers.TrainingArguments( - output_dir=tempfile.mkdtemp(), - **train_kwargs, - ) - - trainer = transformers.Trainer( - model=model, - train_dataset=tokenized_train, - eval_dataset=tokenized_eval, - tokenizer=tokenizer, - data_collator=data_collator, - args=training_args, - ) - - apply_mlrun(trainer, model_name=model_name.split("/")[-1]) - model.config.use_cache = ( - False # silence the warnings. Please re-enable for inference! 
- ) - - # Apply training with evaluation: - context.logger.info(f"training '{model_name}'") - trainer.train() - - temp_directory = tempfile.TemporaryDirectory().name - trainer.save_model(temp_directory) - - # Zip the model directory: - shutil.make_archive( - base_name="model", - format="zip", - root_dir=temp_directory, - ) - - # Log the model: - context.log_model( - key="model", - db_key=model_name.split("/")[-1], - model_file="model.zip", - tag="", - framework="Hugging Face", - ) - - -def evaluate( - context, - model_path, - data: pd.DataFrame, - model_name: str = None, - tokenizer_name: str = None, -): - """ - Evaluating the model using perplexity, for more information visit: - https://huggingface.co/docs/transformers/perplexity - - :param context: mlrun context - :param model_path: path to the model directory - :param data: the data to evaluate the model - :param model_name: name of base model - :param tokenizer_name: name of base tokenizer - """ - # Get the model artifact and file: - ( - model_file, - model_artifact, - extra_data, - ) = mlrun.artifacts.get_model(model_path) - - # Read the name: - _model_name = model_artifact.spec.db_key - - # Extract logged model files: - model_directory = os.path.join(os.path.dirname(model_file), _model_name) - with zipfile.ZipFile(model_file, "r") as zip_file: - zip_file.extractall(model_directory) - - # Loading the saved pretrained tokenizer and model: - dataset = Dataset.from_pandas(data) - tokenizer = AutoTokenizer.from_pretrained(tokenizer_name) - pad_token_id = tokenizer.eos_token_id - model = AutoModelForCausalLM.from_pretrained( - model_name, device_map="cuda:0", trust_remote_code=True, load_in_8bit=True - ) - model = PeftModel.from_pretrained(model, model_directory) - model.eval() - encodings = tokenizer("\n\n".join(dataset["text"][:5]), return_tensors="pt") - - max_length = 1024 - stride = 512 - seq_len = encodings.input_ids.size(1) - - nlls = [] - prev_end_loc = 0 - for begin_loc in range(0, seq_len, stride): - 
end_loc = min(begin_loc + max_length, seq_len) - trg_len = end_loc - prev_end_loc # may be different from stride on last loop - input_ids = encodings.input_ids[:, begin_loc:end_loc] - target_ids = input_ids.clone() - target_ids[:, :-trg_len] = -100 - - with torch.no_grad(): - outputs = model(input_ids.cuda(), labels=target_ids) - - # loss is calculated using CrossEntropyLoss which averages over valid labels - # N.B. the model only calculates loss over trg_len - 1 labels, because it internally shifts the labels - # to the left by 1. - neg_log_likelihood = outputs.loss - - nlls.append(neg_log_likelihood) - - prev_end_loc = end_loc - if end_loc == seq_len: - break - - ppl = torch.exp(torch.stack(nlls).mean()).item() - context.log_result("perplexity", ppl) diff --git a/huggingface_auto_trainer/item.yaml b/huggingface_auto_trainer/item.yaml deleted file mode 100644 index b7c9bbcca..000000000 --- a/huggingface_auto_trainer/item.yaml +++ /dev/null @@ -1,27 +0,0 @@ -apiVersion: v1 -categories: -- huggingface -- genai -- machine-learning -- model-training -description: fine-tune llm model with ease -doc: '' -example: huggingface_auto_trainer.ipynb -generationDate: 2023-08-21:17-25 -hidden: false -icon: '' -labels: - author: Zeevr -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.4.0 -name: huggingface-auto-trainer -platformVersion: 3.5.0 -spec: - filename: huggingface_auto_trainer.py - handler: finetune_llm - image: mlrun/mlrun - kind: job - requirements: [] -url: '' -version: 1.1.0 diff --git a/huggingface_auto_trainer/requirements.txt b/huggingface_auto_trainer/requirements.txt deleted file mode 100644 index 1376b1d00..000000000 --- a/huggingface_auto_trainer/requirements.txt +++ /dev/null @@ -1,5 +0,0 @@ -peft -transformers -torch -datasets -plotly diff --git a/huggingface_auto_trainer/test_huggingface_auto_trainer.py b/huggingface_auto_trainer/test_huggingface_auto_trainer.py deleted file mode 100644 index 53576e4e7..000000000 --- 
a/huggingface_auto_trainer/test_huggingface_auto_trainer.py +++ /dev/null @@ -1,42 +0,0 @@ -import tempfile - -import mlrun - - -def test_train(): - - model_name = "distilgpt2" - tokenizer = model_name - auto_trainer = mlrun.import_function("function.yaml") - - training_arguments = { - "per_device_train_batch_size": 4, - "gradient_accumulation_steps": 1, - "warmup_steps": 2, - "max_steps": 10, - "learning_rate": 2e-4, - "logging_steps": 1, - } - - params = { - "model": (model_name, "transformers.AutoModelForCausalLM"), - "tokenizer": tokenizer, - "train_dataset": "Abirate/english_quotes", - "training_config": training_arguments, - "dataset_columns_to_train": "quote", - "model_pretrained_config": {"use_cache": False}, - "use_cuda": False, - } - - try: - with tempfile.TemporaryDirectory() as test_directory: - auto_trainer.run( - local=True, - params=params, - handler="finetune_llm", - returns=["model"], - workdir=test_directory, - ) - - except Exception as exception: - print(f"- The training failed - raised the following error:\n- {exception}") diff --git a/ingest/function.yaml b/ingest/function.yaml deleted file mode 100644 index a05ca6698..000000000 --- a/ingest/function.yaml +++ /dev/null @@ -1,87 +0,0 @@ -kind: job -metadata: - name: ingest - tag: '' - hash: 7e28700a86ebdd18d887fe588492201a1e3ef2f6 - project: '' - labels: - author: yonish - categories: - - data-preparation - - data-analysis - - feature-store -spec: - command: '' - args: [] - image: mlrun/mlrun - build: - functionSourceCode: 
ZnJvbSB0eXBpbmcgaW1wb3J0IFVuaW9uLCBMaXN0LCBEaWN0CgppbXBvcnQgbWxydW4uZmVhdHVyZV9zdG9yZSBhcyBmcwpmcm9tIG1scnVuLmV4ZWN1dGlvbiBpbXBvcnQgTUxDbGllbnRDdHgKZnJvbSBtbHJ1bi5kYXRhX3R5cGVzIGltcG9ydCBJbmZlck9wdGlvbnMKCgpkZWYgaW5nZXN0KAogICAgY29udGV4dDogTUxDbGllbnRDdHgsCiAgICBmZWF0dXJlc2V0OiBzdHIsCiAgICBzb3VyY2U6IHN0ciwKICAgIHRhcmdldHM6IExpc3RbVW5pb25bc3RyLCBEaWN0XV0gPSBOb25lLAogICAgbmFtZXNwYWNlPU5vbmUsCiAgICBpbmZlcl9vcHRpb25zPU5vbmUsCiAgICBydW5fY29uZmlnOiBVbmlvbltzdHIsIERpY3RdID0gTm9uZSwKICAgIHNwYXJrX2NvbnRleHQ9Tm9uZSwKICAgIG92ZXJ3cml0ZT1Ob25lLAopOgogICAgIiIiUmVhZCBsb2NhbCBEYXRhRnJhbWUsIGZpbGUsIFVSTCwgb3Igc291cmNlIGludG8gdGhlIGZlYXR1cmUgc3RvcmUKICAgIEluZ2VzdCByZWFkcyBmcm9tIHRoZSBzb3VyY2UsIHJ1biB0aGUgZ3JhcGggdHJhbnNmb3JtYXRpb25zLCBpbmZlcnMgIG1ldGFkYXRhIGFuZCBzdGF0cwogICAgYW5kIHdyaXRlcyB0aGUgcmVzdWx0cyB0byB0aGUgZGVmYXVsdCBvZiBzcGVjaWZpZWQgdGFyZ2V0cwoKICAgIHdoZW4gdGFyZ2V0cyBhcmUgbm90IHNwZWNpZmllZCBkYXRhIGlzIHN0b3JlZCBpbiB0aGUgY29uZmlndXJlZCBkZWZhdWx0IHRhcmdldHMKICAgICh3aWxsIHVzdWFsbHkgYmUgTm9TUUwgZm9yIHJlYWwtdGltZSBhbmQgUGFycXVldCBmb3Igb2ZmbGluZSkuCgogICAgZXhhbXBsZTo6CgogICAgICAgIHN0b2Nrc19zZXQgPSBGZWF0dXJlU2V0KCJzdG9ja3MiLCBlbnRpdGllcz1bRW50aXR5KCJ0aWNrZXIiKV0pCiAgICAgICAgc3RvY2tzID0gcGQucmVhZF9jc3YoInN0b2Nrcy5jc3YiKQogICAgICAgIGRmID0gaW5nZXN0KHN0b2Nrc19zZXQsIHN0b2NrcywgaW5mZXJfb3B0aW9ucz1mc3RvcmUuSW5mZXJPcHRpb25zLmRlZmF1bHQoKSkKCiAgICAgICAgIyBmb3IgcnVubmluZyBhcyByZW1vdGUgam9iCiAgICAgICAgY29uZmlnID0gUnVuQ29uZmlnKGltYWdlPSdtbHJ1bi9tbHJ1bicpLmFwcGx5KG1vdW50X3YzaW8oKSkKICAgICAgICBkZiA9IGluZ2VzdChzdG9ja3Nfc2V0LCBzdG9ja3MsIHJ1bl9jb25maWc9Y29uZmlnKQoKICAgICAgICAjIHNwZWNpZnkgc291cmNlIGFuZCB0YXJnZXRzCiAgICAgICAgc291cmNlID0gQ1NWU291cmNlKCJteWNzdiIsIHBhdGg9Im1lYXN1cmVtZW50cy5jc3YiKQogICAgICAgIHRhcmdldHMgPSBbQ1NWVGFyZ2V0KCJteWNzdiIsIHBhdGg9Ii4vbXljc3YuY3N2IildCiAgICAgICAgaW5nZXN0KG1lYXN1cmVtZW50cywgc291cmNlLCB0YXJnZXRzKQoKICAgIDpwYXJhbSBjb250ZXh0OiAgICAgICBNTFJ1biBjb250ZXh0CiAgICA6cGFyYW0gZmVhdHVyZXNldDogICAgZmVhdHVyZSBzZXQgb2JqZWN0IG9yIGZlYXR1cmVzZXQudXJpLiAodXJpIG11c3QgYmUgb2YgYSBmZWF0dXJl
IHNldCB0aGF0IGlzIGluIHRoZSBEQiwKICAgICAgICAgICAgICAgICAgICAgICAgICBjYWxsIGAuc2F2ZSgpYCBpZiBpdCdzIG5vdCkKICAgIDpwYXJhbSBzb3VyY2U6ICAgICAgICBzb3VyY2UgZGF0YWZyYW1lIG9yIGZpbGUgcGF0aAogICAgOnBhcmFtIHRhcmdldHM6ICAgICAgIG9wdGlvbmFsIGxpc3Qgb2YgZGF0YSB0YXJnZXQgb2JqZWN0cwogICAgOnBhcmFtIG5hbWVzcGFjZTogICAgIG5hbWVzcGFjZSBvciBtb2R1bGUgY29udGFpbmluZyBncmFwaCBjbGFzc2VzCiAgICA6cGFyYW0gaW5mZXJfb3B0aW9uczogc2NoZW1hIGFuZCBzdGF0cyBpbmZlciBvcHRpb25zCiAgICA6cGFyYW0gcnVuX2NvbmZpZzogICAgZnVuY3Rpb24gYW5kL29yIHJ1biBjb25maWd1cmF0aW9uIGZvciByZW1vdGUgam9icywKICAgICAgICAgICAgICAgICAgICAgICAgICBzZWUgOnB5OmNsYXNzOmB+bWxydW4uZmVhdHVyZV9zdG9yZS5SdW5Db25maWdgCiAgICA6cGFyYW0gc3BhcmtfY29udGV4dDogbG9jYWwgc3Bhcmsgc2Vzc2lvbiBmb3Igc3BhcmsgaW5nZXN0aW9uLCBleGFtcGxlIGZvciBjcmVhdGluZyB0aGUgc3BhcmsgY29udGV4dDoKICAgICAgICAgICAgICAgICAgICAgICAgICBgc3BhcmsgPSBTcGFya1Nlc3Npb24uYnVpbGRlci5hcHBOYW1lKCJTcGFyayBmdW5jdGlvbiIpLmdldE9yQ3JlYXRlKClgCiAgICAgICAgICAgICAgICAgICAgICAgICAgRm9yIHJlbW90ZSBzcGFyayBpbmdlc3Rpb24sIHRoaXMgc2hvdWxkIGNvbnRhaW4gdGhlIHJlbW90ZSBzcGFyayBzZXJ2aWNlIG5hbWUKICAgIDpwYXJhbSBvdmVyd3JpdGU6ICAgICBkZWxldGUgdGhlIHRhcmdldHMnIGRhdGEgcHJpb3IgdG8gaW5nZXN0aW9uCiAgICAgICAgICAgICAgICAgICAgICAgICAgKGRlZmF1bHQ6IFRydWUgZm9yIG5vbi1zY2hlZHVsZWQgaW5nZXN0IC0gZGVsZXRlcyB0aGUgdGFyZ2V0cyB0aGF0IGFyZSBhYm91dCB0byBiZSBpbmdlc3RlZC4KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgRmFsc2UgZm9yIHNjaGVkdWxlZCBpbmdlc3QgLSBkb2VzIG5vdCBkZWxldGUgdGhlIHRhcmdldCkKCiAgICAiIiIKICAgICMgU2V0dGluZyBpbmZlcl9vcHRpb25zIHRvIGRlZmF1bHQ6CiAgICBjb250ZXh0Ll9wYXJhbWV0ZXJzWyJpbmZlcl9vcHRpb25zIl0gPSBpbmZlcl9vcHRpb25zIG9yIEluZmVyT3B0aW9ucy5kZWZhdWx0KCkKCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKGYiQ2FsbGluZyBpbmdlc3Rpb24gdGFzayB3aXRoOiB7ZmVhdHVyZXNldH0iKQoKICAgICMgaW5nZXN0IGNhbGxlZCB3aXRoIG1scnVuX2NvbnRleHQsIGZlYXR1cmVfc2V0LCBzb3VyY2UgYW5kIHRhcmdldHMgcGFzc2VkIHdpdGggY29udGV4dAogICAgIyBUaGlzIHBhcmFtcyBoZXJlIGZvciBkb2N1bWVudGF0aW9uIHB1cnBvc2VzIG9ubHkKICAgIGZzLmluZ2VzdCgKICAgICAgICBtbHJ1bl9jb250ZXh0PWNvbnRleHQsCiAgICAgICAgbmFtZXNwYWNlPW5hbWVzcGFjZSwKICAgICAgICBz
cGFya19jb250ZXh0PXNwYXJrX2NvbnRleHQsCiAgICApCiAgICBjb250ZXh0LmxvZ19yZXN1bHQoImZlYXR1cmVzZXQiLCBmZWF0dXJlc2V0KQo= - commands: [] - code_origin: https://github.com/mlrun/functions.git#886a88217c2a2570c81a14877f9c1dfb1ac8a244:C:\Users\yonatans\projects\functions\ingest\ingest.py - origin_filename: C:\Users\yonatans\projects\functions\ingest\ingest.py - entry_points: - ingest: - name: ingest - doc: "Read local DataFrame, file, URL, or source into the feature store\nIngest\ - \ reads from the source, run the graph transformations, infers metadata and\ - \ stats\nand writes the results to the default of specified targets\n\nwhen\ - \ targets are not specified data is stored in the configured default targets\n\ - (will usually be NoSQL for real-time and Parquet for offline).\n\nexample::\n\ - \n stocks_set = FeatureSet(\"stocks\", entities=[Entity(\"ticker\")])\n\ - \ stocks = pd.read_csv(\"stocks.csv\")\n df = ingest(stocks_set, stocks,\ - \ infer_options=fstore.InferOptions.default())\n\n # for running as remote\ - \ job\n config = RunConfig(image='mlrun/mlrun').apply(mount_v3io())\n \ - \ df = ingest(stocks_set, stocks, run_config=config)\n\n # specify source\ - \ and targets\n source = CSVSource(\"mycsv\", path=\"measurements.csv\"\ - )\n targets = [CSVTarget(\"mycsv\", path=\"./mycsv.csv\")]\n ingest(measurements,\ - \ source, targets)" - parameters: - - name: context - type: MLClientCtx - doc: MLRun context - default: '' - - name: featureset - type: str - doc: feature set object or featureset.uri. 
(uri must be of a feature set that - is in the DB, call `.save()` if it's not) - default: '' - - name: source - type: str - doc: source dataframe or file path - default: '' - - name: targets - type: List[Union[str, Dict]] - doc: optional list of data target objects - default: null - - name: namespace - doc: namespace or module containing graph classes - default: null - - name: infer_options - doc: schema and stats infer options - default: null - - name: run_config - type: Union[str, Dict] - doc: function and/or run configuration for remote jobs, see :py:class:`~mlrun.feature_store.RunConfig` - default: null - - name: spark_context - doc: 'local spark session for spark ingestion, example for creating the spark - context: `spark = SparkSession.builder.appName("Spark function").getOrCreate()` - For remote spark ingestion, this should contain the remote spark service - name' - default: null - - name: overwrite - doc: 'delete the targets'' data prior to ingestion (default: True for non-scheduled - ingest - deletes the targets that are about to be ingested. False for scheduled - ingest - does not delete the target)' - default: null - outputs: - - default: '' - lineno: 8 - description: Feature Store ingest function that runs the transformation graph on - the source of the featureset. - default_handler: ingest - disable_auto_mount: false - env: [] - priority_class_name: '' - affinity: null -verbose: false diff --git a/ingest/ingest.ipynb b/ingest/ingest.ipynb deleted file mode 100644 index 7da398b4f..000000000 --- a/ingest/ingest.ipynb +++ /dev/null @@ -1,762 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Feature Store Ingest" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Read local DataFrame, file, URL, or source into the feature store\n", - "Ingest reads from the source, run the graph transformations, infers metadata and stats\n", - "and writes the results to the default of specified targets." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Creating Project" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import mlrun\n", - "import mlrun.feature_store as fstore\n", - "from mlrun.datastore.sources import CSVSource\n", - "from mlrun.feature_store.steps import *\n", - "from mlrun.features import MinMaxValidator\n", - "import pandas as pd\n", - "import datetime" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2022-01-31 13:52:16,939 [info] loaded project ingest from MLRun DB\n" - ] - } - ], - "source": [ - "# Initialize the MLRun project object\n", - "project = mlrun.get_or_create_project('ingest', context=\"./\", user_project=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Create Sample Data For Demo" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "quotes = pd.DataFrame(\n", - " {\n", - " \"time\": [\n", - " pd.Timestamp(\"2016-05-25 13:30:00.023\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.023\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.030\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.041\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.048\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.049\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.072\"),\n", - " pd.Timestamp(\"2016-05-25 13:30:00.075\"),\n", - " ],\n", - " \"ticker\": [\"GOOG\", \"MSFT\", \"MSFT\", \"MSFT\", \"GOOG\", \"AAPL\", \"GOOG\", \"MSFT\"],\n", - " \"bid\": [720.50, 51.95, 51.97, 51.99, 720.50, 97.99, 720.50, 52.01],\n", - " \"ask\": [720.93, 51.96, 51.98, 52.00, 720.93, 98.01, 720.88, 52.03],\n", - " }\n", - ")\n", - "\n", - "# move date:\n", - "max_date = quotes[\"time\"].max()\n", - "now_date = datetime.datetime.now()\n", - "delta = now_date - max_date\n", - "quotes[\"time\"] = 
quotes[\"time\"] + delta" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
\n", - "
" - ], - "text/plain": [ - " time ticker bid ask\n", - "0 2022-01-31 13:52:16.905388 GOOG 720.50 720.93\n", - "1 2022-01-31 13:52:16.905388 MSFT 51.95 51.96\n", - "2 2022-01-31 13:52:16.912388 MSFT 51.97 51.98\n", - "3 2022-01-31 13:52:16.923388 MSFT 51.99 52.00\n", - "4 2022-01-31 13:52:16.930388 GOOG 720.50 720.93\n", - "5 2022-01-31 13:52:16.931388 AAPL 97.99 98.01\n", - "6 2022-01-31 13:52:16.954388 GOOG 720.50 720.88\n", - "7 2022-01-31 13:52:16.957388 MSFT 52.01 52.03" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "quotes" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Build Advanced Feature Set - With Feature Engineering Pipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Define a custom pipeline step (python class)" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "class MyMap(MapClass):\n", - " def __init__(self, multiplier=1, **kwargs):\n", - " super().__init__(**kwargs)\n", - " self._multiplier = multiplier\n", - "\n", - " def do(self, event):\n", - " event[\"multi\"] = event[\"bid\"] * self._multiplier\n", - " return event" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Build and show the transformatiom pipeline" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "data": { - "image/svg+xml": [ - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "mlrun-flow\n", - "\n", - "\n", - "\n", - "_start\n", - "\n", - "start\n", - "\n", - "\n", - "\n", - "map.MyMap\n", - "\n", - "map.MyMap\n", - "\n", - "\n", - "\n", - "_start->map.MyMap\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "storey.Extend\n", - "\n", - "storey.Extend\n", - "\n", - "\n", - "\n", - "map.MyMap->storey.Extend\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "filter\n", - "\n", - "filter\n", - "\n", - "\n", - "\n", - 
"storey.Extend->filter\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "FeaturesetValidator\n", - "\n", - "FeaturesetValidator\n", - "\n", - "\n", - "\n", - "filter->FeaturesetValidator\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "Aggregates\n", - "\n", - "Aggregates\n", - "\n", - "\n", - "\n", - "FeaturesetValidator->Aggregates\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "parquet\n", - "\n", - "\n", - "parquet\n", - "\n", - "\n", - "\n", - "Aggregates->parquet\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "nosql\n", - "\n", - "\n", - "nosql\n", - "\n", - "\n", - "\n", - "Aggregates->nosql\n", - "\n", - "\n", - "\n", - "\n", - "\n" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "quotes_set = fstore.FeatureSet(\"stock-quotes\", entities=[fstore.Entity(\"ticker\")])\n", - "\n", - "quotes_set.graph.to(\"map.MyMap\", multiplier=3).to(\n", - " \"storey.Extend\", _fn=\"({'extra': event['bid'] * 77})\"\n", - ").to(\"storey.Filter\", \"filter\", _fn=\"(event['bid'] > 51.92)\").to(\n", - " FeaturesetValidator()\n", - ")\n", - "\n", - "quotes_set.add_aggregation(\"ask\", [\"sum\", \"max\"], \"1h\", \"10m\", name=\"asks1\")\n", - "quotes_set.add_aggregation(\"ask\", [\"sum\", \"max\"], \"5h\", \"10m\", name=\"asks5\")\n", - "quotes_set.add_aggregation(\"bid\", [\"min\", \"max\"], \"1h\", \"10m\", name=\"bids\")\n", - "\n", - "# add feature validation policy\n", - "quotes_set[\"bid\"] = fstore.Feature(\n", - " validator=MinMaxValidator(min=52, severity=\"info\")\n", - ")\n", - "\n", - "# add default target definitions and plot\n", - "quotes_set.set_targets()\n", - "quotes_set.plot(rankdir=\"LR\", with_targets=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Saving the feature set in the feature store " - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "quotes_set.save()" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "Creating the data source of the feature set to apply the ingest on:" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "data_uri = 'quotes.csv'\n", - "quotes.to_csv(data_uri, index=False)\n", - "source = CSVSource('quotes', data_uri).to_dict()\n", - "source" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Import ingest function" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "ingest_fn = mlrun.import_function(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Running the function locally" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2022-01-31 13:52:17,201 [info] starting run ingest-ingest uid=4bd5d12691a8439d90bf53847f59df1a DB=http://mlrun-api:8080\n", - "> 2022-01-31 13:52:17,354 [info] Ingesting the FeatureSet: store://feature-sets/ingest-yonatan/stock-quotes\n", - "> 2022-01-31 13:52:17,427 [info] starting ingestion task to store://feature-sets/ingest-yonatan/stock-quotes:latest.\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 13:52:19.466055 args={'min': 52, 'value': 51.95}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 13:52:19.466072 args={'min': 52, 'value': 51.97}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 13:52:19.466085 args={'min': 52, 'value': 51.99}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 13:52:19.671677 args={'min': 52, 'value': 51.95}\n", - "info! bid value is smaller than min, key=['MSFT'] time=2022-01-31 13:52:19.671692 args={'min': 52, 'value': 51.97}\n", - "info! 
bid value is smaller than min, key=['MSFT'] time=2022-01-31 13:52:19.671708 args={'min': 52, 'value': 51.99}\n", - "> 2022-01-31 13:52:19,915 [info] ingestion task completed, targets:\n", - "> 2022-01-31 13:52:19,915 [info] [{'name': 'parquet', 'kind': 'parquet', 'path': 'v3io:///projects/ingest-yonatan/FeatureStore/stock-quotes/parquet/sets/stock-quotes-latest', 'status': 'created', 'updated': '2022-01-31T13:52:19.649303+00:00', 'last_written': datetime.datetime(2022, 1, 31, 13, 52, 19, 671753)}, {'name': 'nosql', 'kind': 'nosql', 'path': 'v3io:///projects/ingest-yonatan/FeatureStore/stock-quotes/nosql/sets/stock-quotes-latest', 'status': 'created', 'updated': '2022-01-31T13:52:19.650044+00:00'}]\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
projectuiditerstartstatenamelabelsinputsparametersresultsartifacts
ingest-yonatan0Jan 31 13:52:17completedingest-ingest
v3io_user=yonatan
kind=
owner=yonatan
host=jupyter-yoni-647b99c95d-w4jlc
featureset=store://feature-sets/ingest-yonatan/stock-quotes
source={'kind': 'csv', 'name': 'quotes', 'path': 'quotes.csv'}
infer_options=63
overwrite=None
targets=None
featureset=store://feature-sets/ingest-yonatan/stock-quotes
\n", - "
\n", - "
\n", - "
\n", - " Title\n", - " ×\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2022-01-31 13:52:20,045 [info] run executed, status=completed\n" - ] - } - ], - "source": [ - "ingest_run = ingest_fn.run(\n", - " handler=\"ingest\",\n", - " params={\n", - " \"featureset\": quotes_set.uri,\n", - " \"source\": source,\n", - " },\n", - " local=True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## View of the targets' state after run" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'created'" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fstore.get_feature_set(ingest_run.outputs['featureset']).status.state" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/ingest/ingest.py b/ingest/ingest.py deleted file mode 100644 index 1412cbaf5..000000000 --- a/ingest/ingest.py +++ /dev/null @@ -1,84 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -from typing import Union, List, Dict - -import mlrun.feature_store as fs -from mlrun.execution import MLClientCtx -from mlrun.data_types import InferOptions - - -def ingest( - context: MLClientCtx, - featureset: str, - source: str, - targets: List[Union[str, Dict]] = None, - namespace=None, - infer_options=None, - run_config: Union[str, Dict] = None, - spark_context=None, - overwrite=None, -): - """Read local DataFrame, file, URL, or source into the feature store - Ingest reads from the source, run the graph transformations, infers metadata and stats - and writes the results to the default of specified targets - - when targets are not specified data is stored in the configured default targets - (will usually be NoSQL for real-time and Parquet for offline). - - example:: - - stocks_set = FeatureSet("stocks", entities=[Entity("ticker")]) - stocks = pd.read_csv("stocks.csv") - df = ingest(stocks_set, stocks, infer_options=fstore.InferOptions.default()) - - # for running as remote job - config = RunConfig(image='mlrun/mlrun').apply(mount_v3io()) - df = ingest(stocks_set, stocks, run_config=config) - - # specify source and targets - source = CSVSource("mycsv", path="measurements.csv") - targets = [CSVTarget("mycsv", path="./mycsv.csv")] - ingest(measurements, source, targets) - - :param context: MLRun context - :param featureset: feature set object or featureset.uri. 
(uri must be of a feature set that is in the DB, - call `.save()` if it's not) - :param source: source dataframe or file path - :param targets: optional list of data target objects - :param namespace: namespace or module containing graph classes - :param infer_options: schema and stats infer options - :param run_config: function and/or run configuration for remote jobs, - see :py:class:`~mlrun.feature_store.RunConfig` - :param spark_context: local spark session for spark ingestion, example for creating the spark context: - `spark = SparkSession.builder.appName("Spark function").getOrCreate()` - For remote spark ingestion, this should contain the remote spark service name - :param overwrite: delete the targets' data prior to ingestion - (default: True for non-scheduled ingest - deletes the targets that are about to be ingested. - False for scheduled ingest - does not delete the target) - - """ - # Setting infer_options to default: - context._parameters["infer_options"] = infer_options or InferOptions.default() - - context.logger.info(f"Calling ingestion task with: {featureset}") - - # ingest called with mlrun_context, feature_set, source and targets passed with context - # This params here for documentation purposes only - fs.ingest( - mlrun_context=context, - namespace=namespace, - spark_context=spark_context, - ) - context.log_result("featureset", featureset) diff --git a/ingest/item.yaml b/ingest/item.yaml deleted file mode 100644 index 8665e88f4..000000000 --- a/ingest/item.yaml +++ /dev/null @@ -1,27 +0,0 @@ -apiVersion: v1 -categories: -- data-preparation -- data-analysis -- feature-store -description: Feature Store ingest function that runs the transformation graph on the - source of the featureset. 
-doc: '' -example: ingest.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: yonish -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: ingest -platformVersion: 3.5.0 -spec: - filename: ingest.py - handler: ingest - image: mlrun/mlrun - kind: job - requirements: [] -url: '' -version: 1.1.0 diff --git a/ingest/test_ingest.py b/ingest/test_ingest.py deleted file mode 100644 index 224f520b4..000000000 --- a/ingest/test_ingest.py +++ /dev/null @@ -1,171 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -import os -import tempfile -import shutil -import datetime -import pytest - -import mlrun -import mlrun.feature_store as fstore -from mlrun.datastore.sources import CSVSource -from mlrun.feature_store.steps import * -from mlrun.features import MinMaxValidator -import pandas as pd - -REQUIRED_ENV_VARS = [ - "MLRUN_DBPATH", - "MLRUN_ARTIFACT_PATH", - "V3IO_USERNAME", - "V3IO_API", - "V3IO_ACCESS_KEY", -] - - -def _validate_environment_variables() -> bool: - """ - Checks that all required Environment variables are set. 
- """ - environment_keys = os.environ.keys() - return all(key in environment_keys for key in REQUIRED_ENV_VARS) - - -def _set_environment(): - artifact_path = tempfile.TemporaryDirectory().name - os.makedirs(artifact_path) - project = mlrun.new_project("ingest-test") - return artifact_path, project - - -def _cleanup_environment(artifact_path: str): - """ - Cleanup the test environment, deleting files and artifacts created during the test. - - :param artifact_path: The artifact path to delete. - """ - # Clean the local directory: - for test_output in [ - *os.listdir(artifact_path), - "schedules", - "runs", - "artifacts", - "functions", - ]: - test_output_path = os.path.abspath(f"./{test_output}") - if os.path.exists(test_output_path): - if os.path.isdir(test_output_path): - shutil.rmtree(test_output_path) - else: - os.remove(test_output_path) - - # Clean the artifacts' directory: - shutil.rmtree(artifact_path) - - -def create_dataframes() -> (pd.DataFrame, pd.DataFrame): - quotes = pd.DataFrame( - { - "time": [ - pd.Timestamp("2016-05-25 13:30:00.023"), - pd.Timestamp("2016-05-25 13:30:00.023"), - pd.Timestamp("2016-05-25 13:30:00.030"), - pd.Timestamp("2016-05-25 13:30:00.041"), - pd.Timestamp("2016-05-25 13:30:00.048"), - pd.Timestamp("2016-05-25 13:30:00.049"), - pd.Timestamp("2016-05-25 13:30:00.072"), - pd.Timestamp("2016-05-25 13:30:00.075"), - ], - "ticker": ["GOOG", "MSFT", "MSFT", "MSFT", "GOOG", "AAPL", "GOOG", "MSFT"], - "bid": [720.50, 51.95, 51.97, 51.99, 720.50, 97.99, 720.50, 52.01], - "ask": [720.93, 51.96, 51.98, 52.00, 720.93, 98.01, 720.88, 52.03], - } - ) - - # move date: - max_date = quotes["time"].max() - now_date = datetime.datetime.now() - delta = now_date - max_date - quotes["time"] = quotes["time"] + delta - - return quotes - - -class MyMap(MapClass): - def __init__(self, multiplier=1, **kwargs): - super().__init__(**kwargs) - self._multiplier = multiplier - - def do(self, event): - event["multi"] = event["bid"] * self._multiplier - return 
event - - -def _create_feature_set(): - quotes_set = fstore.FeatureSet("stock-quotes", entities=[fstore.Entity("ticker")]) - - quotes_set.graph.to("test_ingest.MyMap", multiplier=3).to( - "storey.Extend", _fn="({'extra': event['bid'] * 77})" - ).to("storey.Filter", "filter", _fn="(event['bid'] > 51.92)").to( - FeaturesetValidator() - ) - - quotes_set.add_aggregation("ask", ["sum", "max"], "1h", "10m", name="asks1") - quotes_set.add_aggregation("ask", ["sum", "max"], "5h", "10m", name="asks5") - quotes_set.add_aggregation("bid", ["min", "max"], "1h", "10m", name="bids") - - # add feature validation policy - quotes_set["bid"] = fstore.Feature( - validator=MinMaxValidator(min=52, severity="info") - ) - - # add default target definitions - quotes_set.set_targets() - return quotes_set - - -@pytest.mark.skipif( - condition=not _validate_environment_variables(), - reason="Project's environment variables are not set", -) -def test_ingest(): - artifact_path, project = _set_environment() - ingest_fn = mlrun.import_function("function.yaml") - quotes = create_dataframes() - - quotes_set = _create_feature_set() - quotes_set.save() - - data_uri = os.path.join(artifact_path, "quotes.csv") - quotes.to_csv(data_uri, index=False) - source = CSVSource("quotes", data_uri).to_dict() - - ingest_run = None - try: - ingest_run = ingest_fn.run( - handler="ingest", - params={ - "featureset": quotes_set.uri, - "source": source, - }, - local=True, - ) - - except Exception as exception: - print(f"- The test failed - raised the following error:\n- {exception}") - assert ( - fstore.get_feature_set(ingest_run.outputs["featureset"]).status.state - == "created" - ), "Targets not created successfully" - _cleanup_environment(artifact_path) diff --git a/load_dask/function.yaml b/load_dask/function.yaml deleted file mode 100644 index a0f73c5fe..000000000 --- a/load_dask/function.yaml +++ /dev/null @@ -1,75 +0,0 @@ -kind: dask -metadata: - name: load-dask - tag: '' - hash: 
2af86b4c6ce0bc3e3d0468c1b66a5358482f383e - project: '' - labels: - author: yjb - categories: - - data-preparation - - etl -spec: - command: '' - image: mlrun/ml-models - env: [] - build: - functionSourceCode: ZnJvbSBtbHJ1bi5leGVjdXRpb24gaW1wb3J0IE1MQ2xpZW50Q3R4CmZyb20gbWxydW4uZGF0YXN0b3JlIGltcG9ydCBEYXRhSXRlbQoKZnJvbSB0eXBpbmcgaW1wb3J0IExpc3QsIE9wdGlvbmFsCgoKZGVmIGxvYWRfZGFzaygKICAgICAgICBjb250ZXh0OiBNTENsaWVudEN0eCwKICAgICAgICBzcmNfZGF0YTogRGF0YUl0ZW0sCiAgICAgICAgZGFza19rZXk6IHN0ciA9ICJkYXNrX2tleSIsCiAgICAgICAgaW5jX2NvbHM6IE9wdGlvbmFsW0xpc3Rbc3RyXV0gPSBOb25lLAogICAgICAgIGluZGV4X2NvbHM6IE9wdGlvbmFsW0xpc3Rbc3RyXV0gPSBOb25lLAogICAgICAgIGRhc2tfcGVyc2lzdDogYm9vbCA9IFRydWUsCiAgICAgICAgcmVmcmVzaF9kYXRhOiBib29sID0gVHJ1ZSwKICAgICAgICBzY2hlZHVsZXJfa2V5OiBzdHIgPSAic2NoZWR1bGVyIgopIC0+IE5vbmU6CiAgICAiIiJMb2FkIGRhdGFzZXQgaW50byBhbiBleGlzdGluZyBkYXNrIGNsdXN0ZXIKCiAgICBkYXNrIGpvYnMgZGVmaW5lIHRoZSBkYXNrIGNsaWVudCBwYXJhbWV0ZXJzIGF0IHRoZSBqb2IgbGV2ZWwsIHRoaXMgbWV0aG9kIHdpbGwgcmFpc2UgYW4gZXJyb3IgaWYgbm8gY2xpZW50IGlzIGRldGVjdGVkLgoKICAgIDpwYXJhbSBjb250ZXh0OiAgICAgICAgIHRoZSBmdW5jdGlvbiBjb250ZXh0CiAgICA6cGFyYW0gc3JjX2RhdGE6ICAgICAgICB1cmwgb2YgdGhlIGRhdGEgZmlsZSBvciBwYXJ0aXRpb25lZCBkYXRhc2V0IGFzIGVpdGhlcgogICAgICAgICAgICAgICAgICAgICAgICAgICAgYXJ0aWZhY3QgRGF0YUl0ZW0sIHN0cmluZywgb3IgcGF0aCBvYmplY3QgKHNpbWlsYXIgdG8KICAgICAgICAgICAgICAgICAgICAgICAgICAgIHBhbmRhcyByZWFkX2NzdikKICAgIDpwYXJhbSBkYXNrX2tleTogICAgICAgIGRlc3RpbmF0aW9uIGtleSBvZiBkYXRhIG9uIGRhc2sgY2x1c3RlciBhbmQgYXJ0aWZhY3Qgc3RvcmUKICAgIDpwYXJhbSBpbmNfY29sczogICAgICAgIGluY2x1ZGUgb25seSB0aGVzZSBjb2x1bW5zICh2ZXJ5IGZhc3QpCiAgICA6cGFyYW0gaW5kZXhfY29sczogICAgICBsaXN0IG9mIGluZGV4IGNvbHVtbiBuYW1lcyAoY2FuIGJlIGEgbG9uZy1ydW5uaW5nIHByb2Nlc3MpCiAgICA6cGFyYW0gZGFza19wZXJzaXN0OiAgICAoVHJ1ZSkgc2hvdWxkIHRoZSBkYXRhIGJlIHBlcnNpc3RlZCAodGhyb3VnaCB0aGUgYGNsaWVudC5wZXJzaXN0YCBvcCkKICAgIDpwYXJhbSByZWZyZXNoX2RhdGE6ICAgIChGYWxzZSkgaWYgdGhlIGRhc2tfa2V5IGFscmVhZHkgZXhpc3RzIGluIHRoZSBkYXNrIGNsdXN0ZXIsIHRoaXMgd2lsbAogICAgICAgICAgICAgICAgICAgICAgICAgICAgcmFpc2U
gYW4gRXhjZXB0aW9uLiAgU2V0IHRvIFRydWUgdG8gcmVwbGFjZSB0aGUgZXhpc3RpbmcgY2x1c3RlciBkYXRhLgogICAgOnBhcmFtIHNjaGVkdWxlcl9rZXk6ICAgKHNjaGVkdWxlcikgdGhlIGRhc2sgc2NoZWR1bGVyIGNvbmZpZ3VyYXRpb24sIGpzb24gYWxzbyBsb2dnZWQgYXMgYW4gYXJ0aWZhY3QKICAgICIiIgogICAgaWYgaGFzYXR0cihjb250ZXh0LCAiZGFza19jbGllbnQiKToKICAgICAgICBkYXNrX2NsaWVudCA9IGNvbnRleHQuZGFza19jbGllbnQKICAgIGVsc2U6CiAgICAgICAgcmFpc2UgRXhjZXB0aW9uKCJhIGRhc2sgY2xpZW50IHdhcyBub3QgZm91bmQgaW4gdGhlIGV4ZWN1dGlvbiBjb250ZXh0IikKCiAgICBkZiA9IHNyY19kYXRhLmFzX2RmKGRmX21vZHVsZT1kZCkKCiAgICBpZiBkYXNrX3BlcnNpc3Q6CiAgICAgICAgZGYgPSBkYXNrX2NsaWVudC5wZXJzaXN0KGRmKQogICAgICAgIGlmIGRhc2tfY2xpZW50LmRhdGFzZXRzIGFuZCBkYXNrX2tleSBpbiBkYXNrX2NsaWVudC5kYXRhc2V0czoKICAgICAgICAgICAgZGFza19jbGllbnQudW5wdWJsaXNoX2RhdGFzZXQoZGFza19rZXkpCiAgICAgICAgZGFza19jbGllbnQucHVibGlzaF9kYXRhc2V0KGRmLCBuYW1lPWRhc2tfa2V5KQoKICAgIGlmIGNvbnRleHQ6CiAgICAgICAgY29udGV4dC5kYXNrX2NsaWVudCA9IGRhc2tfY2xpZW50CgogICAgIyBzaGFyZSB0aGUgc2NoZWR1bGVyLCB3aGV0aGVyIGRhdGEgaXMgcGVyc2lzdGVkIG9yIG5vdAogICAgZGFza19jbGllbnQud3JpdGVfc2NoZWR1bGVyX2ZpbGUoc2NoZWR1bGVyX2tleSArICIuanNvbiIpCgogICAgIyB3ZSBkb24ndCB1c2UgbG9nX2RhdGFzZXQgaGVyZSB1bnRpbCBpdCBjYW4gdGFrZSBpbnRvIGFjY291bnQKICAgICMgZGFzayBvcmlnaW4gYW5kIGFwcGx5IGRhc2sgZGVzY3JpYmUuCiAgICBjb250ZXh0LmxvZ19hcnRpZmFjdChzY2hlZHVsZXJfa2V5LCBsb2NhbF9wYXRoPXNjaGVkdWxlcl9rZXkgKyAiLmpzb24iKQ== - commands: [] - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/load_dask/load_dask.py - default_handler: load_dask - entry_points: - load_dask: - name: load_dask - doc: 'Load dataset into an existing dask cluster - - - dask jobs define the dask client parameters at the job level, this method - will raise an error if no client is detected.' 
- parameters: - - name: context - type: MLClientCtx - doc: the function context - default: '' - - name: src_data - type: DataItem - doc: url of the data file or partitioned dataset as either artifact DataItem, - string, or path object (similar to pandas read_csv) - default: '' - - name: dask_key - type: str - doc: destination key of data on dask cluster and artifact store - default: dask_key - - name: inc_cols - type: Optional[List[str]] - doc: include only these columns (very fast) - default: null - - name: index_cols - type: Optional[List[str]] - doc: list of index column names (can be a long-running process) - default: null - - name: dask_persist - type: bool - doc: (True) should the data be persisted (through the `client.persist` op) - default: true - - name: refresh_data - type: bool - doc: (False) if the dask_key already exists in the dask cluster, this will - raise an Exception. Set to True to replace the existing cluster data. - default: true - - name: scheduler_key - type: str - doc: (scheduler) the dask scheduler configuration, json also logged as an - artifact - default: scheduler - outputs: - - default: '' - lineno: 7 - description: load dask cluster with data - remote: true - nthreads: 1 - min_replicas: 0 - max_replicas: 16 - scheduler_timeout: 60 minutes - affinity: null -verbose: false diff --git a/load_dask/item.yaml b/load_dask/item.yaml deleted file mode 100644 index 3d923e370..000000000 --- a/load_dask/item.yaml +++ /dev/null @@ -1,25 +0,0 @@ -apiVersion: v1 -categories: -- data-preparation -- etl -description: load dask cluster with data -doc: '' -example: load_dask.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: yjb -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: load-dask -platformVersion: 3.5.0 -spec: - filename: load_dask.py - handler: load_dask - image: mlrun/ml-models - kind: dask - requirements: [] -url: '' -version: 1.1.0 diff --git a/load_dask/load_dask.ipynb 
b/load_dask/load_dask.ipynb deleted file mode 100644 index 3dfcdddb5..000000000 --- a/load_dask/load_dask.ipynb +++ /dev/null @@ -1,309 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# load dask cluster with data\n", - "load a parquet dataset into a dask cluster" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%nuclio config kind = \"dask\"\n", - "%nuclio config spec.image = \"mlrun/ml-models\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import json\n", - "import numpy as np\n", - "import pandas as pd\n", - "\n", - "import dask\n", - "import dask.dataframe as dd\n", - "from dask.distributed import Client, LocalCluster\n", - "\n", - "from mlrun.execution import MLClientCtx\n", - "from mlrun.datastore import DataItem\n", - "\n", - "from typing import List, Optional\n", - "\n", - "def load_dask(\n", - " context: MLClientCtx,\n", - " src_data: DataItem,\n", - " dask_key: str = \"dask_key\",\n", - " inc_cols: Optional[List[str]] = None,\n", - " index_cols: Optional[List[str]] = None,\n", - " dask_persist: bool = True,\n", - " refresh_data: bool = True,\n", - " scheduler_key: str = \"scheduler\"\n", - ") -> None:\n", - " \"\"\"Load dataset into an existing dask cluster\n", - " \n", - " dask jobs define the dask client parameters at the job level, this method will raise an error if no client is detected.\n", - " \n", - " :param context: the function context\n", - " :param src_data: url of the data file or partitioned dataset as either\n", - " artifact DataItem, string, or path object (similar to \n", - " pandas read_csv)\n", - " :param dask_key: destination key of data on dask cluster and artifact store\n", - " :param 
inc_cols: include only these columns (very fast)\n", - " :param index_cols: list of index column names (can be a long-running process)\n", - " :param dask_persist: (True) should the data be persisted (through the `client.persist` op)\n", - " :param refresh_data: (False) if the dask_key already exists in the dask cluster, this will \n", - " raise an Exception. Set to True to replace the existing cluster data.\n", - " :param scheduler_key: (scheduler) the dask scheduler configuration, json also logged as an artifact\n", - " \"\"\"\n", - " if hasattr(context, \"dask_client\"):\n", - " dask_client = context.dask_client\n", - " else:\n", - " raise Exception(\"a dask client was not found in the execution context\")\n", - " \n", - " df = src_data.as_df(df_module=dd)\n", - "\n", - " if dask_persist:\n", - " df = dask_client.persist(df)\n", - " if dask_client.datasets and dask_key in dask_client.datasets:\n", - " dask_client.unpublish_dataset(dask_key)\n", - " dask_client.publish_dataset(df, name=dask_key)\n", - " \n", - " if context:\n", - " context.dask_client = dask_client\n", - " \n", - " # share the scheduler, whether data is persisted or not\n", - " dask_client.write_scheduler_file(scheduler_key+\".json\")\n", - " \n", - " # we don't use log_dataset here until it can take into account\n", - " # dask origin and apply dask describe.\n", - " context.log_artifact(scheduler_key, local_path=scheduler_key+\".json\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### mlconfig" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import mlconf\n", - "import os\n", - "\n", - "mlconf.dbpath = mlconf.dbpath or 'http://mlrun-api:8080'\n", - "mlconf.artifact_path = mlconf.artifact_path or f'{os.environ[\"HOME\"]}/artifacts'" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "### save" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import code_to_function \n", - "# create job function object from notebook code\n", - "fn = code_to_function(\"load_dask\", handler='load_dask')\n", - "\n", - "# add metadata (for templates and reuse)\n", - "fn.spec.description = \"load dask cluster with data\"\n", - "fn.metadata.categories = [\"data-movement\", \"utils\"]\n", - "fn.metadata.labels = {\"author\": \"yjb\"}\n", - "fn.spec.remote = True\n", - "fn.spec.replicas = 4\n", - "fn.spec.max_replicas = 8\n", - "fn.spec.service_type = \"NodePort\"\n", - "fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### test" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# load function from marketplacen\n", - "from mlrun import import_function\n", - "\n", - "# vcs_branch = 'development'\n", - "# base_vcs = f'https://raw.githubusercontent.com/mlrun/functions/{vcs_branch}/'\n", - "# mlconf.hub_url = mlconf.hub_url or base_vcs + f'{name}/function.yaml'\n", - "# fn = import_function(\"hub://load_dask\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "if \"V3IO_HOME\" in list(os.environ):\n", - " from mlrun import mount_v3io\n", - " fn.apply(mount_v3io())\n", - "else:\n", - " # is you set up mlrun using the instructions at https://github.com/mlrun/mlrun/blob/master/hack/local/README.md\n", - " from mlrun.platforms import mount_pvc\n", - " fn.apply(mount_pvc('nfsvol', 'nfsvol', '/home/joyan/data'))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import NewTask, run_local\n", - "\n", - "task_params = {\n", - " \"name\": \"tasks load dask cluster with data\",\n", - " 
\"params\" : {\n", - " \"persist\" : True,\n", - " \"refresh_data\" : True,\n", - " \"dask_key\" : \"dask_key\"}}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run = fn.run(NewTask(**task_params), \n", - " handler=load_dask, \n", - " inputs={\"src_data\" : os.path.join(mlconf.artifact_path, 'iris.parquet') },\n", - " artifact_path=mlconf.artifact_path)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "func.status.to_dict()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import dask\n", - "import dask.dataframe as dd\n", - "from dask.distributed import Client, LocalCluster" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### TODO: this client dash board wont work -- wrong port!\n", - "\n", - "...even though its the correct client" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "client = Client(func.status.to_dict()['scheduler_address'])\n", - "client" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list(client.list_datasets())" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "client.datasets['dask_key']" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git 
a/load_dask/load_dask.py b/load_dask/load_dask.py deleted file mode 100644 index 76c53c216..000000000 --- a/load_dask/load_dask.py +++ /dev/null @@ -1,68 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -from mlrun.execution import MLClientCtx -from mlrun.datastore import DataItem - -from typing import List, Optional - - -def load_dask( - context: MLClientCtx, - src_data: DataItem, - dask_key: str = "dask_key", - inc_cols: Optional[List[str]] = None, - index_cols: Optional[List[str]] = None, - dask_persist: bool = True, - refresh_data: bool = True, - scheduler_key: str = "scheduler" -) -> None: - """Load dataset into an existing dask cluster - - dask jobs define the dask client parameters at the job level, this method will raise an error if no client is detected. - - :param context: the function context - :param src_data: url of the data file or partitioned dataset as either - artifact DataItem, string, or path object (similar to - pandas read_csv) - :param dask_key: destination key of data on dask cluster and artifact store - :param inc_cols: include only these columns (very fast) - :param index_cols: list of index column names (can be a long-running process) - :param dask_persist: (True) should the data be persisted (through the `client.persist` op) - :param refresh_data: (False) if the dask_key already exists in the dask cluster, this will - raise an Exception. Set to True to replace the existing cluster data. 
- :param scheduler_key: (scheduler) the dask scheduler configuration, json also logged as an artifact - """ - if hasattr(context, "dask_client"): - dask_client = context.dask_client - else: - raise Exception("a dask client was not found in the execution context") - - df = src_data.as_df(df_module=dd) - - if dask_persist: - df = dask_client.persist(df) - if dask_client.datasets and dask_key in dask_client.datasets: - dask_client.unpublish_dataset(dask_key) - dask_client.publish_dataset(df, name=dask_key) - - if context: - context.dask_client = dask_client - - # share the scheduler, whether data is persisted or not - dask_client.write_scheduler_file(scheduler_key + ".json") - - # we don't use log_dataset here until it can take into account - # dask origin and apply dask describe. - context.log_artifact(scheduler_key, local_path=scheduler_key + ".json") \ No newline at end of file diff --git a/model_monitoring_stream/function.yaml b/model_monitoring_stream/function.yaml deleted file mode 100644 index aa285638b..000000000 --- a/model_monitoring_stream/function.yaml +++ /dev/null @@ -1,267 +0,0 @@ -kind: remote -metadata: - name: model-monitoring-stream - tag: '' - hash: 33f4d6de0858b3dfc9d150fc82fbed6feb05534c - project: '' - categories: - - monitoring -spec: - command: '' - args: [] - image: livsmichael/mlrun-api:automation - entry_points: - consume: - name: consume - doc: '' - parameters: - - name: self - default: '' - - name: event - type: Dict - default: '' - outputs: - - default: '' - lineno: 293 - compute_predictions_per_second: - name: compute_predictions_per_second - doc: '' - parameters: - - name: event - type: dict - default: '' - outputs: - - default: '' - lineno: 311 - process_before_kv: - name: process_before_kv - doc: '' - parameters: - - name: self - default: '' - - name: event - type: dict - default: '' - outputs: - - default: '' - lineno: 316 - process_before_events_tsdb: - name: process_before_events_tsdb - doc: '' - parameters: - - name: event - 
type: Dict - default: '' - outputs: - - default: '' - lineno: 325 - process_before_parquet: - name: process_before_parquet - doc: '' - parameters: - - name: event - type: dict - default: '' - outputs: - - default: '' - lineno: 362 - set_none_if_empty: - name: set_none_if_empty - doc: '' - parameters: - - name: _event - type: dict - default: '' - - name: keys - type: List[str] - default: '' - outputs: - - default: '' - lineno: 364 - drop_if_exists: - name: drop_if_exists - doc: '' - parameters: - - name: _event - type: dict - default: '' - - name: keys - type: List[str] - default: '' - outputs: - - default: '' - lineno: 369 - unpack_if_exists: - name: unpack_if_exists - doc: '' - parameters: - - name: _event - type: dict - default: '' - - name: keys - type: List[str] - default: '' - outputs: - - default: '' - lineno: 373 - do: - name: do - doc: '' - parameters: - - name: self - default: '' - - name: event - type: Dict - default: '' - outputs: - - default: '' - lineno: 702 - resume_state: - name: resume_state - doc: '' - parameters: - - name: self - default: '' - - name: endpoint_id - default: '' - outputs: - - default: '' - lineno: 475 - is_valid: - name: is_valid - doc: '' - parameters: - - name: self - default: '' - - name: endpoint_id - type: str - default: '' - - name: validation_function - default: '' - - name: field - type: Any - default: '' - - name: dict_path - type: List[str] - default: '' - outputs: - - default: '' - lineno: 495 - handle_errors: - name: handle_errors - doc: '' - parameters: - - name: self - default: '' - - name: endpoint_id - default: '' - - name: event - default: '' - outputs: - - default: '' - type: bool - lineno: 503 - enrich_even_details: - name: enrich_even_details - doc: '' - parameters: - - name: event - default: '' - outputs: - - default: '' - lineno: 511 - is_not_none: - name: is_not_none - doc: '' - parameters: - - name: field - type: Any - default: '' - - name: dict_path - type: List[str] - default: '' - outputs: - - default: '' 
- lineno: 536 - is_list_of_numerics: - name: is_list_of_numerics - doc: '' - parameters: - - name: field - type: List[Union[int, float, dict, list]] - default: '' - - name: dict_path - type: List[str] - default: '' - outputs: - - default: '' - lineno: 545 - get_endpoint_record: - name: get_endpoint_record - doc: '' - parameters: - - name: kv_container - type: str - default: '' - - name: kv_path - type: str - default: '' - - name: endpoint_id - type: str - default: '' - - name: access_key - type: str - default: '' - outputs: - - default: '' - lineno: 717 - init_context: - name: init_context - doc: '' - parameters: - - name: context - type: MLClientCtx - default: '' - outputs: - - default: '' - lineno: 743 - handler: - name: handler - doc: '' - parameters: - - name: context - type: MLClientCtx - default: '' - - name: event - type: Event - default: '' - outputs: - - default: '' - lineno: 751 - description: '' - min_replicas: 1 - max_replicas: 4 - env: [] - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - name: model-monitoring-stream - labels: {} - annotations: - nuclio.io/generated_by: function generated from /home/michaell/projects/functions/model_monitoring_stream/model_monitoring_stream.py - spec: - runtime: python:3.9 - handler: model_monitoring_stream:handler - env: [] - volumes: [] - build: - commands: [] - noBaseImagesPull: true - functionSourceCode: 
aW1wb3J0IGpzb24KaW1wb3J0IG9zCmZyb20gY29sbGVjdGlvbnMgaW1wb3J0IGRlZmF1bHRkaWN0CmZyb20gZGF0ZXRpbWUgaW1wb3J0IGRhdGV0aW1lCmZyb20gb3MgaW1wb3J0IGVudmlyb24KZnJvbSB0eXBpbmcgaW1wb3J0IERpY3QsIExpc3QsIFNldCwgT3B0aW9uYWwsIEFueSwgVW5pb24KCmltcG9ydCBwYW5kYXMgYXMgcGQKaW1wb3J0IHYzaW8KZnJvbSBtbHJ1bi5jb25maWcgaW1wb3J0IGNvbmZpZwpmcm9tIG1scnVuLnJ1biBpbXBvcnQgTUxDbGllbnRDdHgKZnJvbSBtbHJ1bi51dGlscyBpbXBvcnQgbG9nZ2VyCmZyb20gbWxydW4udXRpbHMubW9kZWxfbW9uaXRvcmluZyBpbXBvcnQgKAogICAgcGFyc2VfbW9kZWxfZW5kcG9pbnRfc3RvcmVfcHJlZml4LAogICAgY3JlYXRlX21vZGVsX2VuZHBvaW50X2lkLAopCmZyb20gbWxydW4udXRpbHMudjNpb19jbGllbnRzIGltcG9ydCBnZXRfdjNpb19jbGllbnQsIGdldF9mcmFtZXNfY2xpZW50CmZyb20gbnVjbGlvIGltcG9ydCBFdmVudApmcm9tIHN0b3JleSBpbXBvcnQgKAogICAgRmllbGRBZ2dyZWdhdG9yLAogICAgTm9vcERyaXZlciwKICAgIFRhYmxlLAogICAgTWFwLAogICAgTWFwQ2xhc3MsCiAgICBBZ2dyZWdhdGVCeUtleSwKICAgIGJ1aWxkX2Zsb3csCiAgICBGaWx0ZXIsCiAgICBGbGF0TWFwLAogICAgVFNEQlRhcmdldCwKICAgIFBhcnF1ZXRUYXJnZXQsCiAgICBTeW5jRW1pdFNvdXJjZSwKKQpmcm9tIHN0b3JleS5kdHlwZXMgaW1wb3J0IFNsaWRpbmdXaW5kb3dzCmZyb20gc3RvcmV5LnN0ZXBzIGltcG9ydCBTYW1wbGVXaW5kb3cKIyBDb25zdGFudHMKZnJvbSB2M2lvLmRhdGFwbGFuZSBpbXBvcnQgUmFpc2VGb3JTdGF0dXMKCklTT184MDYxX1VUQyA9ICIlWS0lbS0lZCAlSDolTTolUy4lZiV6IgpGVU5DVElPTl9VUkkgPSAiZnVuY3Rpb25fdXJpIgpNT0RFTCA9ICJtb2RlbCIKVkVSU0lPTiA9ICJ2ZXJzaW9uIgpWRVJTSU9ORURfTU9ERUwgPSAidmVyc2lvbmVkX21vZGVsIgpNT0RFTF9DTEFTUyA9ICJtb2RlbF9jbGFzcyIKVElNRVNUQU1QID0gInRpbWVzdGFtcCIKRU5EUE9JTlRfSUQgPSAiZW5kcG9pbnRfaWQiClJFUVVFU1RfSUQgPSAicmVxdWVzdF9pZCIKTEFCRUxTID0gImxhYmVscyIKVU5QQUNLRURfTEFCRUxTID0gInVucGFja2VkX2xhYmVscyIKTEFURU5DWV9BVkdfNU0gPSAibGF0ZW5jeV9hdmdfNW0iCkxBVEVOQ1lfQVZHXzFIID0gImxhdGVuY3lfYXZnXzFoIgpQUkVESUNUSU9OU19QRVJfU0VDT05EID0gInByZWRpY3Rpb25zX3Blcl9zZWNvbmQiClBSRURJQ1RJT05TX0NPVU5UXzVNID0gInByZWRpY3Rpb25zX2NvdW50XzVtIgpQUkVESUNUSU9OU19DT1VOVF8xSCA9ICJwcmVkaWN0aW9uc19jb3VudF8xaCIKRklSU1RfUkVRVUVTVCA9ICJmaXJzdF9yZXF1ZXN0IgpMQVNUX1JFUVVFU1QgPSAibGFzdF9yZXF1ZXN0IgpFUlJPUl9DT1VOVCA9ICJlcnJvcl9jb3VudCIKRU5USVRJRVMgPSAiZW50aXRpZXMiCkZFQVRVUkVfTkFNRVMgPSAiZmVhdHVy
ZV9uYW1lcyIKTEFCRUxfQ09MVU1OUyA9ICJsYWJlbF9jb2x1bW5zIgpMQVRFTkNZID0gImxhdGVuY3kiClJFQ09SRF9UWVBFID0gInJlY29yZF90eXBlIgpGRUFUVVJFUyA9ICJmZWF0dXJlcyIKUFJFRElDVElPTiA9ICJwcmVkaWN0aW9uIgpQUkVESUNUSU9OUyA9ICJwcmVkaWN0aW9ucyIKTkFNRURfRkVBVFVSRVMgPSAibmFtZWRfZmVhdHVyZXMiCk5BTUVEX1BSRURJQ1RJT05TID0gIm5hbWVkX3ByZWRpY3Rpb25zIgpCQVNFX01FVFJJQ1MgPSAiYmFzZV9tZXRyaWNzIgpDVVNUT01fTUVUUklDUyA9ICJjdXN0b21fbWV0cmljcyIKRU5EUE9JTlRfRkVBVFVSRVMgPSAiZW5kcG9pbnRfZmVhdHVyZXMiCk1FVFJJQ1MgPSAibWV0cmljcyIKQkFUQ0hfVElNRVNUQU1QID0gImJhdGNoX3RpbWVzdGFtcCIKVElNRV9GT1JNQVQ6IHN0ciA9ICIlWS0lbS0lZCAlSDolTTolUy4lZiIgICMgSVNPIDgwNjEKCgojIFN0cmVhbSBwcm9jZXNzaW5nIGNvZGUKY2xhc3MgRXZlbnRTdHJlYW1Qcm9jZXNzb3I6CiAgICBkZWYgX19pbml0X18oCiAgICAgICAgc2VsZiwKICAgICAgICBwcm9qZWN0OiBzdHIsCiAgICAgICAgc2FtcGxlX3dpbmRvdzogaW50ID0gMTAsCiAgICAgICAgdHNkYl9iYXRjaGluZ19tYXhfZXZlbnRzOiBpbnQgPSAxMCwKICAgICAgICB0c2RiX2JhdGNoaW5nX3RpbWVvdXRfc2VjczogaW50ID0gNjAgKiA1LCAgIyBEZWZhdWx0IDUgbWludXRlcwogICAgICAgIHBhcnF1ZXRfYmF0Y2hpbmdfbWF4X2V2ZW50czogaW50ID0gMTBfMDAwLAogICAgICAgIHBhcnF1ZXRfYmF0Y2hpbmdfdGltZW91dF9zZWNzOiBpbnQgPSA2MCAqIDYwLCAgIyBEZWZhdWx0IDEgaG91cgogICAgICAgIGFnZ3JlZ2F0ZV9jb3VudF93aW5kb3dzOiBPcHRpb25hbFtMaXN0W3N0cl1dID0gTm9uZSwKICAgICAgICBhZ2dyZWdhdGVfY291bnRfcGVyaW9kOiBzdHIgPSAiMzBzIiwKICAgICAgICBhZ2dyZWdhdGVfYXZnX3dpbmRvd3M6IE9wdGlvbmFsW0xpc3Rbc3RyXV0gPSBOb25lLAogICAgICAgIGFnZ3JlZ2F0ZV9hdmdfcGVyaW9kOiBzdHIgPSAiMzBzIiwKICAgICAgICB2M2lvX2FjY2Vzc19rZXk6IE9wdGlvbmFsW3N0cl0gPSBOb25lLAogICAgICAgIHYzaW9fZnJhbWVzZDogT3B0aW9uYWxbc3RyXSA9IE5vbmUsCiAgICAgICAgdjNpb19hcGk6IE9wdGlvbmFsW3N0cl0gPSBOb25lLAogICAgKToKICAgICAgICBzZWxmLnByb2plY3QgPSBwcm9qZWN0CiAgICAgICAgc2VsZi5zYW1wbGVfd2luZG93ID0gc2FtcGxlX3dpbmRvdwogICAgICAgIHNlbGYudHNkYl9iYXRjaGluZ19tYXhfZXZlbnRzID0gdHNkYl9iYXRjaGluZ19tYXhfZXZlbnRzCiAgICAgICAgc2VsZi50c2RiX2JhdGNoaW5nX3RpbWVvdXRfc2VjcyA9IHRzZGJfYmF0Y2hpbmdfdGltZW91dF9zZWNzCiAgICAgICAgc2VsZi5wYXJxdWV0X2JhdGNoaW5nX21heF9ldmVudHMgPSBwYXJxdWV0X2JhdGNoaW5nX21heF9ldmVudHMKICAgICAgICBzZWxmLnBhcnF1ZXRfYmF0Y2hpbmdfdGltZW91dF9zZWNz
ID0gcGFycXVldF9iYXRjaGluZ190aW1lb3V0X3NlY3MKICAgICAgICBzZWxmLmFnZ3JlZ2F0ZV9jb3VudF93aW5kb3dzID0gYWdncmVnYXRlX2NvdW50X3dpbmRvd3Mgb3IgWyI1bSIsICIxaCJdCiAgICAgICAgc2VsZi5hZ2dyZWdhdGVfY291bnRfcGVyaW9kID0gYWdncmVnYXRlX2NvdW50X3BlcmlvZAogICAgICAgIHNlbGYuYWdncmVnYXRlX2F2Z193aW5kb3dzID0gYWdncmVnYXRlX2F2Z193aW5kb3dzIG9yIFsiNW0iLCAiMWgiXQogICAgICAgIHNlbGYuYWdncmVnYXRlX2F2Z19wZXJpb2QgPSBhZ2dyZWdhdGVfYXZnX3BlcmlvZAoKICAgICAgICBzZWxmLnYzaW9fZnJhbWVzZCA9IHYzaW9fZnJhbWVzZCBvciBjb25maWcudjNpb19mcmFtZXNkCiAgICAgICAgc2VsZi52M2lvX2FwaSA9IHYzaW9fYXBpIG9yIGNvbmZpZy52M2lvX2FwaQoKICAgICAgICBzZWxmLnYzaW9fYWNjZXNzX2tleSA9IHYzaW9fYWNjZXNzX2tleSBvciBlbnZpcm9uLmdldCgiVjNJT19BQ0NFU1NfS0VZIikKICAgICAgICBzZWxmLm1vZGVsX21vbml0b3JpbmdfYWNjZXNzX2tleSA9ICgKICAgICAgICAgICAgb3MuZW52aXJvbi5nZXQoIk1PREVMX01PTklUT1JJTkdfQUNDRVNTX0tFWSIpIG9yIHNlbGYudjNpb19hY2Nlc3Nfa2V5CiAgICAgICAgKQoKICAgICAgICB0ZW1wbGF0ZSA9IGNvbmZpZy5tb2RlbF9lbmRwb2ludF9tb25pdG9yaW5nLnN0b3JlX3ByZWZpeGVzLmRlZmF1bHQKCiAgICAgICAga3ZfcGF0aCA9IHRlbXBsYXRlLmZvcm1hdChwcm9qZWN0PXByb2plY3QsIGtpbmQ9ImVuZHBvaW50cyIpCiAgICAgICAgXywgc2VsZi5rdl9jb250YWluZXIsIHNlbGYua3ZfcGF0aCA9IHBhcnNlX21vZGVsX2VuZHBvaW50X3N0b3JlX3ByZWZpeChrdl9wYXRoKQoKICAgICAgICB0c2RiX3BhdGggPSB0ZW1wbGF0ZS5mb3JtYXQocHJvamVjdD1wcm9qZWN0LCBraW5kPSJldmVudHMiKQogICAgICAgIF8sIHNlbGYudHNkYl9jb250YWluZXIsIHNlbGYudHNkYl9wYXRoID0gcGFyc2VfbW9kZWxfZW5kcG9pbnRfc3RvcmVfcHJlZml4KAogICAgICAgICAgICB0c2RiX3BhdGgKICAgICAgICApCiAgICAgICAgc2VsZi50c2RiX3BhdGggPSBmIntzZWxmLnRzZGJfY29udGFpbmVyfS97c2VsZi50c2RiX3BhdGh9IgoKICAgICAgICBzZWxmLnBhcnF1ZXRfcGF0aCA9IGNvbmZpZy5tb2RlbF9lbmRwb2ludF9tb25pdG9yaW5nLnN0b3JlX3ByZWZpeGVzLnVzZXJfc3BhY2UuZm9ybWF0KAogICAgICAgICAgICBwcm9qZWN0PXByb2plY3QsIGtpbmQ9InBhcnF1ZXQiCiAgICAgICAgKQoKICAgICAgICBsb2dnZXIuaW5mbygKICAgICAgICAgICAgIlYzSU8gQ29uZmlndXJhdGlvbiIsCiAgICAgICAgICAgIHYzaW9fYWNjZXNzX2tleT1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgbW9kZWxfbW9uaXRvcmluZ19hY2Nlc3Nfa2V5PXNlbGYubW9kZWxfbW9uaXRvcmluZ19hY2Nlc3Nfa2V5LAogICAgICAgICAgICBkZWZhdWx0X3N0b3JlX3ByZWZpeD1jb25maWcubW9k
ZWxfZW5kcG9pbnRfbW9uaXRvcmluZy5zdG9yZV9wcmVmaXhlcy5kZWZhdWx0LAogICAgICAgICAgICB1c2VyX3NwYWNlX3N0b3JlX3ByZWZpeD1jb25maWcubW9kZWxfZW5kcG9pbnRfbW9uaXRvcmluZy5zdG9yZV9wcmVmaXhlcy51c2VyX3NwYWNlLAogICAgICAgICAgICB2M2lvX2FwaT1zZWxmLnYzaW9fYXBpLAogICAgICAgICAgICB2M2lvX2ZyYW1lc2Q9c2VsZi52M2lvX2ZyYW1lc2QsCiAgICAgICAgICAgIGt2X2NvbnRhaW5lcj1zZWxmLmt2X2NvbnRhaW5lciwKICAgICAgICAgICAga3ZfcGF0aD1zZWxmLmt2X3BhdGgsCiAgICAgICAgICAgIHRzZGJfY29udGFpbmVyPXNlbGYudHNkYl9jb250YWluZXIsCiAgICAgICAgICAgIHRzZGJfcGF0aD1zZWxmLnRzZGJfcGF0aCwKICAgICAgICAgICAgcGFycXVldF9wYXRoPXNlbGYucGFycXVldF9wYXRoLAogICAgICAgICkKCiAgICAgICAgc2VsZi5fa3Zfa2V5cyA9IFsKICAgICAgICAgICAgRlVOQ1RJT05fVVJJLAogICAgICAgICAgICBNT0RFTCwKICAgICAgICAgICAgTU9ERUxfQ0xBU1MsCiAgICAgICAgICAgIFRJTUVTVEFNUCwKICAgICAgICAgICAgRU5EUE9JTlRfSUQsCiAgICAgICAgICAgIExBQkVMUywKICAgICAgICAgICAgVU5QQUNLRURfTEFCRUxTLAogICAgICAgICAgICBMQVRFTkNZX0FWR181TSwKICAgICAgICAgICAgTEFURU5DWV9BVkdfMUgsCiAgICAgICAgICAgIFBSRURJQ1RJT05TX1BFUl9TRUNPTkQsCiAgICAgICAgICAgIFBSRURJQ1RJT05TX0NPVU5UXzVNLAogICAgICAgICAgICBQUkVESUNUSU9OU19DT1VOVF8xSCwKICAgICAgICAgICAgRklSU1RfUkVRVUVTVCwKICAgICAgICAgICAgTEFTVF9SRVFVRVNULAogICAgICAgICAgICBFUlJPUl9DT1VOVCwKICAgICAgICBdCgogICAgICAgIHNlbGYuX2Zsb3cgPSBidWlsZF9mbG93KAogICAgICAgICAgICBbCiAgICAgICAgICAgICAgICBTeW5jRW1pdFNvdXJjZSgpLAogICAgICAgICAgICAgICAgUHJvY2Vzc0VuZHBvaW50RXZlbnQoCiAgICAgICAgICAgICAgICAgICAga3ZfY29udGFpbmVyPXNlbGYua3ZfY29udGFpbmVyLAogICAgICAgICAgICAgICAgICAgIGt2X3BhdGg9c2VsZi5rdl9wYXRoLAogICAgICAgICAgICAgICAgICAgIHYzaW9fYWNjZXNzX2tleT1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgICAgICksCiAgICAgICAgICAgICAgICBGaWx0ZXJOb3ROb25lKCksCiAgICAgICAgICAgICAgICBGbGF0TWFwKGxhbWJkYSB4OiB4KSwKICAgICAgICAgICAgICAgIE1hcEZlYXR1cmVOYW1lcygKICAgICAgICAgICAgICAgICAgICBrdl9jb250YWluZXI9c2VsZi5rdl9jb250YWluZXIsCiAgICAgICAgICAgICAgICAgICAga3ZfcGF0aD1zZWxmLmt2X3BhdGgsCiAgICAgICAgICAgICAgICAgICAgYWNjZXNzX2tleT1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgICAgICksCiAgICAgICAgICAgICAgICAjIEJyYW5jaCAxOiBBZ2dyZWdhdGUgZXZlbnRzLCBjb3VudCBhdmVy
YWdlcyBhbmQgdXBkYXRlIFRTREIgYW5kIEtWCiAgICAgICAgICAgICAgICBbCiAgICAgICAgICAgICAgICAgICAgQWdncmVnYXRlQnlLZXkoCiAgICAgICAgICAgICAgICAgICAgICAgIGFnZ3JlZ2F0ZXM9WwogICAgICAgICAgICAgICAgICAgICAgICAgICAgRmllbGRBZ2dyZWdhdG9yKAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIFBSRURJQ1RJT05TLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIEVORFBPSU5UX0lELAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIFsiY291bnQiXSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBTbGlkaW5nV2luZG93cygKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgc2VsZi5hZ2dyZWdhdGVfY291bnRfd2luZG93cywKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgc2VsZi5hZ2dyZWdhdGVfY291bnRfcGVyaW9kLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICApLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgRmllbGRBZ2dyZWdhdG9yKAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIExBVEVOQ1ksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgTEFURU5DWSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBbImF2ZyJdLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIFNsaWRpbmdXaW5kb3dzKAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBzZWxmLmFnZ3JlZ2F0ZV9hdmdfd2luZG93cywKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgc2VsZi5hZ2dyZWdhdGVfYXZnX3BlcmlvZCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICApLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgKSwKICAgICAgICAgICAgICAgICAgICAgICAgXSwKICAgICAgICAgICAgICAgICAgICAgICAgdGFibGU9VGFibGUoIm5vdGFibGUiLCBOb29wRHJpdmVyKCkpLAogICAgICAgICAgICAgICAgICAgICksCiAgICAgICAgICAgICAgICAgICAgU2FtcGxlV2luZG93KAogICAgICAgICAgICAgICAgICAgICAgICBzZWxmLnNhbXBsZV93aW5kb3cKICAgICAgICAgICAgICAgICAgICApLCAgIyBBZGQgcmVxdWlyZWQgZ2FwIGJldHdlZW4gZXZlbnQgdG8gYXBwbHkgc2FtcGxpbmcKICAgICAgICAgICAgICAgICAgICBNYXAoc2VsZi5jb21wdXRlX3ByZWRpY3Rpb25zX3Blcl9zZWNvbmQpLAogICAgICAgICAgICAgICAgICAgICMgQnJhbmNoIDEuMTogVXBkYXRlZCBLVgogICAgICAgICAgICAgICAgICAgIFsKICAgICAgICAgICAgICAgICAgICAgICAgTWFwKHNlbGYucHJvY2Vzc19iZWZvcmVfa3YpLAogICAgICAgICAgICAgICAgICAgICAgICBXcml0ZVRvS1YoY29udGFpbmVyPXNlbGYua3ZfY29udGFpbmVyLCB0YWJsZT1zZWxmLmt2X3BhdGgpLAogICAgICAgICAgICAgICAgICAg
ICAgICBJbmZlclNjaGVtYSgKICAgICAgICAgICAgICAgICAgICAgICAgICAgIHYzaW9fYWNjZXNzX2tleT1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgIHYzaW9fZnJhbWVzZD1zZWxmLnYzaW9fZnJhbWVzZCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgIGNvbnRhaW5lcj1zZWxmLmt2X2NvbnRhaW5lciwKICAgICAgICAgICAgICAgICAgICAgICAgICAgIHRhYmxlPXNlbGYua3ZfcGF0aCwKICAgICAgICAgICAgICAgICAgICAgICAgKSwKICAgICAgICAgICAgICAgICAgICBdLAogICAgICAgICAgICAgICAgICAgICMgQnJhbmNoIDEuMjogVXBkYXRlIFRTREIKICAgICAgICAgICAgICAgICAgICBbCiAgICAgICAgICAgICAgICAgICAgICAgICMgTWFwIHRoZSBldmVudCBpbnRvIHRhZ2dhYmxlIGZpZWxkcywgYWRkIHJlY29yZCB0eXBlIHRvIGVhY2ggZmllbGQKICAgICAgICAgICAgICAgICAgICAgICAgTWFwKHNlbGYucHJvY2Vzc19iZWZvcmVfZXZlbnRzX3RzZGIpLAogICAgICAgICAgICAgICAgICAgICAgICBbCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBGaWx0ZXJLZXlzKEJBU0VfTUVUUklDUyksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBVbnBhY2tWYWx1ZXMoQkFTRV9NRVRSSUNTKSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgIFRTREJUYXJnZXQoCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgcGF0aD1zZWxmLnRzZGJfcGF0aCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICByYXRlPSIxMC9tIiwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICB0aW1lX2NvbD1USU1FU1RBTVAsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgY29udGFpbmVyPXNlbGYudHNkYl9jb250YWluZXIsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgYWNjZXNzX2tleT1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICB2M2lvX2ZyYW1lcz1zZWxmLnYzaW9fZnJhbWVzZCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBpbmRleF9jb2xzPVtFTkRQT0lOVF9JRCwgUkVDT1JEX1RZUEVdLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICMgU2V0dGluZ3MgZm9yIF9CYXRjaGluZwogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIG1heF9ldmVudHM9c2VsZi50c2RiX2JhdGNoaW5nX21heF9ldmVudHMsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgdGltZW91dF9zZWNzPXNlbGYudHNkYl9iYXRjaGluZ190aW1lb3V0X3NlY3MsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAga2V5PUVORFBPSU5UX0lELAogICAgICAgICAgICAgICAgICAgICAgICAgICAgKSwKICAgICAgICAgICAgICAgICAgICAgICAgXSwKICAgICAgICAgICAgICAgICAgICAgICAgWwogICAgICAgICAgICAgICAgICAgICAgICAgICAgRmlsdGVyS2V5cyhFTkRQT0lOVF9G
RUFUVVJFUyksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBVbnBhY2tWYWx1ZXMoRU5EUE9JTlRfRkVBVFVSRVMpLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgVFNEQlRhcmdldCgKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBwYXRoPXNlbGYudHNkYl9wYXRoLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHJhdGU9IjEwL20iLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHRpbWVfY29sPVRJTUVTVEFNUCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBjb250YWluZXI9c2VsZi50c2RiX2NvbnRhaW5lciwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBhY2Nlc3Nfa2V5PXNlbGYudjNpb19hY2Nlc3Nfa2V5LAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHYzaW9fZnJhbWVzPXNlbGYudjNpb19mcmFtZXNkLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGluZGV4X2NvbHM9W0VORFBPSU5UX0lELCBSRUNPUkRfVFlQRV0sCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIyBTZXR0aW5ncyBmb3IgX0JhdGNoaW5nCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgbWF4X2V2ZW50cz1zZWxmLnRzZGJfYmF0Y2hpbmdfbWF4X2V2ZW50cywKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICB0aW1lb3V0X3NlY3M9c2VsZi50c2RiX2JhdGNoaW5nX3RpbWVvdXRfc2VjcywKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBrZXk9RU5EUE9JTlRfSUQsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICApLAogICAgICAgICAgICAgICAgICAgICAgICBdLAogICAgICAgICAgICAgICAgICAgICAgICBbCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBGaWx0ZXJLZXlzKENVU1RPTV9NRVRSSUNTKSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgIEZpbHRlck5vdE5vbmUoKSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgIFVucGFja1ZhbHVlcyhDVVNUT01fTUVUUklDUyksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICBUU0RCVGFyZ2V0KAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHBhdGg9c2VsZi50c2RiX3BhdGgsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgcmF0ZT0iMTAvbSIsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgdGltZV9jb2w9VElNRVNUQU1QLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGNvbnRhaW5lcj1zZWxmLnRzZGJfY29udGFpbmVyLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGFjY2Vzc19rZXk9c2VsZi52M2lvX2FjY2Vzc19rZXksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgdjNpb19mcmFtZXM9c2VsZi52M2lvX2ZyYW1lc2QsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgaW5kZXhfY29scz1bRU5EUE9JTlRfSUQsIFJFQ09SRF9UWVBFXSwKICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAjIFNldHRpbmdzIGZvciBfQmF0Y2hpbmcKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBtYXhfZXZlbnRzPXNlbGYudHNkYl9iYXRjaGluZ19tYXhfZXZlbnRzLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHRpbWVvdXRfc2Vjcz1zZWxmLnRzZGJfYmF0Y2hpbmdfdGltZW91dF9zZWNzLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGtleT1FTkRQT0lOVF9JRCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICksCiAgICAgICAgICAgICAgICAgICAgICAgIF0sCiAgICAgICAgICAgICAgICAgICAgXSwKICAgICAgICAgICAgICAgIF0sCiAgICAgICAgICAgICAgICAjIEJyYW5jaCAyOiBCYXRjaCBldmVudHMsIHdyaXRlIHRvIHBhcnF1ZXQKICAgICAgICAgICAgICAgIFsKICAgICAgICAgICAgICAgICAgICBNYXAoc2VsZi5wcm9jZXNzX2JlZm9yZV9wYXJxdWV0KSwKICAgICAgICAgICAgICAgICAgICBQYXJxdWV0VGFyZ2V0KAogICAgICAgICAgICAgICAgICAgICAgICBwYXRoPXNlbGYucGFycXVldF9wYXRoLAogICAgICAgICAgICAgICAgICAgICAgICBwYXJ0aXRpb25fY29scz1bIiRrZXkiLCAiJHllYXIiLCAiJG1vbnRoIiwgIiRkYXkiLCAiJGhvdXIiXSwKICAgICAgICAgICAgICAgICAgICAgICAgaW5mZXJfY29sdW1uc19mcm9tX2RhdGE9VHJ1ZSwKICAgICAgICAgICAgICAgICAgICAgICAgIyBTZXR0aW5ncyBmb3IgX0JhdGNoaW5nCiAgICAgICAgICAgICAgICAgICAgICAgIG1heF9ldmVudHM9c2VsZi5wYXJxdWV0X2JhdGNoaW5nX21heF9ldmVudHMsCiAgICAgICAgICAgICAgICAgICAgICAgIHRpbWVvdXRfc2Vjcz1zZWxmLnBhcnF1ZXRfYmF0Y2hpbmdfdGltZW91dF9zZWNzLAogICAgICAgICAgICAgICAgICAgICAgICAjIFNldHRpbmdzIGZvciB2M2lvIHN0b3JhZ2UKICAgICAgICAgICAgICAgICAgICAgICAgc3RvcmFnZV9vcHRpb25zPXsKICAgICAgICAgICAgICAgICAgICAgICAgICAgICJ2M2lvX2FwaSI6IHNlbGYudjNpb19hcGksCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAidjNpb19hY2Nlc3Nfa2V5Ijogc2VsZi5tb2RlbF9tb25pdG9yaW5nX2FjY2Vzc19rZXksCiAgICAgICAgICAgICAgICAgICAgICAgIH0sCiAgICAgICAgICAgICAgICAgICAgKSwKICAgICAgICAgICAgICAgIF0sCiAgICAgICAgICAgIF0KICAgICAgICApLnJ1bigpCgogICAgZGVmIGNvbnN1bWUoc2VsZiwgZXZlbnQ6IERpY3QpOgogICAgICAgIGV2ZW50cyA9IFtdCiAgICAgICAgaWYgImhlYWRlcnMiIGluIGV2ZW50IGFuZCAidmFsdWVzIiBpbiBldmVudDoKICAgICAgICAgICAgZm9yIHZhbHVlcyBpbiBldmVudFsidmFsdWVzIl06CiAgICAgICAgICAgICAgICBldmVudHMuYXBwZW5kKHtrOiB2IGZvciBrLCB2IGluIHppcChldmVudFsiaGVhZGVycyJdLCB2YWx1ZXMpfSkKICAgICAgICBlbHNlOgogICAgICAgICAgICBldmVudHMuYXBwZW5kKGV2ZW50KQoKICAgICAgICBmb3Ig
ZW5yaWNoZWQgaW4gbWFwKGVucmljaF9ldmVuX2RldGFpbHMsIGV2ZW50cyk6CiAgICAgICAgICAgIGlmIGVucmljaGVkIGlzIG5vdCBOb25lOgogICAgICAgICAgICAgICAgc2VsZi5fZmxvdy5lbWl0KAogICAgICAgICAgICAgICAgICAgIGVucmljaGVkLAogICAgICAgICAgICAgICAgICAgIGtleT1lbnJpY2hlZFtFTkRQT0lOVF9JRF0sCiAgICAgICAgICAgICAgICAgICAgZXZlbnRfdGltZT1kYXRldGltZS5zdHJwdGltZShlbnJpY2hlZFsid2hlbiJdLCBJU09fODA2MV9VVEMpLAogICAgICAgICAgICAgICAgKQogICAgICAgICAgICBlbHNlOgogICAgICAgICAgICAgICAgcGFzcwoKICAgIEBzdGF0aWNtZXRob2QKICAgIGRlZiBjb21wdXRlX3ByZWRpY3Rpb25zX3Blcl9zZWNvbmQoZXZlbnQ6IGRpY3QpOgogICAgICAgIGV2ZW50W1BSRURJQ1RJT05TX1BFUl9TRUNPTkRdID0gZmxvYXQoZXZlbnRbUFJFRElDVElPTlNfQ09VTlRfNU1dKSAvIDYwMAogICAgICAgIHJldHVybiBldmVudAoKICAgIGRlZiBwcm9jZXNzX2JlZm9yZV9rdihzZWxmLCBldmVudDogZGljdCk6CiAgICAgICAgIyBGaWx0ZXIgcmVsZXZhbnQga2V5cwogICAgICAgIGUgPSB7azogZXZlbnRba10gZm9yIGsgaW4gc2VsZi5fa3Zfa2V5c30KICAgICAgICAjIFVucGFjayBsYWJlbHMgZGljdGlvbmFyeQogICAgICAgIGUgPSB7KiplLCAqKmUucG9wKFVOUEFDS0VEX0xBQkVMUywge30pfQogICAgICAgICMgV3JpdGUgbGFiZWxzIHRvIGt2IGFzIGpzb24gc3RyaW5nIHRvIGJlIHByZXNlbnRhYmxlIGxhdGVyCiAgICAgICAgZVtMQUJFTFNdID0ganNvbi5kdW1wcyhlW0xBQkVMU10pCiAgICAgICAgcmV0dXJuIGUKCiAgICBAc3RhdGljbWV0aG9kCiAgICBkZWYgcHJvY2Vzc19iZWZvcmVfZXZlbnRzX3RzZGIoZXZlbnQ6IERpY3QpOgogICAgICAgIGJhc2VfZmllbGRzID0gW1RJTUVTVEFNUCwgRU5EUE9JTlRfSURdCgogICAgICAgIGJhc2VfZXZlbnQgPSB7azogZXZlbnRba10gZm9yIGsgaW4gYmFzZV9maWVsZHN9CiAgICAgICAgYmFzZV9ldmVudFtUSU1FU1RBTVBdID0gcGQudG9fZGF0ZXRpbWUoCiAgICAgICAgICAgIGJhc2VfZXZlbnRbVElNRVNUQU1QXSwgZm9ybWF0PVRJTUVfRk9STUFUCiAgICAgICAgKQoKICAgICAgICBiYXNlX21ldHJpY3MgPSB7CiAgICAgICAgICAgIFJFQ09SRF9UWVBFOiBCQVNFX01FVFJJQ1MsCiAgICAgICAgICAgIFBSRURJQ1RJT05TX1BFUl9TRUNPTkQ6IGV2ZW50W1BSRURJQ1RJT05TX1BFUl9TRUNPTkRdLAogICAgICAgICAgICBQUkVESUNUSU9OU19DT1VOVF81TTogZXZlbnRbUFJFRElDVElPTlNfQ09VTlRfNU1dLAogICAgICAgICAgICBQUkVESUNUSU9OU19DT1VOVF8xSDogZXZlbnRbUFJFRElDVElPTlNfQ09VTlRfMUhdLAogICAgICAgICAgICBMQVRFTkNZX0FWR181TTogZXZlbnRbTEFURU5DWV9BVkdfNU1dLAogICAgICAgICAgICBMQVRFTkNZX0FWR18xSDogZXZlbnRbTEFURU5DWV9BVkdfMUhdLAogICAgICAgICAgICAq
KmJhc2VfZXZlbnQsCiAgICAgICAgfQoKICAgICAgICBlbmRwb2ludF9mZWF0dXJlcyA9IHsKICAgICAgICAgICAgUkVDT1JEX1RZUEU6IEVORFBPSU5UX0ZFQVRVUkVTLAogICAgICAgICAgICAqKmV2ZW50W05BTUVEX1BSRURJQ1RJT05TXSwKICAgICAgICAgICAgKipldmVudFtOQU1FRF9GRUFUVVJFU10sCiAgICAgICAgICAgICoqYmFzZV9ldmVudCwKICAgICAgICB9CgogICAgICAgIHByb2Nlc3NlZCA9IHtCQVNFX01FVFJJQ1M6IGJhc2VfbWV0cmljcywgRU5EUE9JTlRfRkVBVFVSRVM6IGVuZHBvaW50X2ZlYXR1cmVzfQoKICAgICAgICBpZiBldmVudFtNRVRSSUNTXToKICAgICAgICAgICAgcHJvY2Vzc2VkW0NVU1RPTV9NRVRSSUNTXSA9IHsKICAgICAgICAgICAgICAgIFJFQ09SRF9UWVBFOiBDVVNUT01fTUVUUklDUywKICAgICAgICAgICAgICAgICoqZXZlbnRbTUVUUklDU10sCiAgICAgICAgICAgICAgICAqKmJhc2VfZXZlbnQsCiAgICAgICAgICAgIH0KCiAgICAgICAgcmV0dXJuIHByb2Nlc3NlZAoKICAgIEBzdGF0aWNtZXRob2QKICAgIGRlZiBwcm9jZXNzX2JlZm9yZV9wYXJxdWV0KGV2ZW50OiBkaWN0KToKICAgICAgICBkZWYgc2V0X25vbmVfaWZfZW1wdHkoX2V2ZW50OiBkaWN0LCBrZXlzOiBMaXN0W3N0cl0pOgogICAgICAgICAgICBmb3Iga2V5IGluIGtleXM6CiAgICAgICAgICAgICAgICBpZiBub3QgX2V2ZW50LmdldChrZXkpOgogICAgICAgICAgICAgICAgICAgIF9ldmVudFtrZXldID0gTm9uZQoKICAgICAgICBkZWYgZHJvcF9pZl9leGlzdHMoX2V2ZW50OiBkaWN0LCBrZXlzOiBMaXN0W3N0cl0pOgogICAgICAgICAgICBmb3Iga2V5IGluIGtleXM6CiAgICAgICAgICAgICAgICBfZXZlbnQucG9wKGtleSwgTm9uZSkKCiAgICAgICAgZGVmIHVucGFja19pZl9leGlzdHMoX2V2ZW50OiBkaWN0LCBrZXlzOiBMaXN0W3N0cl0pOgogICAgICAgICAgICBmb3Iga2V5IGluIGtleXM6CiAgICAgICAgICAgICAgICB2YWx1ZSA9IF9ldmVudC5nZXQoa2V5KQogICAgICAgICAgICAgICAgaWYgdmFsdWUgaXMgbm90IE5vbmU6CiAgICAgICAgICAgICAgICAgICAgX2V2ZW50ID0geyoqdmFsdWUsICoqZXZlbnR9CgogICAgICAgIGRyb3BfaWZfZXhpc3RzKGV2ZW50LCBbVU5QQUNLRURfTEFCRUxTLCBGRUFUVVJFU10pCiAgICAgICAgdW5wYWNrX2lmX2V4aXN0cyhldmVudCwgW0VOVElUSUVTXSkKICAgICAgICBzZXRfbm9uZV9pZl9lbXB0eShldmVudCwgW0xBQkVMUywgTUVUUklDUywgRU5USVRJRVNdKQogICAgICAgIHJldHVybiBldmVudAoKCmNsYXNzIFByb2Nlc3NFbmRwb2ludEV2ZW50KE1hcENsYXNzKToKICAgIGRlZiBfX2luaXRfXyhzZWxmLCBrdl9jb250YWluZXI6IHN0ciwga3ZfcGF0aDogc3RyLCB2M2lvX2FjY2Vzc19rZXk6IHN0ciwgKiprd2FyZ3MpOgogICAgICAgIHN1cGVyKCkuX19pbml0X18oKiprd2FyZ3MpCiAgICAgICAgc2VsZi5rdl9jb250YWluZXI6IHN0ciA9IGt2X2NvbnRhaW5lcgogICAgICAg
IHNlbGYua3ZfcGF0aDogc3RyID0ga3ZfcGF0aAogICAgICAgIHNlbGYudjNpb19hY2Nlc3Nfa2V5OiBzdHIgPSB2M2lvX2FjY2Vzc19rZXkKICAgICAgICBzZWxmLmZpcnN0X3JlcXVlc3Q6IERpY3Rbc3RyLCBzdHJdID0gZGljdCgpCiAgICAgICAgc2VsZi5sYXN0X3JlcXVlc3Q6IERpY3Rbc3RyLCBzdHJdID0gZGljdCgpCiAgICAgICAgc2VsZi5lcnJvcl9jb3VudDogRGljdFtzdHIsIGludF0gPSBkZWZhdWx0ZGljdChpbnQpCiAgICAgICAgc2VsZi5lbmRwb2ludHM6IFNldFtzdHJdID0gc2V0KCkKCiAgICBkZWYgZG8oc2VsZiwgZXZlbnQ6IGRpY3QpOgogICAgICAgIGZ1bmN0aW9uX3VyaSA9IGV2ZW50W0ZVTkNUSU9OX1VSSV0KICAgICAgICB2ZXJzaW9uZWRfbW9kZWwgPSBldmVudFtWRVJTSU9ORURfTU9ERUxdCiAgICAgICAgZW5kcG9pbnRfaWQgPSBldmVudFtFTkRQT0lOVF9JRF0KCiAgICAgICAgIyBJbiBjYXNlIHRoaXMgcHJvY2VzcyBmYWlscywgcmVzdW1lIHN0YXRlIGZyb20gZXhpc3RpbmcgcmVjb3JkCiAgICAgICAgc2VsZi5yZXN1bWVfc3RhdGUoZW5kcG9pbnRfaWQpCgogICAgICAgICMgSGFuZGxlIGVycm9ycyBjb21pbmcgZnJvbSBzdHJlYW0KICAgICAgICBmb3VuZF9lcnJvcnMgPSBzZWxmLmhhbmRsZV9lcnJvcnMoZW5kcG9pbnRfaWQsIGV2ZW50KQogICAgICAgIGlmIGZvdW5kX2Vycm9yczoKICAgICAgICAgICAgcmV0dXJuIE5vbmUKCiAgICAgICAgIyBWYWxpZGF0ZSBldmVudCBmaWVsZHMKICAgICAgICBtb2RlbF9jbGFzcyA9IGV2ZW50LmdldCgibW9kZWxfY2xhc3MiKSBvciBldmVudC5nZXQoImNsYXNzIikKICAgICAgICB0aW1lc3RhbXAgPSBldmVudC5nZXQoIndoZW4iKQogICAgICAgIHJlcXVlc3RfaWQgPSBldmVudC5nZXQoInJlcXVlc3QiLCB7fSkuZ2V0KCJpZCIpCiAgICAgICAgbGF0ZW5jeSA9IGV2ZW50LmdldCgibWljcm9zZWMiKQogICAgICAgIGZlYXR1cmVzID0gZXZlbnQuZ2V0KCJyZXF1ZXN0Iiwge30pLmdldCgiaW5wdXRzIikKICAgICAgICBwcmVkaWN0aW9ucyA9IGV2ZW50LmdldCgicmVzcCIsIHt9KS5nZXQoIm91dHB1dHMiKQoKICAgICAgICBpZiBub3Qgc2VsZi5pc192YWxpZChlbmRwb2ludF9pZCwgaXNfbm90X25vbmUsIHRpbWVzdGFtcCwgWyJ3aGVuIl0sKToKICAgICAgICAgICAgcmV0dXJuIE5vbmUKCiAgICAgICAgaWYgZW5kcG9pbnRfaWQgbm90IGluIHNlbGYuZmlyc3RfcmVxdWVzdDoKICAgICAgICAgICAgc2VsZi5maXJzdF9yZXF1ZXN0W2VuZHBvaW50X2lkXSA9IHRpbWVzdGFtcAogICAgICAgIHNlbGYubGFzdF9yZXF1ZXN0W2VuZHBvaW50X2lkXSA9IHRpbWVzdGFtcAoKICAgICAgICBpZiBub3Qgc2VsZi5pc192YWxpZChlbmRwb2ludF9pZCwgaXNfbm90X25vbmUsIHJlcXVlc3RfaWQsIFsicmVxdWVzdCIsICJpZCJdLCk6CiAgICAgICAgICAgIHJldHVybiBOb25lCiAgICAgICAgaWYgbm90IHNlbGYuaXNfdmFsaWQoZW5kcG9pbnRfaWQsIGlzX25vdF9ub25lLCBs
YXRlbmN5LCBbIm1pY3Jvc2VjIl0sKToKICAgICAgICAgICAgcmV0dXJuIE5vbmUKICAgICAgICBpZiBub3Qgc2VsZi5pc192YWxpZCgKICAgICAgICAgICAgZW5kcG9pbnRfaWQsIGlzX25vdF9ub25lLCBmZWF0dXJlcywgWyJyZXF1ZXN0IiwgImlucHV0cyJdLAogICAgICAgICk6CiAgICAgICAgICAgIHJldHVybiBOb25lCiAgICAgICAgaWYgbm90IHNlbGYuaXNfdmFsaWQoCiAgICAgICAgICAgIGVuZHBvaW50X2lkLCBpc19ub3Rfbm9uZSwgcHJlZGljdGlvbnMsIFsicmVzcCIsICJvdXRwdXRzIl0sCiAgICAgICAgKToKICAgICAgICAgICAgcmV0dXJuIE5vbmUKCiAgICAgICAgdW5wYWNrZWRfbGFiZWxzID0ge2YiX3trfSI6IHYgZm9yIGssIHYgaW4gZXZlbnQuZ2V0KExBQkVMUywge30pLml0ZW1zKCl9CgogICAgICAgICMgU2VwYXJhdGUgZWFjaCBtb2RlbCBpbnZvY2F0aW9uIGludG8gc3ViIGV2ZW50cwogICAgICAgIGV2ZW50cyA9IFtdCiAgICAgICAgZm9yIGksIChmZWF0dXJlLCBwcmVkaWN0aW9uKSBpbiBlbnVtZXJhdGUoemlwKGZlYXR1cmVzLCBwcmVkaWN0aW9ucykpOgogICAgICAgICAgICBpZiBub3Qgc2VsZi5pc192YWxpZCgKICAgICAgICAgICAgICAgIGVuZHBvaW50X2lkLAogICAgICAgICAgICAgICAgaXNfbGlzdF9vZl9udW1lcmljcywKICAgICAgICAgICAgICAgIGZlYXR1cmUsCiAgICAgICAgICAgICAgICBbInJlcXVlc3QiLCAiaW5wdXRzIiwgZiJbe2l9XSJdLAogICAgICAgICAgICApOgogICAgICAgICAgICAgICAgcmV0dXJuIE5vbmUKCiAgICAgICAgICAgIGlmIG5vdCBpc2luc3RhbmNlKHByZWRpY3Rpb24sIGxpc3QpOgogICAgICAgICAgICAgICAgcHJlZGljdGlvbiA9IFtwcmVkaWN0aW9uXQoKICAgICAgICAgICAgZXZlbnRzLmFwcGVuZCgKICAgICAgICAgICAgICAgIHsKICAgICAgICAgICAgICAgICAgICBGVU5DVElPTl9VUkk6IGZ1bmN0aW9uX3VyaSwKICAgICAgICAgICAgICAgICAgICBNT0RFTDogdmVyc2lvbmVkX21vZGVsLAogICAgICAgICAgICAgICAgICAgIE1PREVMX0NMQVNTOiBtb2RlbF9jbGFzcywKICAgICAgICAgICAgICAgICAgICBUSU1FU1RBTVA6IHRpbWVzdGFtcCwKICAgICAgICAgICAgICAgICAgICBFTkRQT0lOVF9JRDogZW5kcG9pbnRfaWQsCiAgICAgICAgICAgICAgICAgICAgUkVRVUVTVF9JRDogcmVxdWVzdF9pZCwKICAgICAgICAgICAgICAgICAgICBMQVRFTkNZOiBsYXRlbmN5LAogICAgICAgICAgICAgICAgICAgIEZFQVRVUkVTOiBmZWF0dXJlLAogICAgICAgICAgICAgICAgICAgIFBSRURJQ1RJT046IHByZWRpY3Rpb24sCiAgICAgICAgICAgICAgICAgICAgRklSU1RfUkVRVUVTVDogc2VsZi5maXJzdF9yZXF1ZXN0W2VuZHBvaW50X2lkXSwKICAgICAgICAgICAgICAgICAgICBMQVNUX1JFUVVFU1Q6IHNlbGYubGFzdF9yZXF1ZXN0W2VuZHBvaW50X2lkXSwKICAgICAgICAgICAgICAgICAgICBFUlJPUl9DT1VOVDogc2VsZi5lcnJvcl9jb3VudFtlbmRwb2ludF9p
ZF0sCiAgICAgICAgICAgICAgICAgICAgTEFCRUxTOiBldmVudC5nZXQoTEFCRUxTLCB7fSksCiAgICAgICAgICAgICAgICAgICAgTUVUUklDUzogZXZlbnQuZ2V0KE1FVFJJQ1MsIHt9KSwKICAgICAgICAgICAgICAgICAgICBFTlRJVElFUzogZXZlbnQuZ2V0KCJyZXF1ZXN0Iiwge30pLmdldChFTlRJVElFUywge30pLAogICAgICAgICAgICAgICAgICAgIFVOUEFDS0VEX0xBQkVMUzogdW5wYWNrZWRfbGFiZWxzLAogICAgICAgICAgICAgICAgfQogICAgICAgICAgICApCiAgICAgICAgcmV0dXJuIGV2ZW50cwoKICAgIGRlZiByZXN1bWVfc3RhdGUoc2VsZiwgZW5kcG9pbnRfaWQpOgogICAgICAgICMgTWFrZSBzdXJlIHByb2Nlc3MgaXMgcmVzdW1hYmxlLCBpZiBwcm9jZXNzIGZhaWxzIGZvciBhbnkgcmVhc29uLCBiZSBhYmxlIHRvIHBpY2sgdGhpbmdzIHVwIGNsb3NlIHRvIHdoZXJlIHdlCiAgICAgICAgIyBsZWZ0IHRoZW0KICAgICAgICBpZiBlbmRwb2ludF9pZCBub3QgaW4gc2VsZi5lbmRwb2ludHM6CiAgICAgICAgICAgIGxvZ2dlci5pbmZvKCJUcnlpbmcgdG8gcmVzdW1lIHN0YXRlIiwgZW5kcG9pbnRfaWQ9ZW5kcG9pbnRfaWQpCiAgICAgICAgICAgIGVuZHBvaW50X3JlY29yZCA9IGdldF9lbmRwb2ludF9yZWNvcmQoCiAgICAgICAgICAgICAgICBrdl9jb250YWluZXI9c2VsZi5rdl9jb250YWluZXIsCiAgICAgICAgICAgICAgICBrdl9wYXRoPXNlbGYua3ZfcGF0aCwKICAgICAgICAgICAgICAgIGVuZHBvaW50X2lkPWVuZHBvaW50X2lkLAogICAgICAgICAgICAgICAgYWNjZXNzX2tleT1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgKQogICAgICAgICAgICBpZiBlbmRwb2ludF9yZWNvcmQ6CiAgICAgICAgICAgICAgICBmaXJzdF9yZXF1ZXN0ID0gZW5kcG9pbnRfcmVjb3JkLmdldChGSVJTVF9SRVFVRVNUKQogICAgICAgICAgICAgICAgaWYgZmlyc3RfcmVxdWVzdDoKICAgICAgICAgICAgICAgICAgICBzZWxmLmZpcnN0X3JlcXVlc3RbZW5kcG9pbnRfaWRdID0gZmlyc3RfcmVxdWVzdAogICAgICAgICAgICAgICAgZXJyb3JfY291bnQgPSBlbmRwb2ludF9yZWNvcmQuZ2V0KEVSUk9SX0NPVU5UKQogICAgICAgICAgICAgICAgaWYgZXJyb3JfY291bnQ6CiAgICAgICAgICAgICAgICAgICAgc2VsZi5lcnJvcl9jb3VudFtlbmRwb2ludF9pZF0gPSBlcnJvcl9jb3VudAogICAgICAgICAgICBzZWxmLmVuZHBvaW50cy5hZGQoZW5kcG9pbnRfaWQpCgogICAgZGVmIGlzX3ZhbGlkKAogICAgICAgIHNlbGYsIGVuZHBvaW50X2lkOiBzdHIsIHZhbGlkYXRpb25fZnVuY3Rpb24sIGZpZWxkOiBBbnksIGRpY3RfcGF0aDogTGlzdFtzdHJdCiAgICApOgogICAgICAgIGlmIHZhbGlkYXRpb25fZnVuY3Rpb24oZmllbGQsIGRpY3RfcGF0aCk6CiAgICAgICAgICAgIHJldHVybiBUcnVlCiAgICAgICAgc2VsZi5lcnJvcl9jb3VudFtlbmRwb2ludF9pZF0gKz0gMQogICAgICAgIHJldHVybiBGYWxzZQoKICAgIGRlZiBoYW5k
bGVfZXJyb3JzKHNlbGYsIGVuZHBvaW50X2lkLCBldmVudCkgLT4gYm9vbDoKICAgICAgICBpZiAiZXJyb3IiIGluIGV2ZW50OgogICAgICAgICAgICBzZWxmLmVycm9yX2NvdW50W2VuZHBvaW50X2lkXSArPSAxCiAgICAgICAgICAgIHJldHVybiBUcnVlCgogICAgICAgIHJldHVybiBGYWxzZQoKCmRlZiBlbnJpY2hfZXZlbl9kZXRhaWxzKGV2ZW50KSAtPiBPcHRpb25hbFtkaWN0XToKICAgIGZ1bmN0aW9uX3VyaSA9IGV2ZW50LmdldChGVU5DVElPTl9VUkkpCgogICAgaWYgbm90IGlzX25vdF9ub25lKGZ1bmN0aW9uX3VyaSwgW0ZVTkNUSU9OX1VSSV0pOgogICAgICAgIHJldHVybiBOb25lCgogICAgbW9kZWwgPSBldmVudC5nZXQoTU9ERUwpCiAgICBpZiBub3QgaXNfbm90X25vbmUobW9kZWwsIFtNT0RFTF0pOgogICAgICAgIHJldHVybiBOb25lCgogICAgdmVyc2lvbiA9IGV2ZW50LmdldChWRVJTSU9OKQogICAgdmVyc2lvbmVkX21vZGVsID0gZiJ7bW9kZWx9Ont2ZXJzaW9ufSIgaWYgdmVyc2lvbiBlbHNlIGYie21vZGVsfTpsYXRlc3QiCgogICAgZW5kcG9pbnRfaWQgPSBjcmVhdGVfbW9kZWxfZW5kcG9pbnRfaWQoCiAgICAgICAgZnVuY3Rpb25fdXJpPWZ1bmN0aW9uX3VyaSwgdmVyc2lvbmVkX21vZGVsPXZlcnNpb25lZF9tb2RlbCwKICAgICkKCiAgICBlbmRwb2ludF9pZCA9IHN0cihlbmRwb2ludF9pZCkKCiAgICBldmVudFtWRVJTSU9ORURfTU9ERUxdID0gdmVyc2lvbmVkX21vZGVsCiAgICBldmVudFtFTkRQT0lOVF9JRF0gPSBlbmRwb2ludF9pZAoKICAgIHJldHVybiBldmVudAoKCmRlZiBpc19ub3Rfbm9uZShmaWVsZDogQW55LCBkaWN0X3BhdGg6IExpc3Rbc3RyXSk6CiAgICBpZiBmaWVsZCBpcyBub3QgTm9uZToKICAgICAgICByZXR1cm4gVHJ1ZQogICAgbG9nZ2VyLmVycm9yKAogICAgICAgIGYiRXhwZWN0ZWQgZXZlbnQgZmllbGQgaXMgbWlzc2luZzoge2ZpZWxkfSBbRXZlbnQgLT4geycnLmpvaW4oZGljdF9wYXRoKX1dIgogICAgKQogICAgcmV0dXJuIEZhbHNlCgoKZGVmIGlzX2xpc3Rfb2ZfbnVtZXJpY3MoCiAgICBmaWVsZDogTGlzdFtVbmlvbltpbnQsIGZsb2F0LCBkaWN0LCBsaXN0XV0sIGRpY3RfcGF0aDogTGlzdFtzdHJdCik6CiAgICBpZiBhbGwoaXNpbnN0YW5jZSh4LCBpbnQpIG9yIGlzaW5zdGFuY2UoeCwgZmxvYXQpIGZvciB4IGluIGZpZWxkKToKICAgICAgICByZXR1cm4gVHJ1ZQogICAgbG9nZ2VyLmVycm9yKAogICAgICAgIGYiRXhwZWN0ZWQgZXZlbnQgZmllbGQgaXMgbWlzc2luZzoge2ZpZWxkfSBbRXZlbnQgLT4geycnLmpvaW4oZGljdF9wYXRoKX1dIgogICAgKQogICAgcmV0dXJuIEZhbHNlCgoKY2xhc3MgRmlsdGVyTm90Tm9uZShGaWx0ZXIpOgogICAgZGVmIF9faW5pdF9fKHNlbGYsICoqa3dhcmdzKToKICAgICAgICBzdXBlcigpLl9faW5pdF9fKGZuPWxhbWJkYSBldmVudDogZXZlbnQgaXMgbm90IE5vbmUsICoqa3dhcmdzKQoKCmNsYXNzIEZpbHRlcktleXMoTWFwQ2xhc3MpOgog
ICAgZGVmIF9faW5pdF9fKHNlbGYsICphcmdzLCAqKmt3YXJncyk6CiAgICAgICAgc3VwZXIoKS5fX2luaXRfXygqKmt3YXJncykKICAgICAgICBzZWxmLmtleXMgPSBsaXN0KGFyZ3MpCgogICAgZGVmIGRvKHNlbGYsIGV2ZW50KToKICAgICAgICBuZXdfZXZlbnQgPSB7fQogICAgICAgIGZvciBrZXkgaW4gc2VsZi5rZXlzOgogICAgICAgICAgICBpZiBrZXkgaW4gZXZlbnQ6CiAgICAgICAgICAgICAgICBuZXdfZXZlbnRba2V5XSA9IGV2ZW50W2tleV0KCiAgICAgICAgcmV0dXJuIG5ld19ldmVudCBpZiBuZXdfZXZlbnQgZWxzZSBOb25lCgoKY2xhc3MgVW5wYWNrVmFsdWVzKE1hcENsYXNzKToKICAgIGRlZiBfX2luaXRfXyhzZWxmLCAqYXJncywgKiprd2FyZ3MpOgogICAgICAgIHN1cGVyKCkuX19pbml0X18oKiprd2FyZ3MpCiAgICAgICAgc2VsZi5rZXlzX3RvX3VucGFjayA9IHNldChhcmdzKQoKICAgIGRlZiBkbyhzZWxmLCBldmVudCk6CiAgICAgICAgdW5wYWNrZWQgPSB7fQogICAgICAgIGZvciBrZXkgaW4gZXZlbnQua2V5cygpOgogICAgICAgICAgICBpZiBrZXkgaW4gc2VsZi5rZXlzX3RvX3VucGFjazoKICAgICAgICAgICAgICAgIHVucGFja2VkID0geyoqdW5wYWNrZWQsICoqZXZlbnRba2V5XX0KICAgICAgICAgICAgZWxzZToKICAgICAgICAgICAgICAgIHVucGFja2VkW2tleV0gPSBldmVudFtrZXldCiAgICAgICAgcmV0dXJuIHVucGFja2VkCgoKY2xhc3MgTWFwRmVhdHVyZU5hbWVzKE1hcENsYXNzKToKICAgIGRlZiBfX2luaXRfXyhzZWxmLCBrdl9jb250YWluZXI6IHN0ciwga3ZfcGF0aDogc3RyLCBhY2Nlc3Nfa2V5OiBzdHIsICoqa3dhcmdzKToKICAgICAgICBzdXBlcigpLl9faW5pdF9fKCoqa3dhcmdzKQogICAgICAgIHNlbGYua3ZfY29udGFpbmVyID0ga3ZfY29udGFpbmVyCiAgICAgICAgc2VsZi5rdl9wYXRoID0ga3ZfcGF0aAogICAgICAgIHNlbGYuYWNjZXNzX2tleSA9IGFjY2Vzc19rZXkKICAgICAgICBzZWxmLmZlYXR1cmVfbmFtZXMgPSB7fQogICAgICAgIHNlbGYubGFiZWxfY29sdW1ucyA9IHt9CgogICAgZGVmIGRvKHNlbGYsIGV2ZW50OiBEaWN0KToKICAgICAgICBlbmRwb2ludF9pZCA9IGV2ZW50W0VORFBPSU5UX0lEXQoKICAgICAgICBpZiBlbmRwb2ludF9pZCBub3QgaW4gc2VsZi5mZWF0dXJlX25hbWVzOgogICAgICAgICAgICBlbmRwb2ludF9yZWNvcmQgPSBnZXRfZW5kcG9pbnRfcmVjb3JkKAogICAgICAgICAgICAgICAga3ZfY29udGFpbmVyPXNlbGYua3ZfY29udGFpbmVyLAogICAgICAgICAgICAgICAga3ZfcGF0aD1zZWxmLmt2X3BhdGgsCiAgICAgICAgICAgICAgICBlbmRwb2ludF9pZD1lbmRwb2ludF9pZCwKICAgICAgICAgICAgICAgIGFjY2Vzc19rZXk9c2VsZi5hY2Nlc3Nfa2V5LAogICAgICAgICAgICApCiAgICAgICAgICAgIGZlYXR1cmVfbmFtZXMgPSBlbmRwb2ludF9yZWNvcmQuZ2V0KEZFQVRVUkVfTkFNRVMpCiAgICAgICAgICAgIGZlYXR1cmVfbmFtZXMgPSBqc29u
LmxvYWRzKGZlYXR1cmVfbmFtZXMpIGlmIGZlYXR1cmVfbmFtZXMgZWxzZSBOb25lCgogICAgICAgICAgICBsYWJlbF9jb2x1bW5zID0gZW5kcG9pbnRfcmVjb3JkLmdldChMQUJFTF9DT0xVTU5TKQogICAgICAgICAgICBsYWJlbF9jb2x1bW5zID0ganNvbi5sb2FkcyhsYWJlbF9jb2x1bW5zKSBpZiBsYWJlbF9jb2x1bW5zIGVsc2UgTm9uZQoKICAgICAgICAgICAgaWYgbm90IGZlYXR1cmVfbmFtZXM6CiAgICAgICAgICAgICAgICBsb2dnZXIud2FybigKICAgICAgICAgICAgICAgICAgICBmIkZlYXR1cmUgbmFtZXMgYXJlIG5vdCBpbml0aWFsaXplZCwgdGhleSB3aWxsIGJlIGF1dG9tYXRpY2FsbHkgZ2VuZXJhdGVkIiwKICAgICAgICAgICAgICAgICAgICBlbmRwb2ludF9pZD1lbmRwb2ludF9pZCwKICAgICAgICAgICAgICAgICkKICAgICAgICAgICAgICAgIGZlYXR1cmVfbmFtZXMgPSBbZiJme2l9IiBmb3IgaSwgXyBpbiBlbnVtZXJhdGUoZXZlbnRbRkVBVFVSRVNdKV0KICAgICAgICAgICAgICAgIGdldF92M2lvX2NsaWVudCgpLmt2LnVwZGF0ZSgKICAgICAgICAgICAgICAgICAgICBjb250YWluZXI9c2VsZi5rdl9jb250YWluZXIsCiAgICAgICAgICAgICAgICAgICAgdGFibGVfcGF0aD1zZWxmLmt2X3BhdGgsCiAgICAgICAgICAgICAgICAgICAgYWNjZXNzX2tleT1zZWxmLmFjY2Vzc19rZXksCiAgICAgICAgICAgICAgICAgICAga2V5PWV2ZW50W0VORFBPSU5UX0lEXSwKICAgICAgICAgICAgICAgICAgICBhdHRyaWJ1dGVzPXtGRUFUVVJFX05BTUVTOiBqc29uLmR1bXBzKGZlYXR1cmVfbmFtZXMpfSwKICAgICAgICAgICAgICAgICAgICByYWlzZV9mb3Jfc3RhdHVzPVJhaXNlRm9yU3RhdHVzLmFsd2F5cywKICAgICAgICAgICAgICAgICkKCiAgICAgICAgICAgIGlmIG5vdCBsYWJlbF9jb2x1bW5zOgogICAgICAgICAgICAgICAgbG9nZ2VyLndhcm4oCiAgICAgICAgICAgICAgICAgICAgZiJsYWJlbCBjb2x1bW4gbmFtZXMgYXJlIG5vdCBpbml0aWFsaXplZCwgdGhleSB3aWxsIGJlIGF1dG9tYXRpY2FsbHkgZ2VuZXJhdGVkIiwKICAgICAgICAgICAgICAgICAgICBlbmRwb2ludF9pZD1lbmRwb2ludF9pZCwKICAgICAgICAgICAgICAgICkKICAgICAgICAgICAgICAgIGxhYmVsX2NvbHVtbnMgPSBbZiJwe2l9IiBmb3IgaSwgXyBpbiBlbnVtZXJhdGUoZXZlbnRbUFJFRElDVElPTl0pXQogICAgICAgICAgICAgICAgZ2V0X3YzaW9fY2xpZW50KCkua3YudXBkYXRlKAogICAgICAgICAgICAgICAgICAgIGNvbnRhaW5lcj1zZWxmLmt2X2NvbnRhaW5lciwKICAgICAgICAgICAgICAgICAgICB0YWJsZV9wYXRoPXNlbGYua3ZfcGF0aCwKICAgICAgICAgICAgICAgICAgICBhY2Nlc3Nfa2V5PXNlbGYuYWNjZXNzX2tleSwKICAgICAgICAgICAgICAgICAgICBrZXk9ZXZlbnRbRU5EUE9JTlRfSURdLAogICAgICAgICAgICAgICAgICAgIGF0dHJpYnV0ZXM9e0xBQkVMX0NPTFVNTlM6IGpzb24uZHVtcHMobGFiZWxfY29sdW1ucyl9LAog
ICAgICAgICAgICAgICAgICAgIHJhaXNlX2Zvcl9zdGF0dXM9UmFpc2VGb3JTdGF0dXMuYWx3YXlzLAogICAgICAgICAgICAgICAgKQoKICAgICAgICAgICAgc2VsZi5sYWJlbF9jb2x1bW5zW2VuZHBvaW50X2lkXSA9IGxhYmVsX2NvbHVtbnMKICAgICAgICAgICAgc2VsZi5mZWF0dXJlX25hbWVzW2VuZHBvaW50X2lkXSA9IGZlYXR1cmVfbmFtZXMKCiAgICAgICAgICAgIGxvZ2dlci5pbmZvKAogICAgICAgICAgICAgICAgIkxhYmVsIGNvbHVtbnMiLCBlbmRwb2ludF9pZD1lbmRwb2ludF9pZCwgbGFiZWxfY29sdW1ucz1sYWJlbF9jb2x1bW5zCiAgICAgICAgICAgICkKICAgICAgICAgICAgbG9nZ2VyLmluZm8oCiAgICAgICAgICAgICAgICAiRmVhdHVyZSBuYW1lcyIsIGVuZHBvaW50X2lkPWVuZHBvaW50X2lkLCBmZWF0dXJlX25hbWVzPWZlYXR1cmVfbmFtZXMKICAgICAgICAgICAgKQoKICAgICAgICBmZWF0dXJlX25hbWVzID0gc2VsZi5mZWF0dXJlX25hbWVzW2VuZHBvaW50X2lkXQogICAgICAgIGZlYXR1cmVzID0gZXZlbnRbRkVBVFVSRVNdCiAgICAgICAgZXZlbnRbTkFNRURfRkVBVFVSRVNdID0gewogICAgICAgICAgICBuYW1lOiBmZWF0dXJlIGZvciBuYW1lLCBmZWF0dXJlIGluIHppcChmZWF0dXJlX25hbWVzLCBmZWF0dXJlcykKICAgICAgICB9CgogICAgICAgIGxhYmVsX2NvbHVtbnMgPSBzZWxmLmxhYmVsX2NvbHVtbnNbZW5kcG9pbnRfaWRdCiAgICAgICAgcHJlZGljdGlvbiA9IGV2ZW50W1BSRURJQ1RJT05dCiAgICAgICAgZXZlbnRbTkFNRURfUFJFRElDVElPTlNdID0gewogICAgICAgICAgICBuYW1lOiBwcmVkaWN0aW9uIGZvciBuYW1lLCBwcmVkaWN0aW9uIGluIHppcChsYWJlbF9jb2x1bW5zLCBwcmVkaWN0aW9uKQogICAgICAgIH0KICAgICAgICBsb2dnZXIuaW5mbygiTWFwcGVkIGV2ZW50IiwgZXZlbnQ9ZXZlbnQpCiAgICAgICAgcmV0dXJuIGV2ZW50CgoKY2xhc3MgV3JpdGVUb0tWKE1hcENsYXNzKToKICAgIGRlZiBfX2luaXRfXyhzZWxmLCBjb250YWluZXI6IHN0ciwgdGFibGU6IHN0ciwgKiprd2FyZ3MpOgogICAgICAgIHN1cGVyKCkuX19pbml0X18oKiprd2FyZ3MpCiAgICAgICAgc2VsZi5jb250YWluZXIgPSBjb250YWluZXIKICAgICAgICBzZWxmLnRhYmxlID0gdGFibGUKCiAgICBkZWYgZG8oc2VsZiwgZXZlbnQ6IERpY3QpOgogICAgICAgIGdldF92M2lvX2NsaWVudCgpLmt2LnVwZGF0ZSgKICAgICAgICAgICAgY29udGFpbmVyPXNlbGYuY29udGFpbmVyLAogICAgICAgICAgICB0YWJsZV9wYXRoPXNlbGYudGFibGUsCiAgICAgICAgICAgIGtleT1ldmVudFtFTkRQT0lOVF9JRF0sCiAgICAgICAgICAgIGF0dHJpYnV0ZXM9ZXZlbnQsCiAgICAgICAgKQogICAgICAgIHJldHVybiBldmVudAoKCmNsYXNzIEluZmVyU2NoZW1hKE1hcENsYXNzKToKICAgIGRlZiBfX2luaXRfXygKICAgICAgICBzZWxmLAogICAgICAgIHYzaW9fYWNjZXNzX2tleTogc3RyLAogICAgICAgIHYzaW9fZnJhbWVzZDog
c3RyLAogICAgICAgIGNvbnRhaW5lcjogc3RyLAogICAgICAgIHRhYmxlOiBzdHIsCiAgICAgICAgKiprd2FyZ3MsCiAgICApOgogICAgICAgIHN1cGVyKCkuX19pbml0X18oKiprd2FyZ3MpCiAgICAgICAgc2VsZi5jb250YWluZXIgPSBjb250YWluZXIKICAgICAgICBzZWxmLnYzaW9fYWNjZXNzX2tleSA9IHYzaW9fYWNjZXNzX2tleQogICAgICAgIHNlbGYudjNpb19mcmFtZXNkID0gdjNpb19mcmFtZXNkCiAgICAgICAgc2VsZi50YWJsZSA9IHRhYmxlCiAgICAgICAgc2VsZi5rZXlzID0gc2V0KCkKCiAgICBkZWYgZG8oc2VsZiwgZXZlbnQ6IERpY3QpOgogICAgICAgIGtleV9zZXQgPSBzZXQoZXZlbnQua2V5cygpKQogICAgICAgIGlmIG5vdCBrZXlfc2V0Lmlzc3Vic2V0KHNlbGYua2V5cyk6CiAgICAgICAgICAgIHNlbGYua2V5cy51cGRhdGUoa2V5X3NldCkKICAgICAgICAgICAgZ2V0X2ZyYW1lc19jbGllbnQoCiAgICAgICAgICAgICAgICB0b2tlbj1zZWxmLnYzaW9fYWNjZXNzX2tleSwKICAgICAgICAgICAgICAgIGNvbnRhaW5lcj1zZWxmLmNvbnRhaW5lciwKICAgICAgICAgICAgICAgIGFkZHJlc3M9c2VsZi52M2lvX2ZyYW1lc2QsCiAgICAgICAgICAgICkuZXhlY3V0ZShiYWNrZW5kPSJrdiIsIHRhYmxlPXNlbGYudGFibGUsIGNvbW1hbmQ9ImluZmVyX3NjaGVtYSIpCiAgICAgICAgICAgIGxvZ2dlci5pbmZvKAogICAgICAgICAgICAgICAgIkZvdW5kIG5ldyBrZXlzLCBpbmZlcnJlZCBzY2hlbWEiLCB0YWJsZT1zZWxmLnRhYmxlLCBldmVudD1ldmVudAogICAgICAgICAgICApCiAgICAgICAgcmV0dXJuIGV2ZW50CgoKZGVmIGdldF9lbmRwb2ludF9yZWNvcmQoCiAgICBrdl9jb250YWluZXI6IHN0ciwga3ZfcGF0aDogc3RyLCBlbmRwb2ludF9pZDogc3RyLCBhY2Nlc3Nfa2V5OiBzdHIKKSAtPiBPcHRpb25hbFtkaWN0XToKICAgIGxvZ2dlci5pbmZvKAogICAgICAgIGYiR3JhYmJpbmcgZW5kcG9pbnQgZGF0YSIsCiAgICAgICAgY29udGFpbmVyPWt2X2NvbnRhaW5lciwKICAgICAgICB0YWJsZV9wYXRoPWt2X3BhdGgsCiAgICAgICAga2V5PWVuZHBvaW50X2lkLAogICAgKQogICAgdHJ5OgogICAgICAgIGVuZHBvaW50X3JlY29yZCA9ICgKICAgICAgICAgICAgZ2V0X3YzaW9fY2xpZW50KCkKICAgICAgICAgICAgLmt2LmdldCgKICAgICAgICAgICAgICAgIGNvbnRhaW5lcj1rdl9jb250YWluZXIsCiAgICAgICAgICAgICAgICB0YWJsZV9wYXRoPWt2X3BhdGgsCiAgICAgICAgICAgICAgICBrZXk9ZW5kcG9pbnRfaWQsCiAgICAgICAgICAgICAgICBhY2Nlc3Nfa2V5PWFjY2Vzc19rZXksCiAgICAgICAgICAgICAgICByYWlzZV9mb3Jfc3RhdHVzPXYzaW8uZGF0YXBsYW5lLlJhaXNlRm9yU3RhdHVzLmFsd2F5cywKICAgICAgICAgICAgKQogICAgICAgICAgICAub3V0cHV0Lml0ZW0KICAgICAgICApCiAgICAgICAgcmV0dXJuIGVuZHBvaW50X3JlY29yZAogICAgZXhjZXB0IEV4Y2VwdGlvbjoKICAgICAgICByZXR1cm4g
Tm9uZQoKCmRlZiBpbml0X2NvbnRleHQoY29udGV4dDogTUxDbGllbnRDdHgpOgogICAgY29udGV4dC5sb2dnZXIuaW5mbygiSW5pdGlhbGl6aW5nIEV2ZW50U3RyZWFtUHJvY2Vzc29yIikKICAgIHBhcmFtZXRlcnMgPSBlbnZpcm9uLmdldCgiTU9ERUxfTU9OSVRPUklOR19QQVJBTUVURVJTIikKICAgIHBhcmFtZXRlcnMgPSBqc29uLmxvYWRzKHBhcmFtZXRlcnMpIGlmIHBhcmFtZXRlcnMgZWxzZSB7fQogICAgc3RyZWFtX3Byb2Nlc3NvciA9IEV2ZW50U3RyZWFtUHJvY2Vzc29yKCoqcGFyYW1ldGVycykKICAgIHNldGF0dHIoY29udGV4dCwgInN0cmVhbV9wcm9jZXNzb3IiLCBzdHJlYW1fcHJvY2Vzc29yKQoKCmRlZiBoYW5kbGVyKGNvbnRleHQ6IE1MQ2xpZW50Q3R4LCBldmVudDogRXZlbnQpOgogICAgZXZlbnRfYm9keSA9IGpzb24ubG9hZHMoZXZlbnQuYm9keSkKICAgIGNvbnRleHQubG9nZ2VyLmRlYnVnKGV2ZW50X2JvZHkpCiAgICBjb250ZXh0LnN0cmVhbV9wcm9jZXNzb3IuY29uc3VtZShldmVudF9ib2R5KQo= - source: '' - build: - commands: [] - code_origin: https://github.com/Michaelliv/functions.git#202b4c489e4c02c3025742ea237f1a042b7c6043:/home/michaell/projects/functions/model_monitoring_stream/model_monitoring_stream.py - default_handler: handler -verbose: false diff --git a/model_monitoring_stream/item.yaml b/model_monitoring_stream/item.yaml deleted file mode 100644 index 219fa5286..000000000 --- a/model_monitoring_stream/item.yaml +++ /dev/null @@ -1,23 +0,0 @@ -apiVersion: v1 -categories: -- monitoring -description: '' -doc: '' -example: model_monitoring_stream.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: {} -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: model-monitoring-stream -platformVersion: 3.5.0 -spec: - filename: model_monitoring_stream.py - handler: handler - image: livsmichael/mlrun-api:automation - kind: nuclio - requirements: [] -url: '' -version: 1.1.0 diff --git a/model_monitoring_stream/model_monitoring_stream.ipynb b/model_monitoring_stream/model_monitoring_stream.ipynb deleted file mode 100644 index 93d8c92e4..000000000 --- a/model_monitoring_stream/model_monitoring_stream.ipynb +++ /dev/null @@ -1,178 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "source": [ - "# Model Monitoring\n", - "\n", - "## 
Initial set up (and pre-requisites)\n", - "1. Make sure you have the `mlrun-api` datasource available in your Grafana instance, otherwise add it by:\n", - " 1. Open your grafana instance\n", - " 2. Navigate to `Configuration -> Data Sources`\n", - " 3. Press `Add data source` and configure the following parameters\n", - " ```\n", - " Name: mlrun-api\n", - " URL: http://mlrun-api:8080/api/grafana-proxy/model-endpoints\n", - " Access: Server (default)\n", - "\n", - " ## Add a custom header of:\n", - " X-V3io-Session-Key: \n", - " ```\n", - " 4. Press `Save & Test` to make sure it works, a confirmation message should appear when this button is pressed\n", - "\n", - "2. Import the available dashboards `(./dashboards/*)` to you Grafana instance\n", - "3. To allow the system to utilize drift measurement, make sure you supply the train set when logging the model on the\n", - " training step\n", - "\n", - " ```python\n", - " # Log model\n", - " context.log_model(\n", - " \"model\",\n", - " body=dumps(model),\n", - " artifact_path=context.artifact_subpath(\"models\"),\n", - " extra_data=eval_metrics,\n", - " model_file=\"model.pkl\",\n", - " metrics=context.results,\n", - " training_set=X_test, # <- make sure this is passed into log_model\n", - " labels={\"class\": \"sklearn.linear_model.LogisticRegression\"}\n", - " )\n", - " ```\n", - "4. When serving a model, make sure that the Nuclio function is deployed with tracking enabled by applying\n", - " `fn.set_tracking()`\n", - "\n", - "## Configuration\n", - "The stream processing portion of the model monitoring, can be deployed under multiple configuration options. The\n", - "available configurations can be found under `stream.Config`. 
Once configured it should be supplied as environment\n", - "parameters to the Nuclio function by setting `fn.set_envs`\n", - "\n", - "```python\n", - "project: str # project name\n", - "sample_window: int # The sampling window for the data that flows into the TSDB and the KV\n", - "kv_path_template: str # Path template for the kv table\n", - "tsdb_path_template: str # Path template for the tsdb table\n", - "parquet_path_template: str # v3io parquets path template, assumes v3io is mounted\n", - "tsdb_batching_max_events: int # The max amount of event to batch before writing the batch to tsdb\n", - "tsdb_batching_timeout_secs: int # The max amount of seconds a given batch can be gathered before being emitted\n", - "parquet_batching_max_events: int # The max amount of event to batch before writing the batch to parquet\n", - "parquet_batching_timeout_secs: int # The max amount of seconds, a given batch can be gathered before being written to parquet\n", - "container: str # container name\n", - "v3io_access_key: str # V3IO Access key\n", - "v3io_framesd: str # V3IO framesd URL\n", - "time_format: str # The time format into which time related fields will be converted\n", - "aggregate_count_windows: List[str] # List of window sizes for predictions count\n", - "aggregate_count_period: str # Period of predictions count windows\n", - "aggregate_avg_windows: List[str] # List of window sizes for average latency\n", - "aggregate_avg_period: str # Period of average latency windows\n", - "```" - ], - "metadata": { - "collapsed": false, - "pycharm": { - "name": "#%% md\n" - } - } - }, - { - "cell_type": "markdown", - "source": [ - "## Export function yaml" - ], - "metadata": { - "collapsed": false, - "pycharm": { - "name": "#%% md\n" - } - } - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "from mlrun import code_to_function\n", - "from mlrun.runtimes import RemoteRuntime\n", - "\n", - "\n", - "fn: RemoteRuntime = code_to_function(\n", - 
" name=\"model-monitoring-stream\",\n", - " kind=\"nuclio\",\n", - " image=\"mlrun/mlrun\",\n", - " filename=\"model_monitoring_stream.py\",\n", - " handler=\"handler\",\n", - ")\n", - "fn.export(\"model_monitoring_stream.yaml\")\n" - ], - "metadata": { - "collapsed": false, - "pycharm": { - "name": "#%%\n" - } - } - }, - { - "cell_type": "markdown", - "source": [ - "## Deploy Stream Processing" - ], - "metadata": { - "collapsed": false - } - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "import os\n", - "\n", - "from mlrun import import_function\n", - "from mlrun.platforms import mount_v3io\n", - "from mlrun.runtimes import RemoteRuntime\n", - "import json\n", - "\n", - "# Set project name\n", - "project = \"\"\n", - "\n", - "fn: RemoteRuntime = import_function(\"hub://model_monitoring_stream\")\n", - "\n", - "fn.add_v3io_stream_trigger(\n", - " stream_path=f\"projects/{project}/model-endpoints/stream\",\n", - " name=\"monitoring_stream_trigger\",\n", - ")\n", - "\n", - "fn.set_env(\"MODEL_MONITORING_PARAMETERS\", json.dumps({\"project\": project, \"v3io_framesd\": os.environ.get(\"V3IO_FRAMESD\")}))\n", - "\n", - "fn.metadata.project = project\n", - "fn.apply(mount_v3io())\n", - "fn.deploy()" - ], - "metadata": { - "collapsed": false, - "pycharm": { - "name": "#%%\n" - } - } - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 2 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython2", - "version": "2.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} \ No newline at end of file diff --git a/model_monitoring_stream/model_monitoring_stream.py b/model_monitoring_stream/model_monitoring_stream.py deleted file mode 100644 index 90c8b92c2..000000000 --- 
a/model_monitoring_stream/model_monitoring_stream.py +++ /dev/null @@ -1,768 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -import json -import os -from collections import defaultdict -from datetime import datetime -from os import environ -from typing import Dict, List, Set, Optional, Any, Union - -import pandas as pd -import v3io -from mlrun.config import config -from mlrun.run import MLClientCtx -from mlrun.utils import logger -from mlrun.utils.model_monitoring import ( - parse_model_endpoint_store_prefix, - create_model_endpoint_id, -) -from mlrun.utils.v3io_clients import get_v3io_client, get_frames_client -from nuclio import Event -from storey import ( - FieldAggregator, - NoopDriver, - Table, - Map, - MapClass, - AggregateByKey, - build_flow, - Filter, - FlatMap, - TSDBTarget, - ParquetTarget, - SyncEmitSource, -) -from storey.dtypes import SlidingWindows -from storey.steps import SampleWindow -# Constants -from v3io.dataplane import RaiseForStatus - -ISO_8061_UTC = "%Y-%m-%d %H:%M:%S.%f%z" -FUNCTION_URI = "function_uri" -MODEL = "model" -VERSION = "version" -VERSIONED_MODEL = "versioned_model" -MODEL_CLASS = "model_class" -TIMESTAMP = "timestamp" -ENDPOINT_ID = "endpoint_id" -REQUEST_ID = "request_id" -LABELS = "labels" -UNPACKED_LABELS = "unpacked_labels" -LATENCY_AVG_5M = "latency_avg_5m" -LATENCY_AVG_1H = "latency_avg_1h" -PREDICTIONS_PER_SECOND = "predictions_per_second" -PREDICTIONS_COUNT_5M = "predictions_count_5m" 
-PREDICTIONS_COUNT_1H = "predictions_count_1h" -FIRST_REQUEST = "first_request" -LAST_REQUEST = "last_request" -ERROR_COUNT = "error_count" -ENTITIES = "entities" -FEATURE_NAMES = "feature_names" -LABEL_COLUMNS = "label_columns" -LATENCY = "latency" -RECORD_TYPE = "record_type" -FEATURES = "features" -PREDICTION = "prediction" -PREDICTIONS = "predictions" -NAMED_FEATURES = "named_features" -NAMED_PREDICTIONS = "named_predictions" -BASE_METRICS = "base_metrics" -CUSTOM_METRICS = "custom_metrics" -ENDPOINT_FEATURES = "endpoint_features" -METRICS = "metrics" -BATCH_TIMESTAMP = "batch_timestamp" -TIME_FORMAT: str = "%Y-%m-%d %H:%M:%S.%f" # ISO 8061 - - -# Stream processing code -class EventStreamProcessor: - def __init__( - self, - project: str, - sample_window: int = 10, - tsdb_batching_max_events: int = 10, - tsdb_batching_timeout_secs: int = 60 * 5, # Default 5 minutes - parquet_batching_max_events: int = 10_000, - parquet_batching_timeout_secs: int = 60 * 60, # Default 1 hour - aggregate_count_windows: Optional[List[str]] = None, - aggregate_count_period: str = "30s", - aggregate_avg_windows: Optional[List[str]] = None, - aggregate_avg_period: str = "30s", - v3io_access_key: Optional[str] = None, - v3io_framesd: Optional[str] = None, - v3io_api: Optional[str] = None, - ): - self.project = project - self.sample_window = sample_window - self.tsdb_batching_max_events = tsdb_batching_max_events - self.tsdb_batching_timeout_secs = tsdb_batching_timeout_secs - self.parquet_batching_max_events = parquet_batching_max_events - self.parquet_batching_timeout_secs = parquet_batching_timeout_secs - self.aggregate_count_windows = aggregate_count_windows or ["5m", "1h"] - self.aggregate_count_period = aggregate_count_period - self.aggregate_avg_windows = aggregate_avg_windows or ["5m", "1h"] - self.aggregate_avg_period = aggregate_avg_period - - self.v3io_framesd = v3io_framesd or config.v3io_framesd - self.v3io_api = v3io_api or config.v3io_api - - self.v3io_access_key = 
v3io_access_key or environ.get("V3IO_ACCESS_KEY") - self.model_monitoring_access_key = ( - os.environ.get("MODEL_MONITORING_ACCESS_KEY") or self.v3io_access_key - ) - - template = config.model_endpoint_monitoring.store_prefixes.default - - kv_path = template.format(project=project, kind="endpoints") - _, self.kv_container, self.kv_path = parse_model_endpoint_store_prefix(kv_path) - - tsdb_path = template.format(project=project, kind="events") - _, self.tsdb_container, self.tsdb_path = parse_model_endpoint_store_prefix( - tsdb_path - ) - self.tsdb_path = f"{self.tsdb_container}/{self.tsdb_path}" - - self.parquet_path = config.model_endpoint_monitoring.store_prefixes.user_space.format( - project=project, kind="parquet" - ) - - logger.info( - "V3IO Configuration", - v3io_access_key=self.v3io_access_key, - model_monitoring_access_key=self.model_monitoring_access_key, - default_store_prefix=config.model_endpoint_monitoring.store_prefixes.default, - user_space_store_prefix=config.model_endpoint_monitoring.store_prefixes.user_space, - v3io_api=self.v3io_api, - v3io_framesd=self.v3io_framesd, - kv_container=self.kv_container, - kv_path=self.kv_path, - tsdb_container=self.tsdb_container, - tsdb_path=self.tsdb_path, - parquet_path=self.parquet_path, - ) - - self._kv_keys = [ - FUNCTION_URI, - MODEL, - MODEL_CLASS, - TIMESTAMP, - ENDPOINT_ID, - LABELS, - UNPACKED_LABELS, - LATENCY_AVG_5M, - LATENCY_AVG_1H, - PREDICTIONS_PER_SECOND, - PREDICTIONS_COUNT_5M, - PREDICTIONS_COUNT_1H, - FIRST_REQUEST, - LAST_REQUEST, - ERROR_COUNT, - ] - - self._flow = build_flow( - [ - SyncEmitSource(), - ProcessEndpointEvent( - kv_container=self.kv_container, - kv_path=self.kv_path, - v3io_access_key=self.v3io_access_key, - ), - FilterNotNone(), - FlatMap(lambda x: x), - MapFeatureNames( - kv_container=self.kv_container, - kv_path=self.kv_path, - access_key=self.v3io_access_key, - ), - # Branch 1: Aggregate events, count averages and update TSDB and KV - [ - AggregateByKey( - aggregates=[ - 
FieldAggregator( - PREDICTIONS, - ENDPOINT_ID, - ["count"], - SlidingWindows( - self.aggregate_count_windows, - self.aggregate_count_period, - ), - ), - FieldAggregator( - LATENCY, - LATENCY, - ["avg"], - SlidingWindows( - self.aggregate_avg_windows, - self.aggregate_avg_period, - ), - ), - ], - table=Table("notable", NoopDriver()), - ), - SampleWindow( - self.sample_window - ), # Add required gap between event to apply sampling - Map(self.compute_predictions_per_second), - # Branch 1.1: Updated KV - [ - Map(self.process_before_kv), - WriteToKV(container=self.kv_container, table=self.kv_path), - InferSchema( - v3io_access_key=self.v3io_access_key, - v3io_framesd=self.v3io_framesd, - container=self.kv_container, - table=self.kv_path, - ), - ], - # Branch 1.2: Update TSDB - [ - # Map the event into taggable fields, add record type to each field - Map(self.process_before_events_tsdb), - [ - FilterKeys(BASE_METRICS), - UnpackValues(BASE_METRICS), - TSDBTarget( - path=self.tsdb_path, - rate="10/m", - time_col=TIMESTAMP, - container=self.tsdb_container, - access_key=self.v3io_access_key, - v3io_frames=self.v3io_framesd, - index_cols=[ENDPOINT_ID, RECORD_TYPE], - # Settings for _Batching - max_events=self.tsdb_batching_max_events, - timeout_secs=self.tsdb_batching_timeout_secs, - key=ENDPOINT_ID, - ), - ], - [ - FilterKeys(ENDPOINT_FEATURES), - UnpackValues(ENDPOINT_FEATURES), - TSDBTarget( - path=self.tsdb_path, - rate="10/m", - time_col=TIMESTAMP, - container=self.tsdb_container, - access_key=self.v3io_access_key, - v3io_frames=self.v3io_framesd, - index_cols=[ENDPOINT_ID, RECORD_TYPE], - # Settings for _Batching - max_events=self.tsdb_batching_max_events, - timeout_secs=self.tsdb_batching_timeout_secs, - key=ENDPOINT_ID, - ), - ], - [ - FilterKeys(CUSTOM_METRICS), - FilterNotNone(), - UnpackValues(CUSTOM_METRICS), - TSDBTarget( - path=self.tsdb_path, - rate="10/m", - time_col=TIMESTAMP, - container=self.tsdb_container, - access_key=self.v3io_access_key, - 
v3io_frames=self.v3io_framesd, - index_cols=[ENDPOINT_ID, RECORD_TYPE], - # Settings for _Batching - max_events=self.tsdb_batching_max_events, - timeout_secs=self.tsdb_batching_timeout_secs, - key=ENDPOINT_ID, - ), - ], - ], - ], - # Branch 2: Batch events, write to parquet - [ - Map(self.process_before_parquet), - ParquetTarget( - path=self.parquet_path, - partition_cols=["$key", "$year", "$month", "$day", "$hour"], - infer_columns_from_data=True, - # Settings for _Batching - max_events=self.parquet_batching_max_events, - timeout_secs=self.parquet_batching_timeout_secs, - # Settings for v3io storage - storage_options={ - "v3io_api": self.v3io_api, - "v3io_access_key": self.model_monitoring_access_key, - }, - ), - ], - ] - ).run() - - def consume(self, event: Dict): - events = [] - if "headers" in event and "values" in event: - for values in event["values"]: - events.append({k: v for k, v in zip(event["headers"], values)}) - else: - events.append(event) - - for enriched in map(enrich_even_details, events): - if enriched is not None: - self._flow.emit( - enriched, - key=enriched[ENDPOINT_ID], - event_time=datetime.strptime(enriched["when"], ISO_8061_UTC), - ) - else: - pass - - @staticmethod - def compute_predictions_per_second(event: dict): - event[PREDICTIONS_PER_SECOND] = float(event[PREDICTIONS_COUNT_5M]) / 600 - return event - - def process_before_kv(self, event: dict): - # Filter relevant keys - e = {k: event[k] for k in self._kv_keys} - # Unpack labels dictionary - e = {**e, **e.pop(UNPACKED_LABELS, {})} - # Write labels to kv as json string to be presentable later - e[LABELS] = json.dumps(e[LABELS]) - return e - - @staticmethod - def process_before_events_tsdb(event: Dict): - base_fields = [TIMESTAMP, ENDPOINT_ID] - - base_event = {k: event[k] for k in base_fields} - base_event[TIMESTAMP] = pd.to_datetime( - base_event[TIMESTAMP], format=TIME_FORMAT - ) - - base_metrics = { - RECORD_TYPE: BASE_METRICS, - PREDICTIONS_PER_SECOND: event[PREDICTIONS_PER_SECOND], 
- PREDICTIONS_COUNT_5M: event[PREDICTIONS_COUNT_5M], - PREDICTIONS_COUNT_1H: event[PREDICTIONS_COUNT_1H], - LATENCY_AVG_5M: event[LATENCY_AVG_5M], - LATENCY_AVG_1H: event[LATENCY_AVG_1H], - **base_event, - } - - endpoint_features = { - RECORD_TYPE: ENDPOINT_FEATURES, - **event[NAMED_PREDICTIONS], - **event[NAMED_FEATURES], - **base_event, - } - - processed = {BASE_METRICS: base_metrics, ENDPOINT_FEATURES: endpoint_features} - - if event[METRICS]: - processed[CUSTOM_METRICS] = { - RECORD_TYPE: CUSTOM_METRICS, - **event[METRICS], - **base_event, - } - - return processed - - @staticmethod - def process_before_parquet(event: dict): - def set_none_if_empty(_event: dict, keys: List[str]): - for key in keys: - if not _event.get(key): - _event[key] = None - - def drop_if_exists(_event: dict, keys: List[str]): - for key in keys: - _event.pop(key, None) - - def unpack_if_exists(_event: dict, keys: List[str]): - for key in keys: - value = _event.get(key) - if value is not None: - _event = {**value, **event} - - drop_if_exists(event, [UNPACKED_LABELS, FEATURES]) - unpack_if_exists(event, [ENTITIES]) - set_none_if_empty(event, [LABELS, METRICS, ENTITIES]) - return event - - -class ProcessEndpointEvent(MapClass): - def __init__(self, kv_container: str, kv_path: str, v3io_access_key: str, **kwargs): - super().__init__(**kwargs) - self.kv_container: str = kv_container - self.kv_path: str = kv_path - self.v3io_access_key: str = v3io_access_key - self.first_request: Dict[str, str] = dict() - self.last_request: Dict[str, str] = dict() - self.error_count: Dict[str, int] = defaultdict(int) - self.endpoints: Set[str] = set() - - def do(self, event: dict): - function_uri = event[FUNCTION_URI] - versioned_model = event[VERSIONED_MODEL] - endpoint_id = event[ENDPOINT_ID] - - # In case this process fails, resume state from existing record - self.resume_state(endpoint_id) - - # Handle errors coming from stream - found_errors = self.handle_errors(endpoint_id, event) - if found_errors: - 
return None - - # Validate event fields - model_class = event.get("model_class") or event.get("class") - timestamp = event.get("when") - request_id = event.get("request", {}).get("id") - latency = event.get("microsec") - features = event.get("request", {}).get("inputs") - predictions = event.get("resp", {}).get("outputs") - - if not self.is_valid(endpoint_id, is_not_none, timestamp, ["when"],): - return None - - if endpoint_id not in self.first_request: - self.first_request[endpoint_id] = timestamp - self.last_request[endpoint_id] = timestamp - - if not self.is_valid(endpoint_id, is_not_none, request_id, ["request", "id"],): - return None - if not self.is_valid(endpoint_id, is_not_none, latency, ["microsec"],): - return None - if not self.is_valid( - endpoint_id, is_not_none, features, ["request", "inputs"], - ): - return None - if not self.is_valid( - endpoint_id, is_not_none, predictions, ["resp", "outputs"], - ): - return None - - unpacked_labels = {f"_{k}": v for k, v in event.get(LABELS, {}).items()} - - # Separate each model invocation into sub events - events = [] - for i, (feature, prediction) in enumerate(zip(features, predictions)): - if not self.is_valid( - endpoint_id, - is_list_of_numerics, - feature, - ["request", "inputs", f"[{i}]"], - ): - return None - - if not isinstance(prediction, list): - prediction = [prediction] - - events.append( - { - FUNCTION_URI: function_uri, - MODEL: versioned_model, - MODEL_CLASS: model_class, - TIMESTAMP: timestamp, - ENDPOINT_ID: endpoint_id, - REQUEST_ID: request_id, - LATENCY: latency, - FEATURES: feature, - PREDICTION: prediction, - FIRST_REQUEST: self.first_request[endpoint_id], - LAST_REQUEST: self.last_request[endpoint_id], - ERROR_COUNT: self.error_count[endpoint_id], - LABELS: event.get(LABELS, {}), - METRICS: event.get(METRICS, {}), - ENTITIES: event.get("request", {}).get(ENTITIES, {}), - UNPACKED_LABELS: unpacked_labels, - } - ) - return events - - def resume_state(self, endpoint_id): - # Make sure process 
is resumable, if process fails for any reason, be able to pick things up close to where we - # left them - if endpoint_id not in self.endpoints: - logger.info("Trying to resume state", endpoint_id=endpoint_id) - endpoint_record = get_endpoint_record( - kv_container=self.kv_container, - kv_path=self.kv_path, - endpoint_id=endpoint_id, - access_key=self.v3io_access_key, - ) - if endpoint_record: - first_request = endpoint_record.get(FIRST_REQUEST) - if first_request: - self.first_request[endpoint_id] = first_request - error_count = endpoint_record.get(ERROR_COUNT) - if error_count: - self.error_count[endpoint_id] = error_count - self.endpoints.add(endpoint_id) - - def is_valid( - self, endpoint_id: str, validation_function, field: Any, dict_path: List[str] - ): - if validation_function(field, dict_path): - return True - self.error_count[endpoint_id] += 1 - return False - - def handle_errors(self, endpoint_id, event) -> bool: - if "error" in event: - self.error_count[endpoint_id] += 1 - return True - - return False - - -def enrich_even_details(event) -> Optional[dict]: - function_uri = event.get(FUNCTION_URI) - - if not is_not_none(function_uri, [FUNCTION_URI]): - return None - - model = event.get(MODEL) - if not is_not_none(model, [MODEL]): - return None - - version = event.get(VERSION) - versioned_model = f"{model}:{version}" if version else f"{model}:latest" - - endpoint_id = create_model_endpoint_id( - function_uri=function_uri, versioned_model=versioned_model, - ) - - endpoint_id = str(endpoint_id) - - event[VERSIONED_MODEL] = versioned_model - event[ENDPOINT_ID] = endpoint_id - - return event - - -def is_not_none(field: Any, dict_path: List[str]): - if field is not None: - return True - logger.error( - f"Expected event field is missing: {field} [Event -> {''.join(dict_path)}]" - ) - return False - - -def is_list_of_numerics( - field: List[Union[int, float, dict, list]], dict_path: List[str] -): - if all(isinstance(x, int) or isinstance(x, float) for x in field): 
- return True - logger.error( - f"Expected event field is missing: {field} [Event -> {''.join(dict_path)}]" - ) - return False - - -class FilterNotNone(Filter): - def __init__(self, **kwargs): - super().__init__(fn=lambda event: event is not None, **kwargs) - - -class FilterKeys(MapClass): - def __init__(self, *args, **kwargs): - super().__init__(**kwargs) - self.keys = list(args) - - def do(self, event): - new_event = {} - for key in self.keys: - if key in event: - new_event[key] = event[key] - - return new_event if new_event else None - - -class UnpackValues(MapClass): - def __init__(self, *args, **kwargs): - super().__init__(**kwargs) - self.keys_to_unpack = set(args) - - def do(self, event): - unpacked = {} - for key in event.keys(): - if key in self.keys_to_unpack: - unpacked = {**unpacked, **event[key]} - else: - unpacked[key] = event[key] - return unpacked - - -class MapFeatureNames(MapClass): - def __init__(self, kv_container: str, kv_path: str, access_key: str, **kwargs): - super().__init__(**kwargs) - self.kv_container = kv_container - self.kv_path = kv_path - self.access_key = access_key - self.feature_names = {} - self.label_columns = {} - - def do(self, event: Dict): - endpoint_id = event[ENDPOINT_ID] - - if endpoint_id not in self.feature_names: - endpoint_record = get_endpoint_record( - kv_container=self.kv_container, - kv_path=self.kv_path, - endpoint_id=endpoint_id, - access_key=self.access_key, - ) - feature_names = endpoint_record.get(FEATURE_NAMES) - feature_names = json.loads(feature_names) if feature_names else None - - label_columns = endpoint_record.get(LABEL_COLUMNS) - label_columns = json.loads(label_columns) if label_columns else None - - if not feature_names: - logger.warn( - f"Feature names are not initialized, they will be automatically generated", - endpoint_id=endpoint_id, - ) - feature_names = [f"f{i}" for i, _ in enumerate(event[FEATURES])] - get_v3io_client().kv.update( - container=self.kv_container, - table_path=self.kv_path, - 
access_key=self.access_key, - key=event[ENDPOINT_ID], - attributes={FEATURE_NAMES: json.dumps(feature_names)}, - raise_for_status=RaiseForStatus.always, - ) - - if not label_columns: - logger.warn( - f"label column names are not initialized, they will be automatically generated", - endpoint_id=endpoint_id, - ) - label_columns = [f"p{i}" for i, _ in enumerate(event[PREDICTION])] - get_v3io_client().kv.update( - container=self.kv_container, - table_path=self.kv_path, - access_key=self.access_key, - key=event[ENDPOINT_ID], - attributes={LABEL_COLUMNS: json.dumps(label_columns)}, - raise_for_status=RaiseForStatus.always, - ) - - self.label_columns[endpoint_id] = label_columns - self.feature_names[endpoint_id] = feature_names - - logger.info( - "Label columns", endpoint_id=endpoint_id, label_columns=label_columns - ) - logger.info( - "Feature names", endpoint_id=endpoint_id, feature_names=feature_names - ) - - feature_names = self.feature_names[endpoint_id] - features = event[FEATURES] - event[NAMED_FEATURES] = { - name: feature for name, feature in zip(feature_names, features) - } - - label_columns = self.label_columns[endpoint_id] - prediction = event[PREDICTION] - event[NAMED_PREDICTIONS] = { - name: prediction for name, prediction in zip(label_columns, prediction) - } - logger.info("Mapped event", event=event) - return event - - -class WriteToKV(MapClass): - def __init__(self, container: str, table: str, **kwargs): - super().__init__(**kwargs) - self.container = container - self.table = table - - def do(self, event: Dict): - get_v3io_client().kv.update( - container=self.container, - table_path=self.table, - key=event[ENDPOINT_ID], - attributes=event, - ) - return event - - -class InferSchema(MapClass): - def __init__( - self, - v3io_access_key: str, - v3io_framesd: str, - container: str, - table: str, - **kwargs, - ): - super().__init__(**kwargs) - self.container = container - self.v3io_access_key = v3io_access_key - self.v3io_framesd = v3io_framesd - self.table = 
table - self.keys = set() - - def do(self, event: Dict): - key_set = set(event.keys()) - if not key_set.issubset(self.keys): - self.keys.update(key_set) - get_frames_client( - token=self.v3io_access_key, - container=self.container, - address=self.v3io_framesd, - ).execute(backend="kv", table=self.table, command="infer_schema") - logger.info( - "Found new keys, inferred schema", table=self.table, event=event - ) - return event - - -def get_endpoint_record( - kv_container: str, kv_path: str, endpoint_id: str, access_key: str -) -> Optional[dict]: - logger.info( - f"Grabbing endpoint data", - container=kv_container, - table_path=kv_path, - key=endpoint_id, - ) - try: - endpoint_record = ( - get_v3io_client() - .kv.get( - container=kv_container, - table_path=kv_path, - key=endpoint_id, - access_key=access_key, - raise_for_status=v3io.dataplane.RaiseForStatus.always, - ) - .output.item - ) - return endpoint_record - except Exception: - return None - - -def init_context(context: MLClientCtx): - context.logger.info("Initializing EventStreamProcessor") - parameters = environ.get("MODEL_MONITORING_PARAMETERS") - parameters = json.loads(parameters) if parameters else {} - stream_processor = EventStreamProcessor(**parameters) - setattr(context, "stream_processor", stream_processor) - - -def handler(context: MLClientCtx, event: Event): - event_body = json.loads(event.body) - context.logger.debug(event_body) - context.stream_processor.consume(event_body) diff --git a/model_monitoring_stream/requirements.txt b/model_monitoring_stream/requirements.txt deleted file mode 100644 index ef238930e..000000000 --- a/model_monitoring_stream/requirements.txt +++ /dev/null @@ -1,3 +0,0 @@ -storey -nuclio -v3io \ No newline at end of file diff --git a/noise_reduction/data/test_data.mp3 b/noise_reduction/data/test_data.mp3 new file mode 100644 index 0000000000000000000000000000000000000000..a330f9804f67205e2af72652151f4721ec16ff74 GIT binary patch literal 27972 
zcmce-g;&(=_C7px_t4!TU4nFXGjw-ImmuBUjdYiEcXxLqC5?)JfG|IJp0A#B-ap`- zwScu|fX}u2y7%5U_&VGS@c(^jS=(E_yodJkRRRFmi36bF;1N+!F|dFH6f|#G*|>Ok z`9;O0l~h35x`rmE7S=X)4vsGF9$wzQ{z1WEkx?=63CStxS)X$A3yaIjt7>ZN>gwwo z8X7yg`Ugj*<`CXrMKm7z>r<(!b@!oRpKvM$IS?%q*;U($%SOEz#G%LGQrg+ORK6iec z0Q6sEmx{ClhY$c^^rs)On$R=gQ)r0qJc2D?y7N|_oicQ|zHi>7GvFtq`R`prAOH7wE919uBr+medc3i1lHK8l(}^2&HgCk%o- zsVt{7vUTYWs6pL@j1H!-)eMhs8wfex>r~u5b6Nngi2 z6p(k7iWu1Dcws*%Jj;m~+H(pU5U-96Bcfgr?+6ilcDKvrR~ekMxb!@0o(yBykY*3+ zh?ZUatV@uiMrpHkBCbpME2Q0iV8L>K{z#71DpaCr>$pcbjglE1r#?mnZtc-QM0vMh zX}VgsU^~|&uNHhQEj|YTKoq^&l_V0@%fL7c*-i0C(y4@PHE=)-4t-ebM3Q+1-(IB__oPh%N@5Q17q}bIld;JLQ?i2b zZ6AyiIQ;gXb&HJ136~m5elFz-ExQ9}&tb*7Cbptgs72?OY__MKrt`z#C2efY{s#}S z8Y-x}9|@xZ6geB`4pN zU$~HLMVn_sBVN`e#kXnXd{3UCOg8Qt?Zk@BneLz=@(!2hiXd+i&pazXW=#Igp!RlU zExtl`OH;r}V90Mp$V_Cq%I#WDmV+qL^B?^z|IrT-2c}AQUkJt&Y34fkY~r`V{ZCMR z@+|Dm3ku~2y&GN`SiZ!|wK@UDf({wxgO2OvEXs#m+hfm4?9(p)Un>_8HeqKgb`lE8 zuA{1i4zADjzcMY-sn~g3CiS{!MvAuR9cw&pz>|~UJ4G_Oxw!@P0)5YBxlaWWbrv<- zp*@@mUa$Ju{fh_jhlW8v5>bjRBs`qH!-GHJIaAYS9ci}G_EzY-l`LRfYfGRP%!f0e zFx+Htj!hDc?Gsm&$gEzQoMx-^dW+2{%fZ5a%T;#2Gt!Q2>N79NG7q8^Ma|jQJHzD) z90?dYttK^srs7DD0n7J0@Rby%`uhN3%pLn@0LN`#8s2K01^Ajr{PYhV!g)yzgT4gh zciYkg7PfvgHfypy@Wn_TGz~K%IHNGA{66^{qmd`xNBrp(rr(lKTa>fK?@)RjN7vJ-nJk8coonW!3gCZKl+-GEfd z!qE(0QrY751y$r5zBh4#?A4cDzf0@;Z(ZEZ#SPPLqj5%l5EO5*9U=4PXR+p2=eaYs zK}W$y8lS<~*&OJ$blyTAyC^}O8U=^Sat;jip3}q65c^%L;r{jzE&RljWn3L&lQmmF zY^_4fxAgV7%l(-oCE-=}y+5)OM4D(A_@eKY&Z5rTfB|T43icKFZsa9&DqzS6VDEi3 z=B=VJw+-TujY#I@G-zKokt%Hefr=? 
z@dgZ*T@S!D%9MJQefrO{h)r%^><&XjN*+qdCO8_9%;+Za*1PSzaH-f%3LR15atBbr zUQzePt5RdX^J=NZm zCPAb^x^R%>^&G`}g;_^PJdUo!iA#^vYFzl~Ov9)%(!qBk-NSCe>DZ12c7h>rk@|G&qW7pnRH_lpg(2*$b$922yw_1ukwWD57cg|!?fEg98vT0yP<~>N7=F8a;?4~;bf{3efNO>KxI{*vgulS;k&zEp30q$xKjfL z1n|7#Is1c$IMYO>yB`-BmJ&*WIwddsTJ}>7p9zcaGx}7=Etp^_`|GY?8q@TzI63RD zozqI~=Cn-|GGx4z#I#-yhW&*2bzV>DT1m{ zO+MAC!j=fA3r5Sm?aktpVlqUf(nkQ+`24T(sa2 zCocG`x8zOGV$qC8&E06vEu5*&M}Nsmo53qdDtnT7m3{Nivxpd8p4AtHKDm5c-_8YpT+^W?!~z;{O0*W}UACXy^AfHsNAIML-*-m-`~@1@e5D z=$8kuKQrQxQ%Q z%S2qu3ry<=(Ht4R;yL;UPaXh3N?@X4sFV&UIA@dJ_BEn+@VBy2T zqg-;hbA7f23o7u(I6Dt!Iw0|`Pmah=CKGko7!xDo$u{}DX=3d84Ssn|+<>pc#Si|$ z(+dDdbt0M=X01Z(yngob`sKolb?-HdL#ceAx;UAW^zXa>`yC0Jo;ux+3H)aGwL(?}@3Y=>udji4cUsd;$X#;vv(KYfW+RRSS2cwZ4($b>HbRz>Z)_=P?c4Hf_guqhhJJI2u{M4@)W7fIqGeUMbawY3^2 zhs`VKAA}B%jjRA7$I!Ci^N+ty@#XET-VAFylx5tVjk0R9OVhHC*O zxmijA^C^Jle8(2WNpPK_)9e}}+*i|J+fSmiMTx1a)ART5?~AgH&1tT6zDkFwZ=HuV zah(|_cBC*Xk!=&>ho2KR->VoYUq(be04pJS*PdBz$t2uuuSsjHvv3 z@1X#QVH$X#2^kGMw{ib2MZB`|s8KMqD2OcZ1`Y8Y3uUVro@KH0qx z04p0CykkENG&9~VegH;LtbCOx>M*P-IzSy62`(TpzNkL}8Np2*I%*Vr-74PyN6!vF zaT4EtS_L3}61EBVCmK^I8jwB+9_oRii9o>rULFtth#nM;#z4}=i_a^oO~^cBXEj0p zVx)rS7%n^jIRJqyZiP+}D6uFmE6i;WkpHc30f7YSO=MggdXPUIKF3!CKc}c+h=IH> zMr;}W0MP~FF3W0refu_YVddZos)hvpQgV6`BHABP8UeqUoUx?+wPkyG92fNBSZ{N& zyps_k*q0%XN2%j}($s7(HV0HThC!(75jvcw-1ufAs0Tt7M~<^s(}nHadkehD2(=R$ z4Zf%x#$pcJv6-ndz~HqT^r5Ofz(-?>j2?#e%xm-EWaHdcPm=uZbdsaRkWjtw<|_qB zsts@mfHKwqVk_W2FpiV&9a9OI7-qDo3t;FQf@S|KN`pXw2p~d7pbN-PP6>tT;Rq1@ zqHs?MEkBH)L9rX1Wiv*k{Gl(46@JqN8Z}mDw;Ar6m94JqvT7z1f-`&b-nC$2q~(@* z3S0(25@vm#%_4!5jz*DDMMr+_t>=)X1Z?K4=@%oTgaA<(*uFPGums$p@S_9(Bgk?m z^j$i1Msz?5mpDX8zA1T{JS6-aFq~7GiXIZ_j5whz{h^RA6K`j6hb2vPS4IZ^Z zoD5S24d0O`ooYAFpCU>0_yYhbBkb*Kd>#I=yMY-_yj0dK06H?kHwg#MMolyV#ngpc z-3mVf5gQ%^hf*$H0aK66PNk*ppR|C;)D8hbf9w1#KeD3d&cT$XAa^#u)pJ`nYy)vI z{*gyB1(a+o#ix-;33JF1JnXz~BQx%4f&W$YvPShbiGeVwfqT1asbiIaFl{1Vg`pAW zaQ;$td`|`6PA5+s5^ zw)}_`;2h`760C~iVjkit`g@RVG+sq5v@MXUg{4-7&WTO^;>5(^VMH0GGo4L5-6xZB 
z@_@4phIYH@dB57MyvPmXJ*F)C#Go}E0v1_3tS!7=#IaVr-u!d*NI+^ATPi<1)ie;< zztMkWppaw87sM>96Lt>Y!GiUb+M?V#5bTp|RM+ zG3&N_m7#|{lU}#zRnOgj^sG{Z6D*ZA3V?opjx0WQJQl7U{jR?-@<%o0m`u&V4=8Be zQWuoX3ny%B5Uum5>0$W5cgRgh56vlQ;Iq zmNl!wD{vy;Odqr+Y8sJg_M~^jnrdhGs-GW3MuY6{LiW|i8V*(2+Oa>MwP+eT5APjk z0NopCNGKP{C1xuWr|J=;))EJf8fKeG1?s(tTR_MdCE9NlUMkUHQ#bFqc{<^Q+w}!P z!;|)^;6Z*zG1r0!vA8p@s!29Rpcjl)t!ydjs~gs&i?x6=mtj#`YhP zNj|T79{;0f1xCbZrL0kiDSnuEv;2N;8W{xNJY820P5;_ilb_nEZwdS^pNlz`TvEr% z&pb$1gz%w7Q`MwoI^A-3w}v~BE2tD?Ez4LhnGh6w<1Gv9iH^;*u8(#qAQh)TjMFZ&*iKtT`MDgz7j535b zln>ZVrNmU;XSIDGKKVp3y2T;53^bxU?3?tPlHgsB4v$csAg#_>miJ0t6;^9UTg{9P496JH_J-EAEL^CE7}S zZ^DdMi)^4n_-3iJ%XItLuF5QgviW+;u2_^{vG&UtU@ zG%!&h(7D9RJ0oK{wj1I8`dwjxh?GEdB0bCA%<9-05Z#zz#jqo;iYZ1(2)?!wKmJG0 zFDFoT2FVDi^#Embb-p4IASyWE?IoA1{>Hv~*AJ>Pw5v)^Xk(0+DY-Oma4s-7s<+aGbX(MgE# zaC;{sdBgl!1*%+D5vF$z7Ga5Vy!PJGEFmO-G(osGN;Tsr_{pW_V*2EHD_YaLDNm{N zdUSM-d~P@Ppt4mYk#>JcgbeU;Gk!&-5Ca(_g(&mQa2{lSv^S=_-6;(-$yz*tZUf@0 zTz}7TERXBKs2fAXYHXRTSig7-jQrdwy5NK#$ndJf7B9 zp80Q{2LJ$~$*BG~+bYC%6i@gjRV6~O#fRYc5JX*qc5Toh*r$Och|p}uL~uc3 z*n5!vC{N6|wjHtBAXr)@a_4>eaWeWuMTHv8VhUku{2=ND>Ik^w=q8bUJICT)1_pSHEgIi`;Ab=3H5>YRJw~l|UfRt*+hu=fwxqTW z*5-FLv2VX=ieG(LuFlE6zp*ie>|+~=&W!E%1KD3&w&Yx~jg{?!U!0eg2cYj}CX?6|xTI)ahc{A`(tT6$jUa2x7BbFOJ?U9V(f?!6b z{#_jXljFYOk(Bm)q_toETenRyy@meIGggbVOI1L|$#eb!(M;Rg+69sXF(EEVG@FK; z2#uB}#2j(uqz|#2WFY{pO-jf^B{ABdTRJhG9XBGSN?$MU%KEM;8j}-@k%k#8v?Nm% zy}$}%D)=r`@Ji*|$8v~N(FP9iiNI+HNA&ufDqUPYk};^vxtr z`Q^BhxcleD0XXQ+Cab3g(S#e^CKnUJk0kr1$Lfo8J%i(iDk9%o>P>qv{9`9qGe*I2m|C^Xj|5obKie`5I< zbOv3rrZhPbHz_9<5O$|rC1U4hb@uR4?2doEe)WyNeNl4C^>z}^5EPi}x9#Km_|1WdHJ`S~41#mP(JuO%_981*_}l;BVD4xvp#^Y?F;JhU{0*Dst(sUB;H zmcj^y-r#C1Hk+(?b2n8lrS>K*oiN#^udk5M%~O5P=uUP=6e>5h9tGGgtuNYE1Hm9LtCTMWtQsc7>5ZJ$7vn27DWDem*~prVN^7Hy1P0 znykH1mrzr+byr`Tm3<#!wzy>>rAWVpO%ajlg(E#ZN@F$V+u}x~fEBKjIV+>Ctb+{J z&%k=c^S4JtF6w13s0e#(ifD11TlVF~DrFpIFrJ&SeZ^rz8K6=p^)=$-&w}FwoTW*1 z8bA=4SQ*lMOr8ecF!}=FU?Fij$I+wF0ez6KcwQzEFTMA?m%hZfs)>+N 
zH~={~B|+Mnf(xm8VmELT7620<9x{n8Oov|bnLja-^`W7WI@OzuhzCFiG!q%SOemvd zn};+~i$L<+$1NvsPsdNcj|dG$n67B^dik^d2>&T~L<&=4Y&aA=UKma5jiTaIg@NlM zM!bMf6G>Ckqj#{y$ZkPt=i)iyuRLn;50BD)kK=5ZwFdZPTpofNN@9lU%u0!cj}Abz zNgP747Krg6dEf{NnnXwM^T*v4=Iegyi9a~#91fL_7ao)h2MhsV<0meQR;VfO)&8$} zacPhrbC>Z&hOy4Rt98IlDUHdE^B&oRFlC~YV%9+Q>Ti$ZK@Rvd07jd+|E0FD5vg7$ zBL*{ea=^0|G8uD;~fJB-dwbfC+##z=(!So0CR3xapP(!Q<_i{r8^tR6~C30N7rN-b+Ulq=g48r^-^WLAh?_;Ea!5(@a5UPXOL?=R(08&2_jXE zKp6z6@qlo{7URga)79dN_x->nCTbSWyha)x4D1DQfI1O73J?(;mIe>Ij9gp?eS76q z&uxG74C2Sh$IYArm_*lYS;n3#G?^;|GI!!fh9+jYiYNs`W76Ajo=So4x^p1?2_U2L63} zLQ~u3f_im7?Oll=&#OSb>~XQ6-z76Hm8b$t_Sss=n|YVrlRdo?L|!&|__6hiK`tS; z{mK^qQusjyS}LfdMn2E7rz)#fJ#a{$Y2OgFQm_N1KFE1v%$$!-*zMCtbn~T)ZeLh~ z((IgdP4OGv1H*Xd)>PsAk4i}h)`%lkl9b_rr>o*BOC?53lom*&LMk*k+KKi#gS2o{ zCVMJP%Bg8e=IN2e83Fr+7uHVmrLpq*s+|OJWp5@z@(->xxlOe{3qar}g0sP3OFs8e z<)3^EhpqA-YTacfsGuo)>YBtWjyMR!q44}6zmx<*K8zRwp~@uHUTbB`KYHE;0B9cA z%cWM00LT$EwZ)mLY}8jSPbhZ2*D4xhrxXb~M+q8-j@BG)g&T=hBhhFW6VvjgKeEs1 z33AtK5LT+F>Ca1z*wIXDpb_A;JUFV>=OwG&?0Ic?Dhq2zr^pZ?neMlzE2~!+$N+)HnSHJJtjLMq?I`@b zfXCyk%P#>KMVrV^pNrJ}bbfdB>tiBb->H(Lk#9-YapEsr>0({ZHi(}Jz1F7ff8=_p zm2^R;(VsWQH_?t6K>gPJBoU7vf7!S=;PS#a(d9;mqlyK^)lmYQ)5xJPothP~#qubF zXBXty$YYby-!#LBjYJL#Jl>t&=yCCu=mC)<21JCbgpiDpEKNB_dG;+QvY@Em^6k~^ z4=&NyoYY+B{$G&(e7^6x?z{rmIK)+fqy21mz;}^;SAtKKGGBffQO9&N@9|)g{_TeXN^8%G-^x$Zn{KYHtv`WjWO7P|OftCw> zCu__V>Rqmy_R2+Nf^JuFdDvGAT@jMvf@IxKKBnmWw>^A#s&^jF$Qvk);WvH-QomfHSP zHBgk|uc~)$HR4^$X0jTCo91Wu-ix9QjMwZY4Wd*WH0FD2EwrwUldCW~c2(uVD)Xxt zsM3&Hgb0U*dPdZIT=1{N|D{4^POQR-9 zViQZ`j}Lj(^NWW6)w4Q3s{gB=Ia#tOi4pRsf;5e$jy0%0A^>F%daWyb@oIFn=mD;f z6+OizftfBXA`_c3bqP3L2}bW?RDC@UD{Qh4i48U`1bS-h!}wSZ+jUbNf45A2#LS#P zz-8FqNcXc>1=a9T?XXpUGT^N`#3j88T&hseZxOS&WgGwgGT1(59L>F##B*@D5kf8o z9amg8HWe&rHv)vA!8`e>;RQBg@BA}XNhL0|%*^mixmyOK;LM9f;JxuP7|+NJrbgQp z!&&`tkwYdwbP9wEi^E!(!;sO7v`dHl##HVzS;x1M>vJ_58mGg6Rv1e(a?Y5Vg!j~{ zHIu}dPQmMZFNa|Y(XT~H)w{y;ZZf=JQl^f(-crKIBTtuFGNE3Ae2k5aE*|ofE-AQ+ 
z{kw992&UCTbQH}PmFrT=ljPQgRL>JmHS~5SY9f*=E=b8vbIqk%B1JVCOJr&WeEs47 zdcwN^0N0z;jIXQy5LAJ4Y9$rGn_RV0_obXEnWU_2jH~AI{1OVq;wSeIfPXiyg19m! zb93U)Xf6>eBL)b-VnR_>Exjp4GB-`^E+1NNJg#auG=xqO8dA4uP`PdIL~+exwUsl3 z7L(QqhjM%E!$a`EAs9%b6S^Te5kYNMW*KPY;{<(E8e6E~tCQg}pLV9YcV^MSmA|mo zbTW^~o@#rO-=vR%N#h~*zB~^{wdJ{~_^ngKEdqM>2NgDN2WBN(@z z3|^f@k|-@29c{frv2~C9>}NcD5jjmfAG-aQk-}*?1tr?%*BtdP{{g(5t)Nt+OBP_1 z@ubX+x~GGJ5U6Fwgs z{(Y6+m0RO$Rt3wolG`0V{8Tc=hvA6xLsX563(rp{joEY6e%qD{VsD^Y_h3*9IS#82 z>n=U+B!nuob9cZTJ$^GKLvojzGY??ebDu=~v3bW?2@~}BFzw=x;FR>$n1FXym9aol z+pkdhy_8>5eo7b-S~wy-xZ?SL$MD=GgGaXZl6tVX09rbxCw2F~){DQhNT$MDZ`cJm zm2@r-W7G)WSfo)$(G|+bYlUyF=Br}m7E{gQF^0XANWHbi`|lCL*NZ3*O#mk7NY6w7 zI(Jb`1_h#$9iDsa>hoqsSH zK^qsyz1rgMe3>=rG8}d+ETMb`F@i1#lKF)qps(`SoI(L0Oq77pknm8J@0s8s8G(r~ zQuP<12*wSpJR&7)De*yYG=^jCFwCX{7%(hfBHx>%CyU4^i<$a25@2}aK^j?(+Ef>R z6?U0lP*Y|^u^;N~7Y!UArW-#|5%xb2CM~O12m^SjRpTHXO-&e`&`+6DfON`vXZ8>7 z-YxWB`l1I}p@;!$--rU@e1CB){E(%*nif6s=9frnsxF`6mQKLR+R|dGlB)iAFC|}X zv9x&W`P!5DE0>td4}a8!sq%04&tm+}va-YbbpaBG=TK&zB(-SPlGdSa4BPvWd8ptmRv>>^4&kg&zwHB#osKD7c+DM;HT5;dkpONXAmaR9>h+sFF47Om zhR8F;l1C;F#msdCQ8J_nV0d52o~ar89rAQ5*?#&U*Qr){BBm3msP+4U(9)QUA7}U; z1P|0~Z<5IiAxqQ(HL&f^`{k04{AgjhwS!*#eZ!PDH~Pyj_vCl9giKWQcqPO4_BItZ ztkDXDRPaJiI^r_C#Ue7sb&+>IhkeQMvp98834`1_mczqY12Z^HbBTusBQWCGum>nqGsiWNSEcVs=LC(mO&(63L(QW!faEiIPr^F+bZ<3V=@s#!fesE95z7tt%GqL9?f zSdya|z=s%x6)lSUpI!?2eEZ85kraY&(GUtXLaX~SXPLEz;!GF>hOtj zYX==0bV!X2KgdTgWL`M1jG+2necOW?o5Gb%tEtoi*Ixa%I4I_sr&#tYt>Zl2X7tB>9g~5 zi&B>TjEB=?!3JJ!$jYr^^|p<7&(YtiNVwDu2AD4O)+!$v6u#>h{EE=7>2Tev;pAQJ zQzxUEjOvKRGr*IeaS^jB7Pq{k9Se?q@m1;n^hJqHVD+B+6`-?)a>30lB?rcOCkiDS3F^;F|bhcwzn(&(AB zD|$U@BTbcca5QQMtmDy4{))kEzk6))n$sWNPC3VM-9J5S0jr@mkPi|2JWo5R9ZM5t zRqXmmc{m|(aBnI*vId;En@@sY@%+1!^Ws0GA2C5WzUT*+a*|m`GF-7-%nG^Kin0YM z{uBT%+oWC0NQjSjt;?lf(nK{CFyn?9C8_J^`QDk=-?7WcFH52#6->rks`KKYLvWfv zEd))tcI!{-lEXre(x~OqB;M7PYI4t$FCpgK2xvan>KU_ZZg4Fd0ANUl*e1p8s|s?wQS zRO<}P>)1iT{_v}ZzXDg#N<&)^RNho`v39$tC>6d#ctdqPEgM>@+RdQi%4#V&n1>I9 
z3$hCR;zSG=SAMM*zN@aRfY@a1P<<+Bn*QdUaa^GVC0M2qkCgaf&~W7L=D{sIe;lXm z4*b+-`99fsYlCq}|2uQ;E1rLMmR{^dG;gn+(=Ue%E`Iv{eOy#;hm{X(5&(c-PWcHV zMA8#uZGItLwRx{TsCv-$7^~RxO)X-{LAB%~8pMR}9)=DO7Dm)JNhzB6Fqew9V#&mq z0KP6G!eJ^zpRP3TF#FQ*Nw0qhvgQ%|yMlL6Iwm+w(3JP5W$Ujrfz5_^DX#btf#rYp zVE=Mj(l@BD{xjGXbJQ~jroZzGBvrV0JAgvvQDIJ2FyQ-cf5Yp&&i+oA}&N#$r)Edre>G}pxkb) z!VLY)?$2^BzgS97R_>|KSo_%g+~L3%e=FuBt68J~UtCsA*UlY$=T?a@b#P?^r*Qwd zj5$wwxQR$IBubl8^VCY;73ZD4PrA%dulUn*k1dH4P#K%BwZNnNu3qTw!8bR**-QoN z@H8A9z;)iSi(#Mmy6<=K2M^K9Uanxj61JI_zwxPy;`Hl5}-82tSK zuq@_1fGjRb@a+wDIm)eOOu1WS%^S3jnTAx8nE2YCA5LMkwd@szCDf>JCBBSBXvG{b zMAtR60iEbPdAevcCT^0FEA+qHROGu@@SS;Ayp>Eth|IlubUiLHEqlfDmur#i-M-3> zfoesKs*D;kjDtg?bppMVW^3cq*O(nz@uD)wp7-1S`Ht;O+jW)P^z^5*pfHYVVtXfP zn8g?g_NXma*db5nrN#vG)CU6VW0DbK0+R3dBdbK}=Ccg88H&fQofO~CIeTvklvL1< ze~CY}fPZRb0Rl~o2N^STK8LBi;`v)EiRTetb}!|C#wKz`Ut;X-mcV=^3H|=CLN(j| zET{Je3!ULdyGtldQ{wASF@kQSb35h0QrF!*D1}=#!a>^idL-{$Y)+%wZxO`vceT~% zOs?OWnk40n3u5DW=IVf<#&D=U*r|PnyIT<$d34uuoTLO`2(Dn`>WN`jK7TI0d>Q_{ z;`!S@B&?UzF&K-2=e^pP#j>QeyK`qU=7Q`=V%p}FgH%);aN^#6oAMOu@N1P zzS460b8(kN*~k)6l8S`%N&`(%Y!-cHv7l4PDJt7bZu##Wa!5pjm%VOiOOwOXXnz20 z>76nnjBle9;w`gYL(K7ZZq|CJ_MknH{lpr}5Z(tQAj=-_Xujw0unGmKwliNLT}F+n zC>-A~kj6M$%4QC<|B*22^7p*P=fZ>0pe~MA+5gT@$yDt%1SbfA^5O+?tnqo_3?lIl zlBF|MYqC0ApDWINI2F^Gu%bL4=5;{I?|eY6E#_>+HQAO!Ib~joUgrRm`^#0Bkv34( z*BkWAg8nQ5c)gDL(nu}{&b#6*Y&wZ}uc{k2;M3wJAur9d%u}~18|8QaWR3d{V6i2eshqomqDxJMfnX!V>`|Um zx>InKlFGp$5AZlhOLH2_*V0iH-8*Ssl$&J7;+!8pviS*@;M$QZ7oYB$@a{h&_^!d>w*L!b@@JMicv>Ro^YHt+Gf8| zu)b=Ud}%_kAN;$gK_*P&d)2e0Hz?Zo&x&86&K&|J1JPv&odWrzj+jS>=SFw8=5*R9AhQr!Uz8faM^EP)D?`|`zn(cn0 zD;6`V?qxH~<(8}kRoB7H+LF;)G+!jRi-cS z9ae^9ZWgJKpVfD+5GLCyADwzN-Z-jpbEHP=$c)KUa>pS0&m|CGC)r&VrMcVP`b3 zm)f-S;jK)wJ8aja!8lJpgrhTBL=<25Pssuoyd~CbNf+Q*$>g{wu4tCktA75TzsTxy zRJvdMM=zy9aZ721^4&|&Ln`<_`$xMqbu4PDjytufaIWX38tt7?a1r@PS21_Vhzsh402reFvX>&sh>*YGxdTeFEwwR zup3fd@%*jbq!h3ILkKK;@tYaHUM|0mn5^{W#MP92^r~w~W2~-f?TxsRa7AwT34I7v zKvA4yh#6Y!QQDXsCA~OAIVPwGZ{CtNLHhiiR`wJ4Y0!@!D{p)Hc7wJ_osxo;7_Jmt 
z@cikTE{VSo5;Q}kupB6>l1WeDBVVf8!YDUchpr`k@`JBs2fv8w&fk|3H(Xe6A#-R zke4tHxe`u!3rslBesVHI+CbDBQ)?BvylN}fA9;9?#Efsbb%{!iK?lI@_-y+}X?|;Pk5j{`(yNv$ z^UYjptDaPwoyUd`PIt-%%5q8I)OsJz)hSlNPA|i56#acRgIZxr-+Gi)WTDDAfVcSnA4C-B1ks|@jAFR zrimu)1hj_|1d|%`ztAfZp85UL*ZOz<_1XuBdaVT;z_-*{BFN?EZd3|(yPGfb z6RqF-9`orgveZf8M1Fa5N`No?%5k%~R43|)qT#1JZDkVynT|;bTAH^#%vD&5A;~d@h$B=U zoDZ=t{c*qeTi;Vy#c}GSKLHer+{prcL?= z^qq=%1u~}bFi`>ZhC8=EQ%%+D@OeR|imtNdRo3_uXYJ`to#z2UmQFACcX16OPk^bzT<53~Rwx+1H)D1S*naGGX&~^CQ#O#c<2(%{Mf*QzY;a zq`$f|h6NsH(GFhXUr`ljh(Q;IH1vMSJ+AI=-EH2|MNBDgkDJ&C@p?3!$>k1B;uyVS z6|EpQPrope(Y0wW-5t3yUkwS@8zc$EuY*Z3m*C^{m>W z=91bDl-k0!XR@aa;ygz~V`2>67a;0a9blq~m~Xw}`J11KA6|T802}Bt_4t^1bySbC zLrPQBsyQvcRyKGQa?m8;KFt*AcoPbGlx~@|m$56%C0L2r>&X^lL}wj>_k>%sfVJ zmJ=LB6d8YYQtDScfB8Id|EsSK;mP@CcN;VIL$FpQ5zS* zKUrc(AOpY(3BKGNRs%h2%589;Fi1UeK&mGgW*#KU4{SJrk`WektXZG9PR3Q25?K~q zMF!~IzZ%y+on~kk;yWY*gDMt5?H|)q*$F(!6b;NCPqi(l-iD45z2fdbkKo~C%S@c1m?9BN=n|l`N`-#=*CcGsP*1B#P2J1TFqlk)O zE-9voi4GQ{248CN~j#kH%p zC&c0!SZPv36u=g9sEl;E1O5g;w(JP@zR_gTZd2=Q3)w7T-pG#>M%AbM^*n$koIjAh zX$=$gsoSD*ilvFAbo_99e4mRC0qH_@q7QDNPqmO^>elJqQhG+&)bLUaYoK%`g{Lj7P|Kw-cSFTP7 z%uHRs>XLJNQ*u+jY}}5fbLK_Yp?s7U%QP^3PYU9CS#E?heBp!v0>&x#ewSBF?m{AH z3y0WEZSN~ERD@>ndJm{kqBm1dEULUdTC1n|+H&;v-D#A_rZzK&bLD%O`Rsdj@+C(m zil7`F&W7__jY0@-j`(=HIQhKnC;XTH5UpFh*lPp$eKP~^zLl$Khr^8&i>t5a2{dxc z>w6|MIfk|9366$fT3JMAw&a~LW z4co16KYNCkjX{1iYacP*+n202Hq3o2KV*E8V7wmqeCgnM*E}^Vj6v=9_^=j#E5r1P zXXQ`*BF=oxhZyjYOkcHr2h)XDh^1}ooUt%IBvA{wz7QOF@Vb8%=bBzm+XO-X#c`L` zyTr&zu|~}3ul;?1!+2gK?K-9T6%M_&BNjJSCWnSQZ5VpXkEqGm6&+K?gd1{pekjcf z549hTz3oaaOhN@J9H4hYS}%4;^(Po;BMsKePcL8b{OvWpo@EUBmw@(Y4~F(R4trf6 zJj1XXHcKnY>^*I60D$_Qx4kf_)Zr`C&4WKEPd;XzPD?e?Pd@l+mdS6Tt0vP=V5kUko$*G%q$0_BX({HV zg`wp8O5yFF)iVy6GeY@vY*HHywhPrikLYXQNO6UP&U#XXR{?b!fkk9uEq`VjImYx-xtA=fX?XW?{b{Ps zo@?vkOFwl)Xa)nk&>kr8&l8FE#IFYV)uZ!ku&z-5+&%}3S3-Msa^T&6}NK}CKNA-erz0{}Yntg5BKe-K^-6UTD#!2g2d{uftCPKZlMX8u zyi)b2`M7(^PUrvHKM*UlK1JK=qw!SgQX9tUF-)r$vo2%uf-zoCvQXO>=Z-iF6^R(t 
zF5Q7v7BA>ub){Cx74&?~o&E`z^fwl2cM#y7A>GxT*%}A8#RvKbIL|HoxXLE@`Lf2c zEFs@}PE~=yJ+CT#Ux?#jG_;mEVvo-n)YTtSD{>P5yNoBg9eqCiFaEm2ilt{ZgzKtN zHx;j#AEY~VlKw5@^=6^^J?%-^@ZE~|WB}fWfMv8R!?HqAH^E)o|2n(ZKL1moI!O&f zxO$(WgTKS}1yA+8lG47=7m|%7KD-MXs3o)6%b%`qE1l@2?35VARF9JdcPFhw2H>|> zUcNgq#n)&NQh$UBxSa>vb!S6B z&t}^SVbn~icE0-1M)ZwP{Jvonx3#=&RgZ0VSjH-qB=mgAL&{IYWppd`(I@`r&Wc>H z=Ve)1cK2Az=qDqO%fi(0q>t-cC_cWW#Hnbzc#av*ANhZsY6ruPUk_TAIY&!Jemv;EorW&1*&hO=WWfOP9@0av<0 zrFwMLs-bs}bK@QR&vSFLI~_c6vX6rqksaW8Lde&bNqw8@N<3|#H)kyA_N}EPqB@@r zJ$PyyI=q^=#L;mpga0}2FfUSSNevHOHKa_|ZnJ#%-RJRCVqGC7z(4!?MfK&?v&5s# ziY4CipCSSlFW&AdAc@`HU6{z2*p!Z$ zGBH*b(8d{-kg(w$q{Kt4;Bn+{FfMIKh2RZv1XtxY zl6U!62-YUj@|P0?#~jr)HNfe={xFU>40WUdsl|Oq?29@9+hFwR4ceSXH|#4+&(>#Q z8H8GEkY9b#Zy$=TtpT?AQv(N$QNUobtbQ$^b93%ST1bT}sD)uLNB`If|6oGgv{_hu z$b-&J{d3r_z^FHGbDH5Uokhp&J}9BK>P_LXFI7TIxpp4zH>xo z!ELbwbV2nH(28M`|0|~362OAGxs@pcn`d4(F!@)^9aLxtZZvPOaVLi`Ht~PW9gd+e zwgM0e0H7F)q=}av$Wl?a(f0MDG0`?Cs&nF7y+ad#yL-2I<1*j?)Wt(#;3>;vPX^-H z915zI7ic=73Lrbh_9CRwkO?tJG>eM}<~P*Pn|PnvR{Mol^!S5aMRJjS#lNOS@6HC} z1FVU3?V9uI3_UXyK9&8lkSW;wHW)MlbXCwEKt^H%CY!RSHdRk!$@Gq_%-MLxD8peimc$6X0(xNO4Po3NQ9tw|NpZ15*@PF}4K@5g@rF_G0U zYMZ$^ploM%j9$>Fhcc2zD3^l6K}e>ONUoSqbina}Dt~0lwE}ZA6L!!#rsl(t_z?b+ zpz&isFouS~005q`V2V2TVm3l=Gzx}8sXDNe%t!5J6s%3 z<5w(%VtV?&Ptz5Vv=a&m7EiErn=r<{6^us?>Bb{e82sE}|yLdG$Hr@LBC!n0$qO--~_}krEAndo5Fx zt$F=M93N;LUqp!Q+nK43Nh=8QUhV^2!ZHu^YV|U(XJ`c6IC)_%Jz7q3Oje}M zPkLfRw7|k_zP~H#nKKn}SE_vB(TYDzkHYeIaM{t(hL2k~p&DCbbqBd1@v%D9C@ZY?|?0927$ZV0YlQrAb)f6!k_-P>GT`xY>{ zj_3Ql2o_P&GMdafRH7y%J^GU(|BC|aOLRU7I_Hh5aNlTo5AyfABhh!bNrcp*`h5!| z2)ZaEE3(xG@D1BZe^_~!&c|7F-wO3z7%S9ci(0rpR_~iidxk`EtoE_3ESsh_T;Q`G zgdVAa(o~(c^xw?Q<92expW1}3>T`;ewdT0m>9D*}d#rSyc9>9?Uds1D65gYo81`b) zkA;~rak@Ni**xkIZXjifaOge{9%qkC&&BbN%CXToo?@n)6dq;9v61oh@*Q-ZUbO5H ziyW~LzvW)yKQeP_J|}3|a@JX*)&|e{L9=t4PKAxHC9to`V$>A0Y%>C5tq8$=)}-jv z*8St+(eu)}R*jHm1kajH#69;i=N0_~q=LmP6TNazE`-&`Zx!*d$w%w)sG#!_4}(ng zy_d$4PN8FK$^QX78P4caM*N%$e_m4{B%6Fa8VuW&$sKgrrdYH5KRw+{lMY2r@zAX56;xW 
z>=I3V8nTp2zVYGeky!JQZSksYJMIAR9kaJVsOzR>z@y|JLpk!%BM=XS_V%x6udvR|%;yJsdNUPz zu-h{qlm!!xwmn1ExXg1;OOn3Q%=D-h*pA6(8 zz1avb;~6KnO>&SfPDhXAZ@ojhd{Ez9e)ekrYo87CU5O%tREqt)p-Q7RtG9vkP9C{t zaieK1CEcpt?GbK*c* zPJJK}ZQ0d(1>WYlwb$%*%4?xY1JfChX*1jq6Am2(EHWv9KYtc^$pgO1L+EJ)2If zYt8ZIZQHK+(X5k_#;)6FZ5HhYhGZr*YFJF}l;KX6#y&9UYi77*9TSTK^VTXD?!ls9 zSA;eLSxGkD7QK8}vvTfVFrXp;&7`53rlaHheEuv&dWLb?$O&ps=a*q33&wG}KVZd* zL#+cv*QRTc9lU>wSdo8!`)~Rn0RXGD3?PAiHaxNcB1jmaqHDyd03}ooOYI>5wj$*8 zWRn3p1QMUgOPM{FeekM^q|F44xoo~Lp%h;caI>p&MX`e)IjJs3e)+)f?O2LvsGX9{ zW=ZzWq2k90VZ`GGLoCXLHc4?cRs=#O3r9!@Pv{xCLB?E5WMYKeW~=~cPsBZmCr-BI zy1*zg%V1(Au}#g>X(bU?dCip9g1tzx`1=CxYbpR=R|!KWXbhRoT0FFuPsEmt4XFp5 zkdfUS$p*Q_vN-f)zuV;XGn}SBrG$Y-?Yzn@xb$pszWTa|!j;|M`V?h;nvJxp$My^dVbnkvIZoN}voTGi^ z0GKZ+?$H)VK43)MbfcCpVf7#1vcK=c<6){sExLW-9;BC(G-j+5nAJsC>iFP1Hq6LL zX~T`u=CscN>W~0>=xwS8w@slTu7ny~?L=5Sa#ouaF3h@;CuiX-)K7}bp?>iS!qlW6 zHdWvR$liSlcOHnrin~P*qzJMr<67Z4Z)ohyA7x1hL16fF4P;2=mQ)P?TgO2GfYg{g zzD$!>P8p6(V2a|#ZDY9*?pc2pq~)6J@a#v#`>o2ky~R z+|YM>dWO`E?Yn*A2$Lmo!^P*Usge`%)>Hz+Kf0}TW?YB4f|#Ql4Rtxh_3qIIyb;Q4 zovXi}=%7>ncF>`Xb?EJbHVmIGa##`DCXH^RIrMK^i_WC7?Tl0K;(OXD<1%XY;O*{> zmp~=eaPC<1Vean?u0yEk#l1>LO}#63(xUpd6YxfoIPF*Jj|HRtEsGsPeiyQJnvcBa z_Ih39FC)DxrULBeb{tKbd1pX$x2#gD<}MGfGT-v)yc@Fq+W#Fk63xxd5<8kB^KHU= zCLp>;i5q(@0%=RFn$FY8PVH>0eysAXgXD`%bgk!9ji&xyLp(V#e)k}mENg@!YBeV) zcC!UxoU!ItsbE(-aB^Swm}m3R<9T^ibcst^bTzx6JD?QvM-SYuGL)4Rwk20>&7@Hs zH#k%@?11vDqKK(-09NjtYU-htKWoD4XiZ7JW?m%uwh08m5F1vFmh<+q^_2PZ&!o{d z$zSwA0sy8!=scWOPbMGdP1A;FNgPtnwNs>}U0;r=-$o5?-i7k^rLGS73tEa%%sDn?oaW+Q??tqFt*kzrJ(Us zeoL&xfArVOG?x^YU(>B$iMi-L{uCKL!f~08u2MQcj8t?+7tiMLqf}+D=Fjc>N{CCy z7=PF1Ud$BA_BY6WG!cEA5FP$l?^m46%1q)hFL9Whs?VI~(U*cuU@cxzc(@LL9ZSU! 
zN%*8C#X2>lTIF0*UNHxUT(LExDuFdFd@eEU!AGTje z9^Le`Ut1k(Z63^Fc;|rf%+G4sqPq#R7U%^5w6O*h44(*_U!#blYYuO`<0{`f9jc$@ zT1(HNPRO;zrHmxS~MniD0>epn4S1FVer$pdyyr`!w^6KAn2oU=2mq-15o|1-)p$(hUcA!(z<{imYDhtbfUhbO@tQIP&u@SRL&HhR zPs3*XRp39ojHq%60*OK39+Eo!Q4zP|kwg^CLh8y88pPNH|_)~Zx z6`hBY`bzAYTizpmPN7uN53Gqu*mPH^s3(6r66$EzXR!MG$Qa(bP)%T~Ac#ZOB*X@M zo7ivFylk1E~9@cSo@P@t0ZsQsU_m7_AN-o-0HaF-yNrct6T3C@FG+E6@)IE0^HJ>9Z5f_m`IRg ziAc)Ef`&4YcIwRT#+Osi^K{1WgxgE3^vARL;&eg35ON4S1_|hp%#|Rxu>kluJn!s7 zJs@oFhc{D7J7$JM!S*7J17RTmmtn<#;yN&Cl*}(SR<%ri-(ejfR(Us*p|WD(*GL=k z?=$nS%lPyi3WeBk)rVQoC6GWWB|JL_Nzodmjtzks%pKl$T#t!$`A7FC|B;8%j>Jwb zqkY{{)7#c5SrvD=Z7RUyPwU$BTfH~^k)$hV_ z&BN!~g* zAfO72><$qy>-b()JDDW4MNbut3jx%)VhA=h*7gQI!12UxcT`^lt8Y1T!;vn?fLO68 zUp{>)%sj_VB$l2TF92cdRFISXeh-~9fUYmWBS$Q8IJ$?P!{2&m=1n>xE`C|+kn)BI z>mx}_jBdr5xVUO?JEeEs2z6_ET?wA)*!K;dxDDg|2O2}cHF{*tOU+zKvZYo6r?w*b zPo>;49=TlB>~(ugEsTibefyeMX;@1aWC}jJP!$uo%91(Tx?AdUFdBrgye{$i^6CYD z228D?C|4in^FaqgThdK)66&T7wSVA4jw-Hwi;yQVE9*FWzNTF9X=GHt(uh9PVMFJ1 z<`Xvnh_oOmP9SYsbZ%;rO(x*k$L|C_t+EKt{_`&30YDQ%xROCm&%87$+-qtP}Mh+Em()= zm{qq5@;(G<95aV!dS$0YzpE|Lkn^VRww#@7p!&>4Vlo&*J2^U)sU(Y_WrQ!50ZAek zP>%Igh3zi#Z{GEces6aCG;D4+yX!YdwMge%B<7<^G~VjddO%k#P7wCyg!(kKd9J+(x}a!{hXcggD$q;yr&~%;#L~&2edw4mNqnvZ{e>$yz6t)jjr5h z%^=WgcNTy3N=FWaE-A0CJ&?q{!;y?VZ@c@P5Bw#edCa%D@ zKDNmOhv(g$0jCM9wejg?U4QfAeDpRF4 zR;(GQ%@9r()1ko3{R{nU8~QAaPDA|9+^Gz_JTJdv*Uw|gkf{jN70aqJq?zP{VPVlY zIofjI_f+9Hpx{?@6@sL=GO}8X%@SFG>78F?xZlAAYb28|M6e0yt|Dfn|8hZVTw4|D)q&pYn1=S5N6YRp zvo8w&H(xYs^)cL_rVKXyP^f9gXPikK>yuob6?Tm28!D&Vl(amVi%tN%A1ROM6E0mt zFW!rcp5=OT#v0IkVbJO-^AzH+E+psDF$$9n!Q<-g>SpL%Bc@YqYU18}ysy)C?+yRZ zM{hx;2qr-W1ABWhuJBIn4@^2nABC|2_PE|YfVl6#+HftM;AJW94xz!ozBcOU@`cYZG+y7fbs{wHSh#Lgk1`+bJNA~F_>lG)g5}c z4HpEnS236XY6$nwivkBw?VU&tFh2&f1Wc!5fLCGO{_Sg0TyG`2l>yiUVy6MaL+~(| z6|8l32pqHJ|Fh=-Ks*2t>VCs=Nf43?Q^x3~dA>lj4`lQh@(;!~@w)kXF>vG6IJz25 z^NuNyqoOt|p@bF=2Sh^Q%#p#A3+?{G0$)CVxEC3pAo{xF*v~d89hNCPep6gusG(rN 
z$w16o(NuCORyOPYMd8OrG&Fibgur)}em(TN*dT?U%=~*`l)KMQbD{=)n-=5OuIsKt{b?;@Z0iE^clp}fFJ9Wwa){L3M^GyD?#AF~ za|;8YRQPji*g#>RHa0ERiQ+ibvq;zzhA?45DlhD(f%+ndmqz!=RP`B!P+o$ie~ykcyYX#bR;C z!h*BOX|Dq1+o`GIiE%oiqF1j&pxI#mM7FLKh*Ph1Q!Y2aBocrHZi%;W1;#+RG;kmU zk0U9#=|q@p0rw*?zNV)8YQ)m~W9abSf7Sw8dJ8+5fCPu!2RE#g3SGsT z!8yBR!bquZaYu9d_XDs=jHCqwuW?7PuspPidGN6TDIiKBHo|O>oGdLrWP&|X()m_K zTNL()0k%?|RjLO>(4=!D=AutwBrTv#nojQZhNRCO#IIhq57Xh&C;M(P)lvu7@`x85 zyoL(gZ$`e|TW8Yn)Jki{ac04(@ZRo%DBu-`Tb)Vqa8m4=rXic)&fF&Yf zu?o;wzr(LA5jbh;S$W$a$MS7Amu-XMdol_^syCz@^sGinp$+eR<7|TUu&QjYE9ru4 z)f*quh~Bc`BYc@NBWg|W-(VoMo7qqindiG<-H|?O=Cg77HN9{7v+sQje~od{UF!3X z?cx2i?rn-z9xwW;`8g~X1kYlJ5N_3(g4;7sEM9Rj;9P2(_M++)gWPs(=E&tJpUsw9 zekSqZ8l#Ky8P9kcb?x}I|0U}B3ZFJ;ae-`#-_F&ja}yaY)PwfcqN02 zZoM&M;Mvl%nsU+w>4{x6`<{{f~)`~m;~ literal 0 HcmV?d00001 diff --git a/noise_reduction/data/test_data.wav b/noise_reduction/data/test_data.wav new file mode 100644 index 0000000000000000000000000000000000000000..a3a993c20c707b19bc4a38596e555ccb544390ea GIT binary patch literal 179672 zcmeFacd%dQdEd$PA3NDecC#fr*%YrF$BHCNvI=FYv3F8z00@922ojCx9Ty!JI2XP5 z-YdZZ5+W&9krYW)OR_AvDRwqava_>^XLe?1w*0@J&-=Zv&hG*sY0I(Wot%5-{^~jB zefsmf&-1+RZ}?re-~Q9Le&wpsH;uY&@~pmJ`){wh>ZEdnWj3FZO%3Oxz2AMY=$@EIe$}gusPX0*&OFSE4lMLMr!zU zZ*!zdJUg!=FC&g>E;CY-oYo!fmAs9x#Mo`euD3J zHru%8X1?Fbr$&T4?!B;?3!F2X`2~~tycdHBz&HjR9%-iW`Eov;-i&T; zZ+@%!D94wZCxG=r^SjNDn(s8vHAlGf+-71kshQO*>MqXddN0lfiBcpFP{W z(Y)R~57yfldrmW@nZYri*BgL$E01EiworE;R67m^N5Eh;cbd)pXF`j0-1i{#@!LII zArAe`Dy}}pr+W*wo8Zb^&Rq_cd!fxP&TGJG4x>#0@=4${A2_{s0nk0v+~3^B|8%a~ z4kmqIeM56y^R?!l=5DC4zIm_t^XC6){@><*ZQg~F!}#vD=6b&EYxXr?hYkmTc@enJ z;Bh;6zTUjnyxcqkR&(L_Bi!SOLYd8Avjeyfa>Y*0++FyynsHYZuC9Q4OTlLau&n~u zZSc!0*79yOSZwBh17pd_9pJXTyxYaAJ)F0NvlcVrbnu(btJNIqxYK;zJ&r_<R!ihcHWgp+H;S8}{178jT_bxsg z0Ar(9yUh3Tq(7k=U+s^;aa_oR=yLqn-(*CUE9zG6mtm6Lji{36nLY8vx>B#JTaPKx?omF%~ zSjM84*EXL4)0>g%%~1SLk=q|Y_X|*M3h+hhG4LCPbWP+tWqk&(?`}TP{CCa& z)Lg?kucMED&he+sQ(!d#?%#o&Pl9&a7-1vSn#En7fOZ>!;vD#$#};m5jJaSg{u8)! 
zA2|580hqKS8==DCavb8SU7W9NQOEZ4ejohW&bzH(b{M>lL7`I|O8#L!)1oSu>!Du5 zNb?v$@3@fHdZ(3)HxWD@1k%ySPk$M+pZCAs{7Q3G^FK6S0jjN7$v^#8IObK6kaXi`crtCmyqh0p#5e>dk{VS8hS7a92Wz{O1Rnw=T?E+A?WZ7 z5^)~b^`x7M)zp7(f;(D9ZR1YPJB_vZI-2ncdVXRMZ@IROV?F1tf^JLUi=0^s973i) zl9%em6C85zp62t|ng6l*d(A(=KThGkf7bk;;PxI?M33hc>X3522zrgf8b6MtZib@I zV`aVvpU*PlO!V>|aGO-5Si88ajIED54n{A7{~2EEZx#Sg!|UC^eW2LJ-Hf!o$m2fn z-orQB3zjFi=4_GKeVnll80G=@LyYwpBh6xz-SBlk=g#M|`;oD`_`kXNZLH=$Z$5&y z-HKN1Z+_DJ-_8Ht{2sL3$zA4gEQ0n+;CvsLsmU7)t=IFsv%s_$4D}DwIA=N1u^FiJ zReJRq$cWx#3GyTshk@`MI6e!shk#rgxEP2gbIj)1*k&^veXe=4`7XydIrkvfDl?P8 zVU=5a6Q@8*5vWEH%ahCECF0(rcxSc%iz?Rhx%jpAQ+@XZ{s zoywu6EQJO}1X|7mXzW@3_wjCR(JJM04dX5WmZ@;+Ys4hi!iBrBi8nPLZ~jq{vxU&& zTTtR#JRgKwa|T=8@O#*`3&3;?N^fV(wctMs|92C1@@}qK z%3Zc_UvWRhI4^N$B5k`kw&) zJK?}`Mm`SbHbaqD(eOVaPP)K&^MKcQVH{i-i?tXIwl_4lV4>y$z0t#~;P4(8d=IUB z5s8l!_jBE2*#6N(zeYj}7<&g}9EA&~ik7SA=XrgUL)qK|^!mwV{0r|wByl!ZP6E0| zpuB!zD&t?@e4+UwR2dF6A3!p0#K(+-*XsV$(Erbxf785$>@4HzEqtO?Ux!4GY`%o- z-Gu&32200V#TZ+G_^sv-ntut-Z*%rpsIa~8+(>n1k)?^?>g-HPuHxGL@K!5u6e&DY zBQ|7Mj|0CnAB%K4{szue7MxH0TeRROaCa}d zc%1)L+-ol1DtGrn`Rl-0&!C0d1Po`m>Lrd>iU<29@b3b~ z()bf7`1|1C%;yA<&qsEgU0sWeeHxqlC1hb5FfHVpwOn}&XkTOg_&d$Fq5R9p#4$AT z=^{aTmt|Zp^f`<)J(hFlt;H6d;cVw6dyC|rMcdB6nPrSQ8cN>-%nuZ={Ew0KU&W8z zf}9vhEJ2sF;OCInpEQ301TSGdPjauPfc!LfpF+kN#MdVhg?I0sb@k{xhPG6AVI>YIZWuPOG$GmP|4GU914*aB8Y6v&*xmeJd(AsQ+x~Oko!}ofu zgN5c>_*8ECsZqnt&A%W5`V29{_2}d+Ks=NC=`T9Kwo3*H=n~eR}Z?4>x|6h zg&xH?NE^DYz^tFt4=%(;-H3etgXX^n_L~Z|9NF3Ia;`lC#6QBK{62ivt|_6;6rB-{ zDLQ#GSKWy|i~t9t-4RI293a>Y)W+ARu_(`Q7vX*v$iE2|@@+p*?FOQLEaNQhwwUp> z*vZzI88DJEHc-MNP0KiA3EVMudxFuwg!lO;%|ApVZ-xF7%g7s{sFD5ABBAdB`CEAT zCyS4-CTGLOi&HqpbC=uU+pRp_U$QGoO7bP!fX|uR^T>*Gg%^SPS?+im zoDKk=8J^WZY;MBn(CmcS72{PYy_>VNi)T5`aE(#uVyHb1>ORV|KJ5YQ_s2_I{2S2z zt7xb>H*KkOljkpB3!XzBfC)+tAv5A?O(cHyo)ojXR*ax!$=6%+3o^hJ_za!jD zSu;vG%BQ=4&&b``RsNmv=qJ}e<86Guw?OXPZ6EJ^@4P@6Tf^PW0*z(t;l(ZvM*>Db zJMH_u@c%x(nF^je8STkpCybH2kCJE>%{v(rE#pozu~bv|cSbWGzNk%e;jXq-c#J!o 
zYqj$(_0ZXHwTXH*p{BgELI?VUE3o@#ZqBQM8gU1qrzzP;q0%k@Vat&`k<5<$C`hTx4}oP zJ&MLSFEFAp8q-b*`TD^_9P%z4@^L+%n6b$>YL{6A=Z+Fov@?^CPFhMhH%W&ADPf#3xyJHOF{%dg1W|9w&46>0oE3dJ%FhPUd)> z2b)V06SMo`Q|Ft`6_w*;@;4XT8jh?fk|s5(mU)(%AI~@&+KZKSS^@2Vxh8WkbD-Y1V$JRXi)-;?U%?LF1U71n zer$lNBI(*LW!+hZ@%ocsB4yD~>IOnbE5;TD0Y5O!Z#-Wz-WXy9v*DL*dRHNZQz<{c6ot?za#6C9a5s zMB>$7vo}KE9MG9;bkbPc%#%1}9;qs3oq zpXGca4I?1Ex6z03r+GPfl-SxxBZpaND_+iZfVF zKFlkn5;lC$q8ZuV4rb1a#AP=6Ym{oMis*2M49ZIiE9Kv)suf znh`MipUbGubgWK=ve9oTU`#8ggjvh2?U(P~Tig!txq2lhH<$C}iuTMLSRZgYmwy7N z9~ra*W|HNMe#X4{bjH?asyiO>Hp=p@La)R|hZ$8}I#pn>_P7mR>YcR-sj>y1#0QHD zmzMCY|BDN+qMb(O>W|Ut@p2@~tT&Tt#vE-dpk2_;HAQp!7)g&guGoPG!N5r70W83M z$k{{qA?I91%tqB(6#bgs!ow&rvKRlS1{rthd7K4rK!YODQY{)Z9s0!9MVB(GZleR~ zWvp}%$hDoZ-Y2m$R!kxR&g<14EuOYbY*s-LX_pv4i5pkq7-Mwf0B7OOLC1paY-qWx z@F$Y3JZR~o!}3Y2gy1Ou){@L`1tX=wd8YA=nH6!d?iA~79Ji2rYq{fFV}+GyYtn}c z$Mx4n8?l~74BDJ{BxjC^2CQHj**H_tN0`AGQzA~I&-;jP9xV8nubl@xM!v}(reb1+ zQ_QrWiMOJ&>W>kloD4R`YjVh#(41v(nZqYqg+vK@#Z*nh_i9DWMCrBl+xlsv?uw7L zK;B5PjA7iV9i55>S=W9Dd>()^_hJ|1kMpO;z|~l5CD>TG)fT9cYHWQifA$x?=x3v^ zQf>g+C+?6l+JfnfRwHb&(I&)dZ$`t_O7%TbtSv}pO%7ShG71u#olr6nof2K+Td6f2 zuRjS~5;>1#W?)8bSa}u~xicGVk`-0X;?cC4^2N~)FoyaY>*l+f{nhT+TcAgcoFxa6 z4uhr8bvbv|s)>y}(GqCY>>kh}sEd}Qk2x^qFqj=Lq9ZH$rI`KjBY75XMt1#jd-e_5B zbf8~T6SZ#2*r{R*wAo6smd06eytTD|y|vUb-qlW>ykLJ8m8(pK0#D&_jYF!*D z(VUTovmRqZ>5>YWnTm11Yac{(<^izKsyzTsWBD#RWA9748nj>0kwk9wQLK~_DHkiM zl{UVOuU2cO4eE@2Fxq~xQ2S#o)j{bbHnER-GNmmP%bZ#NDJE;-X=;73Va90Y)wE&G zobTlrhJ0B`cMhK!r!u{{NPwg0r@g!Uc5H2ekF{2=r*&S0%fDeoH# z@6;9TqIZby1rIH~Ql(}(cQSfTCRYsRB2&&~9vie<526vy|3@NSkD)b__;w~|S(9=+ zWvbd=xi2@wz>Jj7YiCG~%Ei^ksZ7vrGvI1cj^>VjH563H0_Mian5 zzrF~%TQhY=Z}s}=GLO`17-_{**zu&4Ov5_M%}2Ro(q0skv4T^1pKq>BJ#F^u{VJ8Ue2~};R4y=i~L^zW4jErj;n#m z?29!aBf0oKrAewMlD8Ae%!e_7RpkkcVzuO})QLVvmF|<&89zzwWEglXV!Y?bEdBw< zPfO>>>x_3CtyvAe>d0e-Q)3F8%A6f+bC8ww$i5YqXE>Y_p9X?mP||sv6*4(uO~*J| z>$3)Ju?{N_ULzOye8KH}sWLppr{{rW6W7i}w(lWJas$wQfvo?hpzv5ew^D8{&a8sj 
zV7unbSsD|}VLWZ!sF^1lWgMp2{qhx9I7Mb-9WR4T3ozJ<_VTHEXO416==FR#xdGtV)meMI*cD>D)t zCn7UTfzABVdCvO=W4;GI??4gr`>$}HgTSMde1&@1|J?l79Dko~onHgTag4STOY&aP zY`x4yUR`YdHFe@|akUlJC#cbX5;}a6{KW%|w;sM&0>;GO`hWyM==K=NMZ?01oR zdqZ9YM!Wy)!hIaOV`re5uDjv$*BGfEeD(s%D_rv@rKWZPsqF{uQ9yq;|Bv(9x!A)T z>h*kd@DP->%kTH#)f?b;6b@K%p8;HU9gOFyhmof-Tqk6EIsZFw@pm}?J6v&$5$13& zbAKZkO5L9z?fRYwO~CP4%`L(KaT{9 zjkT)NQ080E$led7cQ^On$v0b&gISDmJ5r*4Su-Dwb`K!;&I`=3p5uJyqp4up`8%2W z-d$=+cLC3AWM(^DejTp;2>H+t9_KpymL-!}}5Z{=*WG82LD05bGSv9jN!CiYFPdx|TJV6E%S0-~|d zFk08!r^g=)BkN zJM%=Nfc_R>zmDrhz`q8~eF=VjANhG5IF7+3b6Cz9YS_v8J#%qH3v2kimrn% zZvgQR%kgdQe;$af0@}qkk-LutvKib#pR^HrzfC0ZGcbIcdmjhy*_<;H4vvI=*CXqC z6e~K9aQ!;y_73-d84WoPZ;$XvALl!Ru&QQda5h{s8a&Q9?_vwyM6>rI9rEP~xH1gP zhQYzl6zh3AI1E66*MQ>B>16p4e7D2an65$po#W{f?cK3csgG|DGxkfs@D?y%I45!0Mqqdh`Me%|)Jj~BZhoSy3efkhL@LwMsl9u?NVu}P zt3YF);6!NfFkI?`ZU@k;myqbU;fzq&ot#>J?4219do1ncT3GbnUP{*lXhk1l^cxue z?johv7ahJ1c-4UCxa0o;E&eqW`8xEr`+5W1GUH(;%BsQb++hZH-3cw80oIFLWz=b2 z*tt|P#jYxFrN$}l@+#E&W|5)`NYD|mwHkaQeY$p-`~!U1myyzY;oVNYwFCP*_?AC{ zgVuwOGFJMRK97&ItMWJTm{)^=QL)|F&P~+zr}5=Nxy~BM?o<=>5qWbMde&3!z{l&$<7DtQ8(kTpKs0r#cM@2?fNk`(i^UVwl|@tA3-8MOHcYo3p7)K z^f=%D1b!HK|9k9*_Hr})ox{jCqthdb1v6t~kK=?Qv+IHP02pg&&SOiG0dQV!r9s`Z z?|%t=FvFdBn^2}J{|4?fg8Q%HE=II3LXGEu`!N4UkjNe2G9PVv5DwbW^f-4~3k8gK z4+5?8rWe5RTO7|J1r1ctr&_(6!D!}@%z2noc0OXqk}LnQk3BI9QFb|RJ2fY@HX&*SOs4jly_2jG*PKi}Z&9{__d5j^NqI8a@Bsmn~iOHxKN=V-1Y&( zS!i^E*Ji-2x9DBO>;^RMUqFlN;oB-K-4D=(KcTbePl4hh*O-Skqm!8RYU1yYpg;ed zInx8&X&=&H503rsuR@R4x#k?>>;u=i$fPqSGimmNJEJyNyOnX41B0CaW3dtULyPM; z(!05lF<(F<|B&NfA&;+c?`PomF|L`z-EK#6{uy(QUj>5u;qp!}dkg$uL3S>}lgv59 zb_Xy|FW$mc3F@T1G**dRPqCQM?M!+ED%^~We;!KQ#Bm>#*~Ez7AvX9kY~**L!^>#7 z{m8CR>*Ja`;MzX|hK~XBSfJVq-ao+#ej6&B0sE62=lFCBnz0=y%!^D$f0matLW!7z%Yu>&Oz5dMjF10cDxF1uOaoqp@p3Y z_Mbu;KUQL}$G~zElsL{AFJl`o0LAOYsu(fvf(k3Jgsw_)Hn#{!ltgoB=9yM;m{Fez z408C_fZ=LrV#K(oc$6Pr1H z|n0XwV&zTa((IpV0JBX`BiMzeNfLDgR`^)MIzooI^F<| zCz1SZT=zJzTGe-k{5Z79N>XPWYJh9sp5;2TN%r-+cEB~M=Go1~n?1`48@0`Rq|t_} 
zq9)*ngu$$Uk=k%Zc$~ZIp|>G-#ssfH6Qk{TJnKJ?q2o1~rR>1ApNWVvDZxzK#hx1xSQDX7oz$9=I-hkoPrwOfA*-IO%$mV7o_D2?Jpft(D-CwmNOh|&X0xP>S$q#` zW3>k@6=5sf&L0Dd*?d2jTV}M=wUDcyo>R#%dhH-C{Htod-w7}8pR zk)*x2S}W;gwLbL@bLRFT36oG;X%KEbmS_1s_4wLj;I|%rnss-=Wv8LMQ6KGOv6n@w ztu09BwHZ&TXK$DloOHT)rrt%KhIT^Yp@b;8ZC5~K%x=L{K`N~RU*1W(bXx{PsV!gQ z{ZbEb^jHabn0_z0YCpU4Uu&X1+P5NSwXgQm2C{TsSSymp(XMpmICgkiYc%0$;0?7r zGe_tj{>lYmto=a3mmYvrGAl%pJ+tK=>Z0(eo7yM!!-|6P8yO3%kv!=b+J&#ak9?*t zN=*nQh0klmQ9F{Jj&$Y1k7a`S~S!?yEx)ENh zBhp7$Ldm=e+)AI2JMUH7Lqj{B#LqiL9#j2MFVjV2byuCRC&N0aP+D>JQ9YO1o>O6z zlK!U$SDTxz!rINMZ4L~v{Bk?GR3%i{#V9b^8IwbqQ66It)f@X~?KFQa$1)!gskO&jMpAi&{^udmep`%m$)fPWxTn$8w9apT*Xy z>vkCXCxx;iHnrNU2=dX$F0x_$R?Z7+y8D&V=zAhh{f?Du$H^GdBc3d9I)32yFUa((XU>u7`H2z~0~v8$oE{EiP0%1CP9jn@u_u>)S6dq{`SLYUK09@yjk zq;}{R$i+u^Jx31YQ*p0{XH;GDGcnQ!RDE_l$B(@X_k~vp37tdf*pXmhydPbNOnVPu zRj$)B5D9cNKljMkK8iuh8=-Gy!7KbW*m!0j4ZZy?JgU)2FsT|T3&57O&)zDn42C6I7Iz+RhZ}qC)>*A4eRqr*IUZ-})Lpx49LolmjRS2tg z0|+z;X7x&+TioiEu?@l)dW4p-`|YFm3U0l&AlLXw@1N&h>F7PD-mASs?_In)V_$k7 z@6iiE?`ywnuWVtzGS0D(b^P8dtKVqpat2Q6 z9IxYOp4}di*IvwesZ^oNXf1wq#Ee-f>S*nq@~LO|m&ab(Ui!MdSNp2oIaik({o-M$ zg#6;;U%a^V{_Xn~``%Z6-~MIf7MtL9C3r4{&G)_MwXb`*kxw13!sRRB36H7{welh7 zde4Y%<&*FuF_sp({+TT?O0j3i?4!Hy*z4{b%Sc>5wFK>V6mz3lk*Fp0aQ#I3rE5f! zSXpTlOZ|g$27RLuR-FxKJ+-`%`j$TF-_(EExvE!*PjjV#zEtVaS{uVUqOY{sbq-V4 zqqG^6d0=B#y%|e)jk7P-E&{8U_5u6OZbAEe&2AXUCr+>@(|lBmYy6|L24Rsa;f8Q! 
z-elihavbqfbvBpipvL%K6$y~vkt*>@q?HPt^|{GF?TU6E?O=O=9F}`rm_v2v7yC?I zAMctBGi!EHxXRPr;*8Ji_jP5ayTO<>%l<|3!~BpnunmlFN3A?Iijzx1=Q`hH;mm#| zj&PP#@kke>^97|Qact$HJW9u?{mn)^?vwWbIl(*0#M(7D3Ao%bChHjO^m~#_qpN?l zN=k_}1#_kkmvgIaQ=ZHYCIjd$JVr>#e7JHjGc)DFT&(v=Y$CVK(8+Ck!D_}VagkI{ zlp#;5)l@!}PAlvSh}Z1Bn1&qPP7d5!v{glS9GVRk<-rSNkF7IgEwquH(!G!jw7s~G zkg=*60e6dOR|wfz?al@s#$ee&(H-KHZ1vKeqRh_9YcaRx?i!}_c8iPBB&5b3;fI)* zwGjj5AhT9i0ZajMDGO!$xZH%^Hh^yEdw= zuHe7G`K}C3rOzHu_g2xa+f}IqX}hy4=I%ip&F~AqHr728TqhD)k>*CjLM+U#?R2*= zxtZ0m*43?}x<=N0lfH^HJr3m0ApaME-aa|IPVJ($YQ6&6+aZ%(VAtD(7lu-TcIApgUsP zNo_y>J=73CN2Z9C`>$mD>}zLb$(@U=3k%a4Mz%Y} z?gIN8tse`IT5A|*FZ&gT(T40?7A6)b%T+IHMZ=t z<=U-k53SNE$Ii@^VKrIX8Hy&pNHml=zf~Ul$lXg&PGqiYkBz$_xf)6fX+8Z$==~Y0 z>AyjJ`g7FhMx)p6^P-=y1MzvJ;SD_4)6mISLcNju>wrS*Yeh=fk3#LI(DWBeUGV~s z?jPuycQYQzBnWTvW9n`^an`~mmN45=t9IFCF&U#og?i)6%B}p`qv#%cxAACQ@tV>{ zr}WsNF&{lr610BjfaN8+WnKdwd-T>r1?`9ukyzGfb^u8@%^B`&`V9^rUo1GB!K!4> zMkCAQrL{t?G)^VSm8()bl{I5i|E&!+jHQ=KRaGnRt{X;fb#0n!3Wie^wXfpS)aQSV zzNg{9XaAyp(Uqh2tXh+Pmx{Diar*&XCuXKz&!-PG?pOz$*#pqH!#z~&a1e_#$ekT6 z3-KBD16oN`ey!lm1rNJUv|L7wT2WVB85Jo*>%k)Vc=g}wtO{9skJoZVr~A7+h}CiD z3cITxM7s1(@p;16j|{l_+KQ+JjmbY_M&d2-iZrPi6Mn3*KMalBYk*V$#osOwV2!`(yd*|qQ0s<3r=y_5dQl}GlK zYU$kzO3#_RZ)RDkP0M}x?qR+z>lm|6KJl_Kb0oZ;mnvAD4;Zs1=F4NMQ$AWjNqi~> zHL_a**JF{hTb~%Xhl{JzjD_5L%>CnB2`L7yi_>ezc4&zm&)HO<%{)k}?>r##+$+_A zBQ+`KY&C~fpS`2Ho}<=N>KLiWr^+TLjV}^U>&uKRTxsX35?89(6`^k!Qx5wh>>SL> zr&LO{2U;oZh->ZJ`$q%ifp~aWajX@z(6gmxcvy3Jbv6^6Dvi8*AP;{lPprp?h0>%w zNj#C&v}1wZj_#~zcoZ(!|OYj*e z7{!wtQHY_u_o)2S%hy=jJWE!Z*S-*A9e2Kx3(lC`2PeBMiGf^jcB?)MtMglrx(7*C z0E9n)_%=CQ>4r;YN%K1&Y(3K_+S2fEKclT4L zjUud*&*A)KjHRUN{oKvVm{R+1A9>cci-kB;PRIwXs{TR9Yt=k>R8G{`Kb0$I;zqoQ zD4Y{$Ns@s{HZ1uHXFp<_8NXUleTDwp8IP;_M&TulLG6yo&NuFhZU!LHoGSxe`|2#$ zUijm{pZFkEiewk+j8`tU9xS$4?K3TZ% z($x;Qs=+m^?xtwZq`ONU0Dk9b!fCE09jxX}W4l~&8QWou>%1**7-~-pG$}VD&|2)n8<8Z9m^)?z0Q{?Iv_3)Qd!i_E*0Gen)`G^_+3bOJ_fGecJ;#0gJ&pZ^9|J(;S{f~kSyQ3Qe%Z#lW=EdlI%7=t 
z;LJ`byMV~5r2BiP+evLOr>6yQw&r|8Y}|n%yJw6hezzZR3}?BkvK_D&nJXQK)~ATy zUxC){_puRZ+}A;$>gq!GFpS>miB@s8z5XxJ2mGT#2Qv=trE8z!W@v1ux$~ZTh)#`7 zCjgWBZ|~_Lu6>%r^$M=1eYViS-2(N9M!t5TIPVn)v#(Qt)*OoS2qSR!y4JrqqqS$( zoPg^n+)a5J_i@dreV@*f-FeykhZ5p%0N8Up0=|A0&Sw8DcR(>y>+aF+C-^2E=6}ZC z$iI*FnNu(WyS?~m_auA-N&gb_pWC_mQJ`GTc=nDygGBf%32&13%4(^n;G4f2VD`#6 zh&wL0Qq(oA`U4|y*NEuv?7G%x4*=l|@Oc1T`t8y~{oBBr9ZK&+J{J@ypJt5j(kpv` zD>s9cxpMOy&eH5%H$QO~9Gk;;$@XP$#^->~^{wx6t!sq@U2Mq;ju&hFOkb@C{f83{NaQo7sy zKwQJ*erLy_q<%(gU1h<{b@p%UgFaHvu2|RSx`M*_va9%?LF=~yiMtUwh7#dw8aub` zvUe7}9!|PKBeuk~;%27zaX;;Ac7rh&uQk;x7`d(IRc5aF-2#1RbX)7^E~*bQ3%j4! zc0p%FgS<5(tEKgq3-qb(R=xoI{iPjOp(=ax;g7$=qwjThaQBPVK3oK=Cy@#3I_9yo za@jS+yq7lH^)v1|<9yXUjWW;4P8a&*#6o(b^wQUusm9M%eYBotRa|@PJ~Y|$+xg_Z z*uLvnQSvcn`FBA_cm012nRyG1coA6T`gR~Td*(WSbHU~YZbB_YD^QJ53Yo~nzH zZlig9mN|e#PR4ZZbYdhin=@VGdOa4%HR%2hkh2|E%V=rccgt0chrsIltWxKf3+Hj>IBw<^Amh0u~)WwZ~+2rIwa^BOX(wKgg- zf0cbbth|_mS`AF&xcePYT3WmFq<%BfH^3|N#pb$=kIeOqLtgH{V!M*e?CNL8GTjPa z?DY@)S_t>%ID)@D4^*!7Gf(M0kH)R(TsN1Qs%rXKrHnDVmUug`yGQd+$us|q-|YA~ zIG#XCtSQ}$Y>WgB=LE~ab3O8CjUYSc%!gOcgYVne8Fx@vSA30m7k8nw?h)JTPAJxJ z-IGWw=*}S4L0sMMPU+dtB|BC)JJlrZ#I~%b0;T#;B$Q*VOX&%y9tqH7o7gZmj7pKEy|Axejp2(yRoETH8yntKV0=n7Bsy!nk2XJqcz zWn`N9v6vZkC!ebhWiKY}z55s%|J50v^9v(KXV>W{%l<*WYTcDvUq!DTfiG4wzQxL*KW5$EAD}7cp|GnCA3~ysK@s=%^0&g$4<{e3D=mk< z+ri2mH(o@(pDj4LcGTY(sC{=?S7E=MRe|KgGHb~$ci9zrKsKa zD7|UUfV3z6zE^&SAvscazj802o53)vmhJ(EsX*l32Jf+o?dS0CC-CxF;9djf*?r1f zq;}j{RjSxpXXpD?j`i5qo!n19`?|RP*!s78G~-=&s4_>C{>1FLnOs5g9NC4*>aiMc zCdjLE7=QV$t_DbdMLdF;im4^ubHCs0jh)@G-Iw0qu$TvayU@6c=+Ijw=5sd~XOph* zbA?WRA1FP6weL9da63SfpSDipzTU>ssqHv-)K&|Z9I(=_zlCtD8q~7jl>q4?_LaD_rZr3!SNk9`3^S5 z+Q|v{lgJ^NEOpyh!HC28t#-}5ko^@AS4CO3Fp~0jRk9)=TpUZllRBE(uV(jkqwJwmo>&$QH-;9pUi@0xrzweV@__6are%30G z7Qy&D8OiJwsC9Kcl~z-GlI&Y#ZUBh<_s{%U4s%m&e#&fX?3mJOg z_)RfRtSv>QqmhCBEUQ7ylxAP5>=NgE$V`E~UiM44!rOigci_#glBsuvzS{QmJxEFU zDp!-|uG|UKN}^CJg~>!ZH?uz^Jcu2s(pz)f$xgNv7iZcY=|R#$v|Z0)tfj8`WHs5` 
zq*=UFOT;-V@{*gdZenzhm5t6b+~IB-`X}C6(AuXGiITU;C&p4}lTuexieeYl!DQf4 zt4poO46;=fqnBuJIx8|$P39{Vl=>*O69lSD3!LV(}^CH^4_EN0U95M>sf`FT(fIK zEAQ2Bt#$mZ*%E7&*-JP3{YmXetJ(6@nzAA*fqPJ}Wd&9a6ljZ_fR#8Ks)#H`TTUp6VkqBefzV&&>(Dr>a>W*D$+ZnS1owm0MSw z)O~o&C23upZ)8X=5dsU<|m^hnM_^qc7$jOA)ElLDzNRQpunubN&t7>t5{syNW1u6tBvzh&*5`=vS~O}4&T*2D-}tXhG$8masx zigzsIgXq0>#SX1Xfm9)*`PD87bMQ0%mZCnUGH!0D9&N2P72}#siC2qmL(rbTs#`7g zLp__hK6lwPbD8RyH3~c8%<3C8nCHs=u&L1)k5$gMwb#ofta_~Qy=t?PES&YI(k}K{ z?-|T%O;_&6)@LP8D(7wOy4FR5XQF~=RC;^lL*zXDCE3lnRtv2mSd;MAbMlLpk@ad_ z-N8B4T77abm*%}%UVVw#r*z^4TEdN>;I&j(j+GnE!c%*RYad;es35V8-KmzPB67wKf7z_ zl@bexskqe`D0bPAwaSU3^?mU~tskzo#fYlHIn?f&O^YO`zjjx+K0Oh)nWuRE+KnQP z%1?6KQdjM)vA;OEzl?ijxktWvMt7gRm&{vsgSv_As`c`jT(ylyIAg<7cC*h*!dDLi1O|03S*ydDVjO|mm zNk4|QHD^lJCET_3u97b`A9X#P8^Jg8p+_>qW}?joC+nIFnARn>OnFc3+MPd~z1D~) zQAX{Diw8~>Jo%SgsjtmC6Dux>)Qu9WZPU8d$g#ER`tDRSr=wfri$6@4kGsg;4>n(= zW@)|hW@ZJW`QG20ayOnsCD)x^-(>V0)me%ApKU>I{7q4NHK&)DVjbK`Bx&9`xr_WZ zNM;W9$*YgrHEp^_a&7kLCTC$UTAky@f~nW(^=`eBecjqdceSp4aH;FLtL}|p^C`0U z{%YR+oY4o9?XlPoyk?rUy22G}YbJLd6tJ_@U#EYTOq{>ES+7awwmlY! 
zzSL}EZL_InaLgf_Nlk=dWEo!@J6vb$nXMZ0Yfbb*)?-~OV>g+37x!*XE4Ag)PrZ!mv_FTe8U@BUd~u|{yLAp0 z*iDhu)po4817n>Rbwa7VOzkS%{l@-uX=Uq=pp z3J{%us{X$4Ss*tfu8;6nINhCS2d|ev2Ujt<2Vj2lWCiCrH+QCzU)ZvqZSBS0Y4e2U z?NT3zz1PE;6H8C6yIJN})AFlaUZrjt(I|gvp|xAH^JZ|2G2G)@J8jl@UBL_ zu7x`j7;PU}uQ$ljz79uD7jEe9v|`?CHTSWHJ-u}P{#mMpR#}C{+=_ip2jNEY`^mPN zi_{{hmkwoo|n(hVT!$eHhMXbBs2lVa|Yc1r{Wq-5L z$cpQ!>>rad`PC&kt_9l$W&9=QvCzN{S?5jejb(33{SA}kC2Cxi%#8Asnn9}hM&0cy ztjz7rnboXDopWWjmq^-ZLk|Wsj40Oj;2X-{MibT1nGqQMP zF>vmXUk=m0*61nzB(p_-Q@}dCGtkVw@5T#iZLPPv@0mW|y{wO-&##gN{|;GVdqWzi zXK$?f<62T}idNH^miZ}n4)r(l;zLh>jgn@D-b`HjG+l#eHYXLDYI~G$r6sjib5crs zqTK9H6I|7TtU>d44y3$(&;6zSh4jx-Z@LYsSj3%9VTY_!xNE&T*6-#zcXIU?DD0Wb zTvNZDUQ20b_l%G3XebPNr_3sq*Zi`PRl&nx>ArjZ8k8$M@+(uZ*omxa9?e{%@r;pX zW)6PuZ@Iejo~s7j*ZFGxuK|a<;fOnKK11c`M;xwOI*lGJ<9s7+e>c@tpjz@8a~uSM z{N9oF+#W9dhFMj!Aekf83Wd2^d+4$vuC0if?>8D~qwiE-YMg9Vw5~gK?w+n3>w8~D zTR#W%{$hX?-~IRi_aU^K=oKVk7qBeGcjR{l>>hKkHh;l;6Pl{!w-e9Iyn9tzPuEkp zGR4kDqZ>J8my^Gq?k_UhhbgA{-6i*o3-#SKQyemXG1qAnt@ko&iq$gW)qYy%@^=}e z{}-@1pDlLt7UX^cyxENQy+obTU&sDAwe8c~cLuU%J!K4%udPhg#xaa6>@|`H+CH^a zE0f%F>H^li)mJT|5}tV7-(Wk$Rj0tpH4<7)A=Y1shx~A!W8TnOR@Md@jo8biMX-Wx zZ~F8R=dZTU?xj@gtUVYrKZ$<53Z338JuG`TXATgI8RVRqQ|CVZ9>h{`+X24z zC|Gkk0>&#j+r9U1fCv8igk5P$h%447b01fMu4a^rMXFwd|Eu|2&-WnT+L4wXBqIeg z-FB#Hg|whg^LUi;SHq1dKxF;oA+DOrdv&7DgWM-^Kj*$!EcJ7Y;hJUFN!tCGUnX@1 z5PsKRgsYyyp3;M0(jXaQk)h#?a5{Y{7;qb={i zqdmYr9va)b=BlC0pRMmE^Sf+tZ0l0~UauWXa@my?`nK!=XP(>$++OnhnzE2NXL4ql zm^gFStiOmath0>FI+SHsv`II+03C5ZoIQtMb?;|=aXK9)qZQ8R%|zZ0X3h!S!+#F2 zp1^Z|x9p620ouFkx}A{sLIW$QYNfdW>$;0Td9Or&h;ji()ZQyC$Eo-Q>#rBoikCAMU z`-AG2T!~>XQhu*1v92p)B4>8czmR@6xJ(9K_q282Z&x?kTj5^8W~PmT z?7*}iRQ{NgGzYy6d9mk5x|<7d_gLqK+VdI_Sf6z?e{Xj+S2)+nj>}Rum94CIwgPF@ zN8f25MJ#V>T3Y>7ER7jbmC;{0@AF8nyw)ULJNCBNO=pjsI~C+tK*Iy!vFm9L-`Z*O z!?LULf53}fgiii;l6^@2DvF&JiQVkcoyKcDtXX&0kZ*xH^SPHf6IW|z_Ge$A*%Pgf zm1lbwj1!bW<27?NT1eLuXkD{%RjiDH^pr;D&Z4BOv(>DLNDQ0&hjEg9AI4>wUF$ot zm#I{99J@WWXYMerRkthLzP)=m?5UdzE{Bnu=h1WLFwRNL%p5NA<6i6P``ty0Q(?5H 
zRchK3ZWdn7D2Hm4ePl*nRwM1VcFvZ}ta6x6#QdtB9U^siSS#w{pu}t2lgCOwYVw20 zWVt3p+h+VLrZxAhc6ua3nW}EB>uDS9>QVNLW3({-){`sv?;(o+G!f0!%%_Hf&sl8q z_p!|G*6peSIVBY4WAs>BS~;1_fWK&Imzy)boTq-)nWxs${6l!|{M1>jGgNy*5jy+-kAE zE9H*-Gs@@gs_-z>xC1`>tDbfcn1>m|=gSz?RSM4`iR!hz2d*X%->rs1{;r$9 z%41jEQ7nSi+4-@*lc60F66e8n-<|y0xO0W9a5MvI#-Y~v-Iq?Si+&r!YYi)>ll8M3 zEYhd+ggUhjW>rb3j}ku_JC_}~0)Dh5JlWN)_8xtPBZ=I^0reagL1$&T0b z3c0G>-9TKM6Yfdz&_6xaYIQm&j2m4!t6m#1NEhe+*%wI6+D?pAYSb33e`*_QapoME zPp3Mbd7@nlwS&QH?NV{4aWmw}=OicWimF-@jHh%4tY65A#bj2Mvh*~We@;FxKF6Gd z+9hqABdB9q?sUz$2Etd#jk~9FTep1>s<{?r0%y2qhgzN}*7B!YC{hKvct(Q%9H?=?` zd;1~N=TK!;4=Apg^E*?L7G?s%dq+r6%sWbbb}UCrbs%s=2I^Ttn+j8{N5ZyW{M8&sX06c&Kl3jYw_2r% zHq@?myGE?6wsqC4#dK|qxvi{>HA9m;UF1jDYb_GDJ7y2(kMsXf6!w+vzMy9S`&z4W2&W;md3)_=ib#8 z*(F1Jsl{yRm_E|BE40<>t;gPYI9qemd9FgXPr6BRr^^yVcY8z zdsyk_{d%d=QX#K;=~`c3dX3jtUw)-$zYo;C^h`}Xo+q`ZbV;T6GyO8@6wB|`=P|#G zo=)QI#-*jV!{4Z@?QD^%sV;_!m(sFYs+N}3GG-nk)gqZgA45&GJ9T>XJ3gkJ4OBW^ z`MFZKrFu)V`dRx)`>K8yT2{E@A%wk^g(?Thf<{*L>opUh%(jQwiR`1E&g#0WTd11; z!PbMeR=?V>Uc1%8)^fU~Yn|i8w^xnV^ZQLYAM|alH)?&+<+QAH?A6!H&x+q_&-GRB zS+S_qPgQCPLns<4tyaHA7}|X4YP^(w!&G(B;g`PKUb@!)r1WJ~OR5fD8j;kQVePm} z)X{56jkXg1v>dP6ug>R))P}}GDcsVxQaAJ%N~2cVKI~O{UG5`A%uT&i^)Y6W6pYX?UnWo?2{p;k*NHt#*6t%uYPttmkOK}TLdp&J0 zURCR=1*=eNSzBrc^7b*`2De-x(xFAAbs)M_k^|k99$PC=VHzsGy^`C~xxyP*Dy4g2 zZ{?$x=2E`39W^4aw69dIb~ipdbiS0*L*t;9w(%Dg7pWo!{t0s{6}?mrUE|X{BiO~a zdF7QUTx~!r+3l;A(!FO1Rck9^9a?#Iq(B)suLNsL)yw0ZNOgS-mD|UBE-$=CuMc3ha?xWPMw3gbj8kfhQt+rO2EB3*tN=ucMDlO8k%1V{%UdbIQ z!BvO7OK5!gxRusdeh!r@_l2#s5>>w{M4@!A^!D0?mewEkDsp-$jX(T-dtb+`G>;bd zQXsNcC9cxD;x-h&O2PMkZaI`!UO)862>Gmi?W_-m#^3g|x-K*Vnust$L~5`}+OPzx?yw@!HoFnkzkDiosv({a>C^ zA9TeBk9j4i^I41Gf67 zQJ|whM}dw49R)fHbQI_)&{3eHKu3X&0v!c93Un0cD9}-$qd-T2jshJ8Itp|Y=qS)p zprb%XfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eHKu3X&0v!c93Un0cD9}-$ zqd-T2jshJ8Itp|Y=qS)pprb%XfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eH zKu3X&0v!c93Un0cD9}-$qd-T2jshJ8Itp|Y=qS)pprb%XfsO(l1v(0J6zC|>QJ|wh 
zM}dw49R)fHbQI_)&{3eHKu3X&0v!c93Un0cD9}-$qd-T2jshJ8Itp|Y=qS)pprb%X zfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eHKu3YU-4qyTh8~wfyQ*2mNAIhQgPD1K*Xut7rIpDdxc=XVk}br22`p@>p5^ zeQb|V4+8ff~<`73x_$!kCL9^P|xv!YzFqI}xltS#TIDOY)hck?XoySlvVeQYhTV_*4L z19a^nte)p5?fJP6gYZ|WkayKeUHQ|T^UFWB_ARegg0U3xSTXqQ>wNBG?-jrNxZRZ< z1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eHKu3X&0v!c93Un0cD9}-$qrm%7;7S$J z_dB=u^P$&SSr{nwj{kqwe{01g7`Am1D;4#-eA-KcE5GioZuGvs^4Z;|e|ssAy50N3 zSGn7JMqABmU$x)YD|(;(_VS(tukP)n@OA2!t#r2cZC}+|U;D|>S67nas>{6)54}$d ziFXsSq5AH7E28z&UfB7tT;aF=``G)se8d_4RS5Db-{f4cX)B0BtB5(TKDPI$pLEY1 z1v(0J6zC|>QQ$uf1*%W-*^Zsw_q`CcaY6mGUXgB}^Z=#jCmu6?G2J%3kMWi9x2<38 z9XGG*IPdp7^oriIT3Cm^YR?&ZZGG2%f8|&8yi1?kv)l28_7YWWJyYroy+f{Vuf1|S zk+bS!?;YxAz0bWP_I}>I?kLbvprb%XfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQJi{ zkpjKhr26_VLWg>;&u#9|E0fJK*Vmi-YggO(ef{gbOZ&Qhd--#FPW!6&tlmu8(AT|Z z^=9Fcl^gnP@4I%KOJDV#Rk0pAX5Q5bV?DoC73=$<*Cu=T{yD_0-(FnWyGyA$TKlQj`FHuZ?Op13x&DJbZ@lrrM2OV z+e?!cw|sY{GyE)n?UnL$=wIb|`)OdgbQNqq^{jeDea@ABpKI!>+Mzl1R-;z1hBZet ztX7`e`&4QS)y#Z<<;QwO3uVr$Fy{GExol}sIaaUt`Gfw3o?l0=Qd6%SI(qxAejjev ztLt<7ZF@!UtM>cuRY!r20v!c93Un0cD9}-$qd-T2jshJ8Itp|Y=qS)pprb%XfsO(l z1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eHKu3X&0v!c93Un0cD9}-$qd-T2jshJ8 zItp|Y=qS)pprb%XfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQJh&roe|S!2fF=zPo)# zfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eHKu3X&0v!c93Un0cD9}-$qd-T2 zjshJ8Itp|Y=qS)pprb%XfsO(l1v(0J6zC|>QJ|whM}dw49R)fHbQI_)&{3eHKu3X& z0v!c93Un0cD9}-$qd-T2jshJ8Itp|Y=qT`?B?VSBtNx03yJtdseAvwfKJ0}5$lr_K zFL$(nwAg*ntJTe4{%G%)(JG`Xn;iYk2OXYsl$R&atc2|` z2)%Mmr9(IyF80cf7>ZxTy3)$ef^prz@|+<)L_1o{P1wBL$TPz0ZCp=6MVSyjPA^qAN`*cf;4p+p4E6#^PM@ z72{wizVkTd4jyxwInC^H_$r^yBc89b)) zembAeg;fg^J`Vu?q(An|%cgCpS-&V=~x^=`=8%1uf=7nMm?di$v*={6oCdStR`s#vpwcK8 z!^i1`mp-=qQ%>dJLL_`y(MY+s7U`0A+mOATz`nQObg(&8j>C<|q2^%0F&tD)W=wu`mTNMwXxOa4AsC$X)C8Q;FO2hdPF)G6uvDUl+0kJ)`e^GY#+P| zX2B_(3s&J-#VaYN%E`x1GqUZdBlM0ln+pvslB}%E;T+Vn2gH^%K3`Pn#c( zRXfpI5G_RHZzcR%4lGr|v=G`q?b{r*Wip4BK)vx%+c3BIKRL7n4y-B?wjQX}>CHuJ zwA(w-p52Ag(GWioa;Y896=-9n>t!EFnVj<*~@l&;S+JkBf 
zB0JgvE&0@x?>J$}en@^w?eA^3?^ zN7Raiz`T&}=3_NCG4fKb7vjx~zX|A-9V4XpQgvEtDrLg2HP!Ut)l{A}Vp<2s>R%XMpbrR~)_d8aLWF3r^vc@>$H176{yS9wVD6;Rt_ zX@Oj?FdyiocC4y#!1N+HdXSmuNUO<9xr6tWbDM#E7q8brabF3oajMYj>C`%b=zFrrgO0dns0E4sdCUW)~Re6st4~$Y+7y0MaxE zsMii&ub0~fP4{ClwKDsW^L@P92GqNG-c;aFavSchbR&sq)p5={_ ztZh^pyk<_3~Mrn50y`M7EM$t zq?Yn$9O~g%`i8`8?Yywgp0x|HbBWMnw->_?tzC@>jM}to6Oq)#&~_@iH^1PdoCxXW z!s#vWeP4ksp3)e{NI=Pr-`R#7R_hcS8}Hu2kf;}u@fVvHKNeY!HPb3cNsm6D)VC<7 zv!LNDK3f93Q~B@b9`apjU5)HWO?6=>_CRa41&T`5?R>Kx7E@0HQ?}lpo(G6{=5E>Kdnbn@Xz@1okN1?Y=_K0VT*V3wLYt@i;R-x==P9aqB zO-kAnpc1x8yw*D@i}G8!URkbR1;?DbYp=HgG^*w4q15c!jjiowyaFxA_h=R&P^fQR&m>X-Tx~3kn~d1?mUo@N{6DRrFy# zF!q7BTDP2g#?#8XO+exda2p&|H}>*AoQ)^e_a<_v{!kmJ)H=6pxSP_eE%ue4>v`hi zv~-z^Y8#gTL9B(gS{*mL;Jvj$#$5Uzxek=Oz3tD2aZmwH$^w;j0?N+Am$j7ik6 z-W-xxhbqPjdRjFyd>r5#XXw(a&h)F-()Vc}w7v_99%k-joUy&w0_~}mLF=^(2>0@+ zW!O@*QOLwS)M8?KI z>Y>&_k7XRIge7uS#v@CWN3pEVNTj>mk|Ob@Qc`Q)W~}OLSq?f&b3U_Z@TwKS-sS*| zl8!V-p^ov#5qPL&Ph71RSi@*)*2+SS;1ug9KRkrSL+Om{3$s^vrPs+Z;j6i)#GQ#X z{YHu};Y?|wjBG@+)O+F3D;^l+m|3&Lo3YtS#S;GYa?8qH^pK7zjf|QrJc-fb!&eo` zCE}|yLg^%QLZ%dXEGl>Qeyh3Dra@VfXHqIYCmEtWd{*P|4V*3BnLB1KKQvRI6w5mg zC0qe+HEkEb-NFWNjU zfZof)nSwM(&R7i*$5^HKMq_JXO%!PiqnsPx8u#jhrI_y*aj)EcB_k>;dfmha=I;(7 z+e*2sW3vl7p*V+7~a-l4|Bk=Vr1k(D+^N+hBtCiV({#iKpMNGb3b03-D}GGbJq z#OR0hcEXv+WFvf0Z@jJnQ)gn44|y;jdd2=lGUJ)!|BPIekJxa}GhP%YeVx>fglS*Z z4M&LYsIy3EF2FsW*BPwOdOdzvO%7W?IIK8kjvNht@8!gZi)T8)q(rTUw$- z$VPU`dgMS4DesjgeM)SQIm+H#vQdIIU<2a_zdlu&6)z=S9SRS^BjrXOsXgj_BrVoL zh!PPyBaW=c|K>hsOq4_^rG-|aq>d7|!(#3#~B?xOmz-WWq2mn+p4 zC`m<$(405Xabw|}AWHd*q zM_vnG{UopDT<>{)tG+uzG{8~xl99c3OtqsP?U|J-?Kc&+LZqIDN}-lU6*`~uX?xC< zU)9?=hA^~``Q*xvEo`|WXVrUs(DRj{`0!6Z=$@B;Hx%Yezx|tj-;3LakKLaA;pbj{ zM%ARhI6p4G$NPOEWch1fz2CVX{P|x8beE4Pv=!pZzx|+3DiuHIoWI(4;qotYk1E^0 z%(Z_zuMwl}ucJUmfxmeQ{PK3_Zywga%81Fm^xh%)llX`{8vWEcl%E)V>p_+g&zVoy zh@CQ8H~Q6Pd5$xKmHa!q$l1<^ooARgG%w@~+dp3=|KoXnn!ELLwRc;~T@$l=WoCfF z;Qjs1J9w_6I;-%USWfR_MZn12v&6%@TX9^%r}@rvYaS=|R=9mtADv6&8}H$#Mv%*i 
zwT+^4_na#X86&Vcey;X<$II1~Ub%KnfzfzTeCnNA&SXrX^qh5^%R%-WANemhyTnI7IVWN;zhDth@w@z!t81z5ES_eGI27*K@t+NMB`E7(7q- z)-dh{UWr#AlFFg1n0j5{bnHM-=~p2O_CoIrM;Vku&a#~UI`dO6d=}0+H*zktj`!9T zJmi&Ud5$t%J43}L8HT`|43M%Xq{$hDKh`1_mrs*VH`C>xnUn3DDSS(q6VGE@sbNNY zVHtHHv-X9|ao6y;jjPr%|Mea_Id?sG?c+OVvHiSS$R`WRIg829Eap6=ZU=XFJSAQk zTwdTX_q&8Md?!v;9F$J+3AF;D6jD39HrWd46iJXq%FY0k%Gt|#Zze8W7Z2|!HKnch zRZEnl^@ZP=qf37e`wSM57m)8-t}O*JWk5_~4a`Z)J0;(Hn;nTfD?!SGvJ{EP*nZ+Q zv2WpJ^uha7YbgA&T#=nNDqGqcbuZMaoK6MJtXcG`TFQRtpG=#&qJ(R~^r>b_&6`S9 zGm$mx6CRlFX-80VeOI-E&~z#TtxN>`=(g5vlF{wKi6)awJ1+#?i`*^*!^(cG&EJlgf=% zH@mdFyJPf`ue5$|R&-IxWZKi^=T;c~&MuxwakU45A z9n#L~Px6`8r_xnqjbktMgrfyAwZwi6k6NA9BiKP8jU&scn@C}`KDw_o`4`V)#$Qo1?|D*Ew=8 zO!7iJ+MYzIlb+D@y4%+p3$H#4k?Qpz5~0kE+Cr+=_J%>GjG zRiEu0%l+iAmfX6UwX9SVv;@XG=~GK(H66G1flQ)b@(FO$MkuTLoK!Im1BYF2r=XJ( z73n?*wE76+1a&&SbaKs}1f?<%)fh#Hv{8Y{sz_>CdV$o%(ru)^k8j2`k2hoKnjH^h zv%$;}jQZ3x>xV~y<77Eb0n-^?9UG*QK0x_S4M^#k4}H`WWx#r(`fkK)yypFKq%s=2 z8anACwL#KI>Wl-P$AL&W*G|T2>Q&RhV@JbLj$;M3<2;)jl`XYTe8a!c+F6ITz^+PVQpm>B4`|=*WZuU{S&vgf z?Iy^&0K18@CO`;%rB|_Ev3ieq>DUr&Q-#SGszRl=u?CrbrF4@i@sR?pz0}r9WzES% zpqq$gNJn$5jI@s~$D>HCLxa%R0ik^|8k)y_(ygw|t@B2!ky*PDUMCi`N2>PO#D?ml z(w9&zyb;k7s5SxFdbAnSjBZAi|KZIro+m+5Bb_x+Tkm!N$$F}}fOR+wo%AerENaW` zozhp%BV_-x{;SU9b}{@6~q?YuIomS<|Zea~vMV^khklOI6ylraBCH}^O9HTQ7b z+l=Is*~nq6#6~dJ)2M^`JUjjDvD*P`M&X%TYLP<0Saa>Xo<>ico+_ghWlO8IA4xs+ zi~D&eZLG2SEI9W#<11HoBN{2#bI@0?R_Cp->jkvY+HCu!>{C|qQEQ@=bT{M zn(`!_OU4Civt9I$Bb(!jHc#SF+E0h>dec<9t-mK0)aIq@INEG)MWCTeW4%JYiOe<0N@jr7Nb4NoZnf4@nob!i)^&w zmC`10jALspm1r%>aeVq&MnBEy$ACw9TL%XE%t_!sg5$yF0W@tm6dB38N5Dn@K95n= z_Z>idf#V!D;9~P!(Gg{MFS1|+vWoE+0i*Le<9DfJ-)ZL2s~Io-cKYz|8 zBtY+yuC=VK)UWrhT2q>;%w#51Ev=D>XKQVZkrIDR0s2G&@kyB-*!OQwX}Z+azG#sY zF}f2PZIpD>`p1U1J~KWle$EVq)*~G#HTMT|S8y)t9O*(V1jue66iYyy6)~ zCE6G9h=l3+)n0vv7DwnZ-!@tfZH?pXMN52X$E@B+7!nxe5J|<6)e1I`{vP?!iy;QBQ zEAGL~nVg5uM$6)-6sY%NnEa7^6`sU!QdsKCOYNWU^NIKhX=dnlHb-|d)2SHdKABy3 z2Xi%QS2Qd+4|x=LhRS;7Q}Suzl=*hxtyHYM4=foW-cDL6S?X@9PsVzV=KUgn(oi~> 
z2g-a}%~7)w!KSZG?NVRU;b!dOw;obLD#-81TxQIz%!!G*o8GnPZ}M1L%2wW`oriae z&#thC&+1KuK2dJ0kuoMt)f*`u9!0tWW4hfFkDGyNIiZ|}56X~Ik&d`%L?lM75A72x zD8ptUq>@?@I`wK}#tUb za$i3>s;_@*w7T@&MRF{!HAm7jAY-pP|1R&g9`S9vyUNd^JpG$Eb4-SL3e=`%uUFN* z&(?YOZB(MMN_lK5NM`OQj$_19E8zN7KE4xkhQ#P?a!T9Jd1k9=9r)|Nc6#e08lkZ} z!G4z`R{8B_tqOfFT%%Xay6yUXwYZ8%roGmdGrgUwBvtoT)hd>r3D=T);A)+v|5Z^| zp4Y9f)O?OmG3s7cm5w{_rmoZVGtb`Z(WP1I&DqXvoz-4_ob_&7W3*S-J8t!BoTFGd znY!m~W~nRw-K#8~?|E)tH}lZ3yH9GFZLjv1{du)}rVD+y*@%Y#0tg_000IagfB*srAb=2.1.2 + entry_points: + reduce_noise: + name: reduce_noise + doc: 'Reduce noise from audio file or directory containing audio files. + + The audio files must be in .wav format. + + The cleaned audio files will be saved in the target_directory. + + For information about the noise reduction algorithm see: + + https://github.com/timsainb/noisereduce + + Notice that the saved files are in wav format, even if the original files + are in other format.' + parameters: + - name: audio_source + type: str + doc: path to audio file or directory containing audio files + - name: target_directory + type: str + doc: path to directory to save the cleaned audio files. + - name: sample_rate + type: int + doc: Number of samples in one second in the audio file. Pass `None` to keep + the original sample rate. + default: 16000 + - name: duration + type: int + doc: Duration of the audio file to clean in seconds. Pass `None` to keep the + original duration. + default: null + - name: channel + type: int + doc: Channel to clean. Pass the number of the channel to clean. To clean all + channels pass None. + default: null + - name: silence_threshold + type: float + doc: The threshold to remove silence from the audio, in dB. If None, no silence + removal is performed. + default: null + - name: use_multiprocessing + type: int + doc: Number of processes to use for cleaning the audio files. If 0, no multiprocessing + is used. + default: 0 + - name: verbose + type: bool + doc: Verbosity level. 
If True, display progress bar. + default: true + outputs: [] + lineno: 388 + has_varargs: false + has_kwargs: false + clean_audio: + name: clean_audio + doc: '' + parameters: + - name: self + - name: data + type: Tensor + outputs: + - type: torch.Tensor + lineno: 276 + has_varargs: false + has_kwargs: false + save_audio: + name: save_audio + doc: '' + parameters: + - name: self + - name: audio + type: ndarray + - name: target_path + type: Path + outputs: [] + lineno: 256 + has_varargs: false + has_kwargs: false + load_audio: + name: load_audio + doc: '' + parameters: + - name: self + - name: file + type: str + outputs: + - type: torch.Tensor + lineno: 268 + has_varargs: false + has_kwargs: false + update_to_wav_suffix: + name: update_to_wav_suffix + doc: '' + parameters: + - name: self + - name: audio_file + type: Path + outputs: [] + lineno: 125 + has_varargs: false + has_kwargs: false + remove_silence: + name: remove_silence + doc: Remove silence sections from the audio. + parameters: + - name: self + - name: audio + type: ndarray + doc: The audio to remove silence from. + outputs: + - doc: The audio without silence. + lineno: 134 + has_varargs: false + has_kwargs: false + reduce_noise_dfn: + name: reduce_noise_dfn + doc: 'Reduce noise from audio files using DeepFilterNet. + + For more information about the noise reduction algorithm see: + + https://github.com/Rikorose/DeepFilterNet + + Notice that the saved files are in wav format, even if the original files + are in other format.' 
+ parameters: + - name: audio_source + type: str + doc: path to audio file or directory of audio files + - name: target_directory + type: str + doc: path to target directory to save cleaned audio files + - name: pad + type: bool + doc: whether to pad the audio file with zeros before cleaning + default: true + - name: atten_lim_db + type: int + doc: maximum attenuation in dB + default: null + - name: silence_threshold + type: float + doc: the threshold to remove silence from the audio, in dB. If None, no silence + removal is performed. + default: null + - name: use_multiprocessing + type: int + doc: Number of processes to use for cleaning the audio files. If 0, no multiprocessing + is used. + default: 0 + - name: verbose + type: bool + doc: verbosity level. If True, display progress bar and logs. + default: true + outputs: [] + lineno: 322 + has_varargs: false + has_kwargs: true + description: Reduce noise from audio files + default_handler: reduce_noise + disable_auto_mount: false + clone_target_dir: '' + env: [] + priority_class_name: '' + preemption_mode: prevent + affinity: null + tolerations: null + security_context: {} +verbose: false diff --git a/noise_reduction/item.yaml b/noise_reduction/item.yaml new file mode 100644 index 000000000..8ddc63f4f --- /dev/null +++ b/noise_reduction/item.yaml @@ -0,0 +1,29 @@ +apiVersion: v1 +categories: + - data-preparation + - machine-learning +description: Reduce noise from audio files +doc: '' +example: noise_reduction.ipynb +generationDate: 2024-03-04:17-30 +hidden: false +icon: '' +labels: + author: yonatans +maintainers: [] +mlrunVersion: 1.5.2 +name: noise-reduction +platformVersion: 3.5.3 +spec: + filename: noise_reduction.py + handler: reduce_noise + image: mlrun/mlrun + kind: job + requirements: [ + librosa, + noisereduce, + deepfilternet, + torchaudio>=2.1.2, + ] +url: '' +version: 1.0.0 \ No newline at end of file diff --git a/noise_reduction/noise_reduction.ipynb b/noise_reduction/noise_reduction.ipynb new file 
mode 100644 index 000000000..e4fa0a534 --- /dev/null +++ b/noise_reduction/noise_reduction.ipynb @@ -0,0 +1,942 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "4e0abc60-b718-4f45-a82a-0b8759f19d3f", + "metadata": {}, + "source": [ + "# Noise Reduction\n", + "\n", + "## Table of Contents\n", + "\n", + "1. [Introduction](#Introduction)\n", + "2. [Project Setup](#Setting-up-a-project)\n", + "3. [Noise Reduction Techniques](#Noise-Reduction-Techniques)\n", + " 1. [DeepFilterNet](#DeepFilterNet)\n", + " 2. [Spectral Gating](#SpectralGating)" + ] + }, + { + "cell_type": "markdown", + "id": "9af33629-965f-4f73-9e4a-89cc4c3dacf1", + "metadata": {}, + "source": [ + "## Introduction\n", + "\n", + "Noise reduction is a crucial signal processing technique used to enhance the quality of signals by minimizing unwanted or irrelevant noise. This technique finds applications in various fields such as audio processing, image processing, telecommunications, and more. The goal is to extract the useful information from a signal while suppressing undesirable background noise." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "f9cd530d-36a7-47b1-96f8-498d338b3a1a", + "metadata": {}, + "outputs": [], + "source": [ + "import mlrun" + ] + }, + { + "cell_type": "markdown", + "id": "c659289f-01f2-4e02-b843-b39cfc0c1d63", + "metadata": {}, + "source": [ + "## Setting up a project\n", + "\n", + "First of all we need to create a project with the `noise-reduction` function" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "c4217272-85b8-4af7-afee-bc97c6c73bd9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 15:54:53,561 [info] Project loaded successfully: {'project_name': 'noise-reduction'}\n" + ] + } + ], + "source": [ + "# Creating a project\n", + "project = mlrun.get_or_create_project(\"noise-reduction\")\n", + "# Importing the function from hub\n", + "noise_reduction_function = project.set_function(\"hub://noise_reduction\")" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "f7df4c3e-4e5b-47bd-a298-527d9c6fcb8f", + "metadata": {}, + "outputs": [], + "source": [ + "# Audio source can be either a single file or a directory of audio files\n", + "audio_source = \"data\"" + ] + }, + { + "cell_type": "markdown", + "id": "6c1c5109-6380-4364-b016-728523ed0ea1", + "metadata": {}, + "source": [ + "## Noise Reduction Techniques" + ] + }, + { + "attachments": { + "e48ce103-14f3-421d-82a4-823344895241.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "id": "5c81ecee-851c-4ee8-ad3a-4d372a1bfd97", + "metadata": {}, + "source": [ + "\n", + "### 1. DeepFilterNet\n", + "![image.png](attachment:e48ce103-14f3-421d-82a4-823344895241.png)\n", + "\n", + "In order to use this technique, you simply need to use the `reduce_noise_dfn` handler.\n", + "\n", + "Reduce noise from audio files using DeepFilterNet. For more information about the noise reduction algorithm, see [DeepFilterNet GitHub](https://github.com/Rikorose/DeepFilterNet). 
Notice that the saved files are in wav format, even if the original files are in other formats.\n", + "\n", + "### Parameters:\n", + "\n", + "- `audio_source`: path to the audio file or directory of audio files\n", + "- `target_directory`: path to the target directory to save cleaned audio files\n", + "- `pad`: whether to pad the audio file with zeros before cleaning\n", + "- `atten_lim_db`: maximum attenuation in dB\n", + "- `silence_threshold`: the threshold to remove silence from the audio, in dB. If None, no silence removal is performed.\n", + "- `use_multiprocessing`: Number of processes to use for cleaning the audio files. If 0, no multiprocessing is used.\n", + "- `verbose`: verbosity level. If True, display progress bar and logs.\n", + "- `kwargs`: additional arguments to pass to `torchaudio.load()`. For more information, see [torchaudio.load()](https://pytorch.org/audio/stable/generated/torchaudio.load.html).\n", + "\n", + "\n", + "In the examples below, the function runs locally. To run it remotely, build the function's image first (this only needs to be done once):\n", + "```python\n", + "noise_reduction_function.apply(mlrun.auto_mount()) # required for local files\n", + "project.build_function(\"noise-reduction\")\n", + "```\n", + "\n", + "#### 1.1. 
Example" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "16113524-8597-48d4-8172-76b897fee3f2", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 15:54:56,999 [info] Storing function: {'name': 'noise-reduce-reduce-noise-dfn', 'uid': '9732dac831784a6a8b53acab5ff83a08', 'db': 'http://mlrun-api:8080'}\n", + "> 2024-03-04 15:55:07,525 [info] logging run results to: http://mlrun-api:8080\n", + "> 2024-03-04 15:55:07,702 [info] Reducing noise from audio files.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Noise-reduction: 0%| | 0/2 [00:00 2024-03-04 15:55:08,437 [info] Loading DeepFilterNet2 model.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "`torchaudio.backend.common.AudioMetaData` has been moved to `torchaudio.AudioMetaData`. Please update the import path.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2024-03-04 15:55:08 | INFO | DF | Running on torch 2.1.2+cu121\n", + "2024-03-04 15:55:08 | INFO | DF | Running on host jupyter-yoni-d56767c87-678n2\n", + "> 2024-03-04 15:55:08,464 [info] Loading DeepFilterNet2 model.\n", + "2024-03-04 15:55:08 | INFO | DF | Running on torch 2.1.2+cu121\n", + "2024-03-04 15:55:08 | INFO | DF | Running on host jupyter-yoni-d56767c87-678n2\n", + "2024-03-04 15:55:08 | INFO | DF | Loading model settings of DeepFilterNet3\n", + "2024-03-04 15:55:08 | INFO | DF | Using DeepFilterNet3 model at /igz/.cache/DeepFilterNet/DeepFilterNet3\n", + "2024-03-04 15:55:08 | INFO | DF | Initializing model `deepfilternet3`\n", + "2024-03-04 15:55:08 | INFO | DF | Loading model settings of DeepFilterNet3\n", + "2024-03-04 15:55:08 | INFO | DF | Using DeepFilterNet3 model at /igz/.cache/DeepFilterNet/DeepFilterNet3\n", + "2024-03-04 15:55:08 | INFO | DF | Initializing model `deepfilternet3`\n", + "2024-03-04 15:55:08 | INFO | DF | Found checkpoint 
/igz/.cache/DeepFilterNet/DeepFilterNet3/checkpoints/model_120.ckpt.best with epoch 120\n", + "2024-03-04 15:55:08 | INFO | DF | Found checkpoint /igz/.cache/DeepFilterNet/DeepFilterNet3/checkpoints/model_120.ckpt.best with epoch 120\n", + "2024-03-04 15:55:08 | INFO | DF | Running on device cpu\n", + "2024-03-04 15:55:08 | INFO | DF | Running on device cpu\n", + "2024-03-04 15:55:08 | INFO | DF | Model loaded\n", + "2024-03-04 15:55:08 | INFO | DF | Model loaded\n", + "> 2024-03-04 15:55:08,635 [info] Reducing noise from test_data.mp3.\n", + "> 2024-03-04 15:55:08,636 [info] Reducing noise from test_data.wav.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\u001b[32m2024-03-04 15:55:08\u001b[0m | \u001b[33m\u001b[1mWARNING \u001b[0m | \u001b[36mDF\u001b[0m | \u001b[33m\u001b[1mAudio sampling rate does not match model sampling rate (16000, 48000). Resampling...\u001b[0m\n", + "\"sinc_interpolation\" resampling method name is being deprecated and replaced by \"sinc_interp_hann\" in the next release. The default behavior remains unchanged.\n", + "The MPEG_LAYER_III subtype is unknown to TorchAudio. As a result, the bits_per_sample attribute will be set to 0. If you are seeing this warning, please report by opening an issue on github (after checking for existing/closed ones). You may otherwise ignore this warning.\n", + "\u001b[32m2024-03-04 15:55:08\u001b[0m | \u001b[33m\u001b[1mWARNING \u001b[0m | \u001b[36mDF\u001b[0m | \u001b[33m\u001b[1mAudio sampling rate does not match model sampling rate (16000, 48000). Resampling...\u001b[0m\n", + "\"sinc_interpolation\" resampling method name is being deprecated and replaced by \"sinc_interp_hann\" in the next release. 
The default behavior remains unchanged.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 15:55:16,701 [info] Saved cleaned audio file to clean_data/test_data.wav.\n", + "> 2024-03-04 15:55:16,706 [info] Saved cleaned audio file to clean_data/test_data_mp3.wav.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Noise-reduction: 100%|██████████| 2/2 [00:09<00:00, 4.51s/file]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 15:55:16,791 [info] Summarizing the results.\n", + "> 2024-03-04 15:55:16,792 [info] Done (2/2)\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + "\n", + "
| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| noise-reduction | | 0 | Mar 04 15:54:57 | completed | noise-reduce-reduce-noise-dfn | v3io_user=yonis, kind=local, owner=yonis, host=jupyter-yoni-d56767c87-678n2 | audio_source | target_directory=./clean_data, use_multiprocessing=2, silence_threshold=50, atten_lim_db=10 | | successes, errors |
\n" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + " > to track results use the .show() or .logs() methods or click here to open in UI" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 15:55:17,976 [info] Run execution finished: {'status': 'completed', 'name': 'noise-reduce-reduce-noise-dfn'}\n" + ] + } + ], + "source": [ + "dfn_run = noise_reduction_function.run(\n", + " handler=\"reduce_noise_dfn\",\n", + " inputs={\"audio_source\": audio_source},\n", + " params={\n", + " \"target_directory\": \"./clean_data\",\n", + " \"use_multiprocessing\": 2,\n", + " \"silence_threshold\": 50,\n", + " \"atten_lim_db\": 10,\n", + " },\n", + " returns=[\"successes: file\", \"errors: file\"],\n", + " local=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "a71ba944-1fc2-48be-b789-d57c59201939", + "metadata": {}, + "source": [ + "### Looking at the result" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "19b04cf6-5a4d-4d74-b66e-193540a900a1", + "metadata": {}, + "outputs": [ + { + "data": { + "application/json": { + "test_data.mp3": "clean_data/test_data_mp3.wav", + "test_data.wav": "clean_data/test_data.wav" + }, + "text/plain": [ + "" + ] + }, + "metadata": { + "application/json": { + "expanded": false, + "root": "root" + } + }, + "output_type": "display_data" + }, + { + "data": { + "application/json": {}, + "text/plain": [ + "" + ] + }, + "metadata": { + "application/json": { + "expanded": false, + "root": "root" + } + }, + "output_type": "display_data" + } + ], + "source": [ + "dfn_run.artifact(\"successes\").show()\n", + "dfn_run.artifact(\"errors\").show()" + ] + }, + { + "attachments": { + "68c16acf-c28e-4bb8-a453-abbebc0137ce.png": { + "image/png": "" + } + }, + 
"cell_type": "markdown", + "id": "4576b576-4ac0-433a-9d1d-f39225a6648d", + "metadata": {}, + "source": [ + "\n", + "### 2. Spectral Gating\n", + "![image.png](attachment:68c16acf-c28e-4bb8-a453-abbebc0137ce.png)\n", + "\n", + "In order to use this technique, you simply need to use the `reduce_noise` handler.\n", + "\n", + "Spectral gating selectively filters signal frequencies based on amplitude, offering targeted noise reduction or feature enhancement in signal processing applications.\n", + "\n", + "Reduce noise from an audio file or directory containing audio files. The audio files must be in .wav format. The cleaned audio files will be saved in the target directory. For information about the noise reduction algorithm, see [noisereduce GitHub](https://github.com/timsainb/noisereduce). Notice that the saved files are in .wav format, even if the original files are in another format.\n", + "\n", + "### Parameters:\n", + "\n", + "- `audio_source`: path to the audio file or directory containing audio files\n", + "- `target_directory`: path to the directory to save the cleaned audio files.\n", + "- `sample_rate`: Number of samples in one second in the audio file. Pass `None` to keep the original sample rate.\n", + "- `duration`: Duration of the audio file to clean in seconds. Pass `None` to keep the original duration.\n", + "- `channel`: Channel to clean. Pass the number of the channel to clean. To clean all channels, pass `None`.\n", + "- `silence_threshold`: The threshold to remove silence from the audio, in dB. If `None`, no silence removal is performed.\n", + "- `use_multiprocessing`: Number of processes to use for cleaning the audio files. If 0, no multiprocessing is used.\n", + "- `verbose`: Verbosity level. If True, display a progress bar.\n", + "\n", + "#### 2.1. 
Example" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "f10a5ecd-bf90-4650-a42e-d3fbfff78e52", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 16:07:39,378 [info] Storing function: {'name': 'noise-reduce-reduce-noise', 'uid': '6e6d6f7c3f8243b995dc1bbcf66f7544', 'db': 'http://mlrun-api:8080'}\n", + "> 2024-03-04 16:07:39,541 [info] Reducing noise from audio files.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Noise-reduction: 0%| | 0/2 [00:00 2024-03-04 16:07:39,565 [info] Reducing noise from test_data.mp3.\n", + "> 2024-03-04 16:07:39,566 [info] Reducing noise from test_data.wav.\n", + "> 2024-03-04 16:07:46,174 [info] Saved cleaned audio file to clean_data/test_data.wav.\n", + "> 2024-03-04 16:07:46,175 [info] Saved cleaned audio file to clean_data/test_data_mp3.wav.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Noise-reduction: 100%|██████████| 2/2 [00:06<00:00, 3.31s/file]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 16:07:46,211 [info] Summarizing the results.\n", + "> 2024-03-04 16:07:46,212 [info] Done (2/2)\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + "\n", + "
| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| noise-reduction | | 0 | Mar 04 16:07:39 | completed | noise-reduce-reduce-noise | v3io_user=yonis, kind=local, owner=yonis, host=jupyter-yoni-d56767c87-678n2 | audio_source | target_directory=./clean_data, use_multiprocessing=2, silence_threshold=50 | | successes, errors |
\n" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + " > to track results use the .show() or .logs() methods or click here to open in UI" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "> 2024-03-04 16:07:46,389 [info] Run execution finished: {'status': 'completed', 'name': 'noise-reduce-reduce-noise'}\n" + ] + } + ], + "source": [ + "noise_reduction_run = noise_reduction_function.run(\n", + " handler=\"reduce_noise\",\n", + " inputs={\"audio_source\": audio_source},\n", + " params={\n", + " \"target_directory\": \"./clean_data\",\n", + " \"use_multiprocessing\": 2,\n", + " \"silence_threshold\": 50,\n", + " },\n", + " local=True,\n", + " returns=[\"successes: file\", \"errors: file\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "699615d7-bba1-4147-ad3d-d295d794f866", + "metadata": {}, + "source": [ + "### Looking at the result" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "47c4f66a-d5d0-47e5-9842-abbe6653526b", + "metadata": {}, + "outputs": [ + { + "data": { + "application/json": { + "test_data.mp3": "clean_data/test_data_mp3.wav", + "test_data.wav": "clean_data/test_data.wav" + }, + "text/plain": [ + "" + ] + }, + "metadata": { + "application/json": { + "expanded": false, + "root": "root" + } + }, + "output_type": "display_data" + }, + { + "data": { + "application/json": {}, + "text/plain": [ + "" + ] + }, + "metadata": { + "application/json": { + "expanded": false, + "root": "root" + } + }, + "output_type": "display_data" + } + ], + "source": [ + "noise_reduction_run.artifact(\"successes\").show()\n", + "noise_reduction_run.artifact(\"errors\").show()" + ] + }, + { + "cell_type": "markdown", + "id": "6eeae1bb-c714-491b-91dd-f22148cd0970", + "metadata": {}, + "source": 
[ + "The output
of this function is the same as the first one." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "mlrun-base", + "language": "python", + "name": "conda-env-mlrun-base-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.16" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/noise_reduction/noise_reduction.py b/noise_reduction/noise_reduction.py new file mode 100644 index 000000000..f0fff5504 --- /dev/null +++ b/noise_reduction/noise_reduction.py @@ -0,0 +1,625 @@ +import logging +from abc import ABCMeta, abstractmethod +from multiprocessing import Process, Queue +from pathlib import Path +from typing import List, Tuple, Type, Union + +import librosa +import numpy as np +import torch +from scipy.io import wavfile +from tqdm import tqdm + +#: The value to send into multiprocessing queues to stop the process: +_MULTIPROCESSING_STOP_MARK = "STOP" + +# Get the global logger: +try: + import mlrun + + _LOGGER = mlrun.get_or_create_ctx("noise_reduce").logger +except ModuleNotFoundError: + _LOGGER = logging.getLogger() + + +class ReduceNoiseBase(metaclass=ABCMeta): + """ + Base class for noise reduction. + This class is aimed to be inherited by specific noise reduction algorithms. + You must implement the following methods: + - clean_audio: The method to clean the audio, where the noise reduction algorithm is implemented. + - save_audio: The method to save the audio to a file. + - load_audio: The method to load the audio from a file. + + After implementing the above methods, you can use the reduce_noise method to reduce noise from audio files. 
+ """ + def __init__( + self, + target_directory: Path, + verbose: bool = True, + silence_threshold: float = None, + ): + self.target_directory = Path(target_directory) + self.verbose = verbose + self.silence_threshold = silence_threshold + + def reduce_noise(self, audio_file: Path) -> Tuple[bool, Tuple[str, str]]: + """ + Reduce noise from the given audio file. + + :param audio_file: The audio file to reduce noise from. + + :returns: A tuple of: + - a boolean indicating whether an error occurred + - a tuple of: + - audio file name + - target path in case of success / error message in case of failure. + """ + try: + if self.verbose: + _LOGGER.info(f"Reducing noise from {audio_file.name}.") + + # Load audio data: + audio = self.load_audio(file=str(audio_file)) + + # Perform noise reduction: + reduced_noise = self.clean_audio(data=audio) + + # Remove silence from the audio if necessary: + reduced_noise = self.remove_silence(audio=reduced_noise) + + # Prepare target path: + target_path = self.update_to_wav_suffix(audio_file=audio_file) + + # Save file: + self.save_audio( + audio=reduced_noise, + target_path=target_path, + ) + + if self.verbose: + _LOGGER.info(f"Saved cleaned audio file to {target_path}.") + + return False, (audio_file.name, str(target_path)) + except Exception as exception: + if self.verbose: + _LOGGER.error(f"Failed to reduce noise from {audio_file.name}.") + _LOGGER.error(f"Error: {exception}") + # Collect the error: + return True, (audio_file.name, str(exception)) + + @abstractmethod + def clean_audio(self, data) -> Union[np.ndarray, torch.Tensor]: + """ + Clean the audio from noise. Here you should implement the noise reduction algorithm. + + :param data: The audio data to clean. + + :returns: The cleaned audio. + """ + pass + + @abstractmethod + def save_audio(self, audio: np.ndarray, target_path: Path): + """ + Save the audio to a file. + + :param audio: The audio to save. + :param target_path: The target path to save the audio to. 
+ """ + pass + + @abstractmethod + def load_audio(self, file: str) -> Tuple[Union[np.ndarray, torch.Tensor], int]: + """ + Load the audio from a file. + + :param file: The file to load the audio from. + + :returns: A tuple of: + - the audio data + - the sample rate + """ + pass + + def update_to_wav_suffix(self, audio_file: Path): + target_path = self.target_directory / audio_file.name + if target_path.suffix != ".wav": + old_suffix = target_path.suffix[1:] + target_path = target_path.with_stem(target_path.stem + f"_{old_suffix}") + return target_path.with_suffix(".wav") + else: + return target_path + + def remove_silence( + self, + audio: np.ndarray, + ): + """ + Remove silence sections from the audio. + + :param audio: The audio to remove silence from. + + :returns: The audio without silence. + """ + if self.silence_threshold is None: + return audio + + # Get the indices of the non-silent frames: + non_silent_indices = librosa.effects.split( + y=audio, + top_db=self.silence_threshold, + frame_length=2048, + hop_length=256, + ) + + # Get the non-silent audio: + non_silent_audio = np.concatenate( + [audio[:, start:end] for start, end in non_silent_indices], axis=1 + ) + + return non_silent_audio + + +class ReduceNoise(ReduceNoiseBase): + def __init__( + self, + target_directory: Path, + verbose: bool = True, + silence_threshold: float = None, + sample_rate: int = 16000, + duration: int = None, + channel: int = None, + ): + super().__init__(target_directory, verbose, silence_threshold) + self.sample_rate = sample_rate + self.duration = duration + self.channel = channel + + def save_audio(self, audio: np.ndarray, target_path: Path): + # If the audio has more than one channel, transpose it in order to save it: + if len(audio) > 1: + audio = audio.T + + wavfile.write( + filename=target_path, + rate=self.sample_rate, + data=audio, + ) + + def load_audio(self, file: str) -> np.ndarray: + data, sr = librosa.load( + path=file, + sr=self.sample_rate, + mono=False, # keep 
channels separate + duration=self.duration, + ) + # set sample rate: + self.sample_rate = int(sr) + + # convert to int with scaling for 16-bit integer + data *= 32767 / np.max(np.abs(data)) # re-scaling + data = data.astype(np.int16) # change data type + + # select channel + data_to_reduce = data[self.channel] if self.channel is not None else data + return data_to_reduce + + def clean_audio(self, data: np.ndarray) -> np.ndarray: + try: + import noisereduce + except ImportError as e: + raise ImportError("Please install noisereduce package") from e + + reduced_noise = noisereduce.reduce_noise(y=data, sr=self.sample_rate) + + # add channel back after noise reduction + if self.channel is not None: + # putting the channel back in the data + data[self.channel] = reduced_noise + # updating the data to save + reduced_noise = data + + return reduced_noise + + +class DFN(ReduceNoiseBase): + def __init__( + self, + target_directory: Path, + verbose: bool = True, + silence_threshold: float = None, + pad: bool = True, + atten_lim_db: int = None, + **kwargs, + ): + super().__init__(target_directory, verbose, silence_threshold) + self.pad = pad + self.atten_lim_db = atten_lim_db + self.kwargs = kwargs + + # import required packages + try: + from df.enhance import init_df + except ImportError as e: + raise ImportError("Please install deepfilternet packages") from e + + if self.verbose: + _LOGGER.info("Loading DeepFilterNet2 model.") + + # Load the model: + model, df_state, _ = init_df() + self.model = model + self.df_state = df_state + self.sample_rate = self.df_state.sr() + + def save_audio(self, audio: np.ndarray, target_path: Path): + try: + from df.enhance import save_audio + except ImportError as e: + raise ImportError("Please install deepfilternet package") from e + save_audio( + file=target_path.name, + audio=audio, + sr=self.sample_rate, + output_dir=str(self.target_directory), + ) + + def load_audio(self, file: str) -> torch.Tensor: + try: + from df.enhance import 
load_audio + except ImportError as e: + raise ImportError("Please install deepfilternet package") from e + audio, _ = load_audio(file=file, sr=self.sample_rate, **self.kwargs) + return audio + + def clean_audio(self, data: torch.Tensor) -> torch.Tensor: + try: + from df.enhance import enhance + except ImportError as e: + raise ImportError("Please install deepfilternet package") from e + return enhance( + model=self.model, + df_state=self.df_state, + audio=data, + pad=self.pad, + atten_lim_db=self.atten_lim_db, + ) + + +def _multiprocessing_complete_tasks( + noise_reduce_type: Type[ReduceNoiseBase], + noise_reduce_arguments: dict, + tasks_queue: Queue, + results_queue: Queue, +): + """ + Complete the tasks in the given queue and put the results in the given results queue. The function will stop when + the given tasks queue will receive the stop mark. It is aimed to be used with multiprocessing as a process. + + :param noise_reduce_type: The noise reduce type to use. + :param noise_reduce_arguments: The noisereduce initialization kwargs. + :param tasks_queue: A queue to get the tasks from. + :param results_queue: A queue to put the results in. + """ + # Initialize the reduce noise object + noise_reducer = noise_reduce_type(**noise_reduce_arguments) + + # Start listening to the tasks queue: + while True: + # Get the audio_file: + audio_file = tasks_queue.get() + if audio_file == _MULTIPROCESSING_STOP_MARK: + break + audio_file = Path(audio_file) + # Apply noise reduction and collect the result: + results_queue.put(noise_reducer.reduce_noise(audio_file=audio_file)) + + # Mark the end of the tasks: + results_queue.put(_MULTIPROCESSING_STOP_MARK) + + +def reduce_noise_dfn( + audio_source: str, + target_directory: str, + pad: bool = True, + atten_lim_db: int = None, + silence_threshold: float = None, + use_multiprocessing: int = 0, + verbose: bool = True, + **kwargs, +): + """ + Reduce noise from audio files using DeepFilterNet. 
+ For more information about the noise reduction algorithm see: + https://github.com/Rikorose/DeepFilterNet + Notice that the saved files are in wav format, even if the original files are in other format. + + :param audio_source: path to audio file or directory of audio files + :param target_directory: path to target directory to save cleaned audio files + :param pad: whether to pad the audio file with zeros before cleaning + :param atten_lim_db: maximum attenuation in dB + :param silence_threshold: the threshold to remove silence from the audio, in dB. If None, no silence removal is + performed. + :param use_multiprocessing: Number of processes to use for cleaning the audio files. + If 0, no multiprocessing is used. + :param verbose: verbosity level. If True, display progress bar and logs. + :param kwargs: additional arguments to pass to torchaudio.load(). For more information see: + https://pytorch.org/audio/stable/generated/torchaudio.load.html + """ + if verbose: + _LOGGER.info("Reducing noise from audio files.") + + # create target directory: + target_directory = _create_target_directory(target_directory) + + # get audio files: + audio_files = _get_audio_files(audio_source) + + noise_reduce_arguments = { + "target_directory": target_directory, + "pad": pad, + "atten_lim_db": atten_lim_db, + "silence_threshold": silence_threshold, + **kwargs, + } + + if use_multiprocessing: + results = _parallel_run( + noise_reduce_type=DFN, + noise_reduce_arguments=noise_reduce_arguments, + n_workers=use_multiprocessing, + audio_files=audio_files, + description="Noise-reduction", + verbose=verbose, + ) + else: + results = _run( + noise_reduce_type=DFN, + noise_reduce_arguments=noise_reduce_arguments, + audio_files=audio_files, + description="Noise-reduction", + verbose=verbose, + ) + + return _process_results(results, verbose) + + +def reduce_noise( + audio_source: str, + target_directory: str, + sample_rate: int = 16000, + duration: int = None, + channel: int = None, + 
silence_threshold: float = None, + use_multiprocessing: int = 0, + verbose: bool = True, +): + """ + Reduce noise from audio file or directory containing audio files. + The audio files must be in .wav format. + The cleaned audio files will be saved in the target_directory. + For information about the noise reduction algorithm see: + https://github.com/timsainb/noisereduce + Notice that the saved files are in wav format, even if the original files are in other format. + + :param audio_source: path to audio file or directory containing audio files + :param target_directory: path to directory to save the cleaned audio files. + :param sample_rate: Number of samples in one second in the audio file. + Pass `None` to keep the original sample rate. + :param duration: Duration of the audio file to clean in seconds. + Pass `None` to keep the original duration. + :param channel: Channel to clean. Pass the number of the channel to clean. + To clean all channels pass None. + :param silence_threshold: The threshold to remove silence from the audio, in dB. + If None, no silence removal is performed. + :param use_multiprocessing: Number of processes to use for cleaning the audio files. + If 0, no multiprocessing is used. + :param verbose: Verbosity level. If True, display progress bar. 
+ """ + if verbose: + _LOGGER.info("Reducing noise from audio files.") + + # create target directory: + target_directory = _create_target_directory(target_directory) + + # get audio files: + audio_files = _get_audio_files(audio_source) + + # Create the reduce noise object: + noise_reduce_arguments = { + "target_directory": target_directory, + "sample_rate": sample_rate, + "duration": duration, + "channel": channel, + "silence_threshold": silence_threshold, + } + + if use_multiprocessing: + results = _parallel_run( + noise_reduce_type=ReduceNoise, + noise_reduce_arguments=noise_reduce_arguments, + n_workers=use_multiprocessing, + audio_files=audio_files, + description="Noise-reduction", + verbose=verbose, + ) + else: + results = _run( + noise_reduce_type=ReduceNoise, + noise_reduce_arguments=noise_reduce_arguments, + audio_files=audio_files, + description="Noise-reduction", + verbose=verbose, + ) + + return _process_results(results, verbose) + + +def _create_target_directory(target_directory: str) -> str: + target_directory = Path(target_directory) + if not target_directory.exists(): + target_directory.mkdir(parents=True, exist_ok=True) + return str(target_directory) + + +def _get_audio_files(audio_source: str): + audio_source = Path(audio_source) + audio_files = [] + if audio_source.is_dir(): + audio_files = list(audio_source.glob("*.*")) + elif audio_source.is_file(): + audio_files.append(audio_source) + else: + raise ValueError( + f"audio_source must be a file or a directory, got {audio_source}" + ) + return audio_files + + +def _parallel_run( + noise_reduce_type: Type[ReduceNoiseBase], + noise_reduce_arguments: dict, + n_workers: int, + audio_files: List[Path], + description: str, + verbose: bool, +) -> List[Tuple[bool, Tuple[str, str]]]: + """ + Run multiple noise reduce workers with multiprocessing to complete the tasks that will be created on the provided + files using the given task creator. + + :param noise_reduce_type: The noise reduce type to use. 
+ :param n_workers: The number of workers to use. + :param audio_files: The audio files to use. + :param description: The description to use for the progress bar. + :param verbose: Verbosity. + + :returns: The collected results. + """ + # Check the number of workers: + if n_workers > len(audio_files): + _LOGGER.warning( + f"The number of workers ({n_workers}) is larger than the number of audio files ({len(audio_files)}). " + f"Setting the number of workers to {len(audio_files)}." + ) + n_workers = len(audio_files) + + # Initialize the multiprocessing queues: + tasks_queue = Queue() + results_queue = Queue() + + # Initialize the multiprocessing processes: + task_completion_processes = [ + Process( + target=_multiprocessing_complete_tasks, + kwargs={ + "noise_reduce_type": noise_reduce_type, + "noise_reduce_arguments": noise_reduce_arguments, + "tasks_queue": tasks_queue, + "results_queue": results_queue, + }, + ) + for _ in range(n_workers) + ] + + # Start the multiprocessing processes: + for p in task_completion_processes: + p.start() + + # Put the tasks in the queue: + for audio_file in audio_files: + # tasks_queue.put(task_creator.create_task(audio_file=audio_file).to_tuple()) + tasks_queue.put(audio_file) + + # Put the stop marks in the queue: + for _ in range(n_workers): + tasks_queue.put(_MULTIPROCESSING_STOP_MARK) + + # Collect the results: + results = [] + stop_marks_counter = 0 + with tqdm( + desc=description, + unit="file", + total=len(audio_files), + disable=not verbose, + ) as progressbar: + while True: + # Get a result from the queue: + result: Tuple[bool, Tuple[str, str]] = results_queue.get() + if result == _MULTIPROCESSING_STOP_MARK: + stop_marks_counter += 1 + if stop_marks_counter == n_workers: + break + else: + # Collect the result: + results.append(result) + progressbar.update(1) + + # Wait for the processes to finish: + for p in task_completion_processes: + p.join() + + return results + + +def _run( + noise_reduce_type: Type[ReduceNoiseBase], + 
noise_reduce_arguments: dict, + audio_files: List[Path], + description: str, + verbose: bool, +) -> List[Tuple[bool, Tuple[str, str]]]: + """ + Run the noise reduce algorithm on the given audio files and collect the results. + + :param noise_reduce_type: The noise reduce type to use. + :param noise_reduce_arguments: The noisereduce initialization kwargs. + :param audio_files: The audio files to use. + :param description: The description to use for the progress bar. + :param verbose: Verbosity. + + :returns: The collected results. + """ + # Create the reduce noise object: + noise_reducer = noise_reduce_type(**noise_reduce_arguments) + + # Run the noise reduce algorithm on the audio files and collect the results: + results = [] + for audio_file in tqdm( + audio_files, + desc=description, + unit="file", + total=len(audio_files), + disable=not verbose, + ): + results.append(noise_reducer.reduce_noise(audio_file=audio_file)) + + return results + + +def _process_results( + results: List[Tuple[bool, Tuple[str, str]]], verbose: bool +) -> Tuple[dict, dict]: + """ + Process the results of the tasks. + + :param results: The results to process. + :param verbose: Verbosity. + + :returns: The processed results as a tuple of successes and errors. 
+ """ + if verbose: + _LOGGER.info("Summarizing the results.") + successes = {} + errors = {} + for is_error, result in results: + if is_error: + errors[result[0]] = result[1] + else: + successes[result[0]] = result[1] + if verbose: + _LOGGER.info(f"Done ({len(successes)}/{len(successes) + len(errors)})\n") + + return successes, errors diff --git a/noise_reduction/requirements.txt b/noise_reduction/requirements.txt new file mode 100644 index 000000000..30934ad7c --- /dev/null +++ b/noise_reduction/requirements.txt @@ -0,0 +1,5 @@ +tqdm +deepfilternet +librosa +noisereduce +torchaudio>=2.1.2 \ No newline at end of file diff --git a/noise_reduction/test_noise_reduction.py b/noise_reduction/test_noise_reduction.py new file mode 100644 index 000000000..a77377565 --- /dev/null +++ b/noise_reduction/test_noise_reduction.py @@ -0,0 +1,75 @@ +import tempfile + +import mlrun +import pytest + + +@pytest.mark.parametrize( + "audio_source", + [ + "data/test_data.wav", + "data/test_data.mp3", + "data", + ], +) +def test_reduce_noise(audio_source): + # set up the project and function + artifact_path = tempfile.TemporaryDirectory().name + project = mlrun.new_project("noise-reduction") + noise_reduction_function = project.set_function( + func="function.yaml", + name="reduce_noise", + kind="job", + image="mlrun/mlrun", + ) + + # run the function + noise_reduction_run = noise_reduction_function.run( + handler="reduce_noise", + inputs={"audio_source": audio_source}, + params={ + "target_directory": artifact_path + "/data", + "sample_rate": None, + }, + local=True, + artifact_path=artifact_path, + returns=["successes: file", "errors: file"], + ) + + assert noise_reduction_run.outputs["successes"] + + +@pytest.mark.parametrize( + "audio_source", + [ + "data/test_data.wav", + "data/test_data.mp3", + "data", + ], +) +def test_reduce_noise_dfn(audio_source): + # set up the project and function + artifact_path = tempfile.TemporaryDirectory().name + project = 
mlrun.new_project("noise-reduction") + noise_reduction_function = project.set_function( + func="function.yaml", + name="reduce_noise", + kind="job", + image="mlrun/mlrun", + ) + + # run the function + noise_reduction_run = noise_reduction_function.run( + handler="reduce_noise_dfn", + inputs={"audio_source": audio_source}, + params={ + "target_directory": artifact_path + "/data", + "atten_lim_db": 50, + }, + local=True, + artifact_path=artifact_path, + returns=["successes: file", "errors: file"], + ) + + # assert that the function run completed successfully + assert noise_reduction_run.outputs["successes"] diff --git a/pandas_profiling_report/README.md b/pandas_profiling_report/README.md deleted file mode 100644 index 40e0c9b22..000000000 --- a/pandas_profiling_report/README.md +++ /dev/null @@ -1,26 +0,0 @@ -## pandas_profiling_report - -Creates an html report with various graphs/statistics/correlations for a given dataset. See sample report [here](https://pandas-profiling.github.io/pandas-profiling/examples/master/titanic/titanic_report.html). Link to GitHub page [here](https://github.com/pandas-profiling/pandas-profiling). 
- - -Usage example: - -```python -import mlrun, os -mlrun.mlconf.dbpath = 'http://mlrun-api:8080' - -# Load pandas_profiling_report function from Github -func = mlrun.import_function("hub://pandas_profiling_report").apply(mlrun.mount_v3io()) - -# Build MLRun image (only needs to be run once) -func.deploy() - -# Create task -data = 'https://iguazio-sample-data.s3.amazonaws.com/datasets/iris_dataset.csv' - -task = NewTask(name="pandas-profiling-report", - inputs={"data": DATA_URL}) - -# Run task on cluster -run = func.run(task, artifact_path='/User/artifacts') -``` diff --git a/pandas_profiling_report/function.yaml b/pandas_profiling_report/function.yaml deleted file mode 100644 index ffdbbf837..000000000 --- a/pandas_profiling_report/function.yaml +++ /dev/null @@ -1,40 +0,0 @@ -kind: job -metadata: - name: pandas-profiling-report - tag: '' - hash: 79fe77fb2920a8ffecfef2f614a0be494c2ea43b - project: '' - labels: - author: nicks - categories: - - data-analysis -spec: - command: '' - args: [] - image: mlrun/mlrun - env: [] - default_handler: pandas_profiling_report - entry_points: - pandas_profiling_report: - name: pandas_profiling_report - doc: Create a Pandas Profiling Report for a dataset. 
- parameters: - - name: context - type: MLClientCtx - doc: the function context - default: '' - - name: data - type: DataItem - doc: Dataset to create report for - default: '' - outputs: - - default: '' - lineno: 10 - description: Create Pandas Profiling Report from Dataset - build: - functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHBhbmRhcyBhcyBwZAppbXBvcnQgcGFuZGFzX3Byb2ZpbGluZwoKZnJvbSBtbHJ1bi5leGVjdXRpb24gaW1wb3J0IE1MQ2xpZW50Q3R4CmZyb20gbWxydW4uZGF0YXN0b3JlIGltcG9ydCBEYXRhSXRlbQoKCmRlZiBwYW5kYXNfcHJvZmlsaW5nX3JlcG9ydCgKICAgIGNvbnRleHQ6IE1MQ2xpZW50Q3R4LAogICAgZGF0YTogRGF0YUl0ZW0sCikgLT4gTm9uZToKICAgICIiIkNyZWF0ZSBhIFBhbmRhcyBQcm9maWxpbmcgUmVwb3J0IGZvciBhIGRhdGFzZXQuCiAgICA6cGFyYW0gY29udGV4dDogICAgICAgICB0aGUgZnVuY3Rpb24gY29udGV4dAogICAgOnBhcmFtIGRhdGE6ICAgICAgICAgICAgRGF0YXNldCB0byBjcmVhdGUgcmVwb3J0IGZvcgogICAgIiIiCgogICAgZGYgPSBkYXRhLmFzX2RmKCkKCiAgICBwcm9maWxlID0gZGYucHJvZmlsZV9yZXBvcnQodGl0bGU9IlBhbmRhcyBQcm9maWxpbmcgUmVwb3J0IikKCiAgICBjb250ZXh0LmxvZ19hcnRpZmFjdCgKICAgICAgICAiUGFuZGFzIFByb2ZpbGluZyBSZXBvcnQiLAogICAgICAgIGJvZHk9cHJvZmlsZS50b19odG1sKCksCiAgICAgICAgbG9jYWxfcGF0aD0icGFuZGFzX3Byb2ZpbGluZ19yZXBvcnQuaHRtbCIsCiAgICApCg== - commands: - - python -m pip install pandas_profiling - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/pandas_profiling_report/pandas_profiling_report.py - affinity: null -verbose: false diff --git a/pandas_profiling_report/item.yaml b/pandas_profiling_report/item.yaml deleted file mode 100644 index 13d374369..000000000 --- a/pandas_profiling_report/item.yaml +++ /dev/null @@ -1,25 +0,0 @@ -apiVersion: v1 -categories: -- data-analysis -description: Create Pandas Profiling Report from Dataset -doc: '' -example: pandas_profiling_report.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: nicks -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: pandas-profiling-report 
-platformVersion: 3.5.0 -spec: - filename: pandas_profiling_report.py - handler: pandas_profiling_report - image: mlrun/mlrun - kind: job - requirements: - - pandas_profiling -url: '' -version: 1.1.0 diff --git a/pandas_profiling_report/pandas_profiling_report.ipynb b/pandas_profiling_report/pandas_profiling_report.ipynb deleted file mode 100644 index 61aeba265..000000000 --- a/pandas_profiling_report/pandas_profiling_report.ipynb +++ /dev/null @@ -1,794 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Pandas Profiling Report" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'job'\n", - "%nuclio: setting spec.image to 'mlrun/mlrun'\n" - ] - } - ], - "source": [ - "%nuclio config kind = \"job\"\n", - "%nuclio config spec.image = \"mlrun/mlrun\"" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "%%nuclio cmd -c\n", - "pip install pandas_profiling" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import pandas_profiling\n", - "\n", - "from mlrun.execution import MLClientCtx\n", - "from mlrun.datastore import DataItem" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "def pandas_profiling_report(\n", - " context: MLClientCtx,\n", - " data: DataItem,\n", - ") -> None:\n", - " \"\"\"Create a Pandas Profiling Report for a dataset.\n", - " :param context: the function context\n", - " :param data: Dataset to create report for\n", - " \"\"\"\n", - " \n", - " # Load dataset\n", - " df = data.as_df()\n", - " \n", - " # Create Pandas Profiling 
Report\n", - " profile = df.profile_report(title='Pandas Profiling Report')\n", - " \n", - " # Save to MLRun DB\n", - " context.log_artifact('Pandas Profiling Report',\n", - " body=profile.to_html(),\n", - " local_path='pandas_profiling_report.html')" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### mlconfig" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import mlconf\n", - "import os\n", - "\n", - "mlconf.dbpath = 'http://mlrun-api:8080'\n", - "mlconf.artifact_path = mlconf.artifact_path or f'{os.environ[\"HOME\"]}/artifacts'" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### save" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2020-10-15 19:21:40,986 [info] function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from mlrun import code_to_function\n", - "\n", - "# create job function object from notebook code\n", - "fn = code_to_function(\"pandas_profiling_report\", kind=\"job\")\n", - "\n", - "# add metadata (for templates and reuse)\n", - "fn.spec.default_handler = \"pandas_profiling_report\"\n", - "fn.spec.description = \"Create Pandas Profiling Report from Dataset\"\n", - "fn.metadata.categories = [\"analysis\"]\n", - "fn.metadata.labels = {\"author\": \"nicks\"}\n", - "fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## tests" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - 
"execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from mlrun.platforms import auto_mount\n", - "fn.apply(auto_mount())" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import NewTask, run_local\n", - "\n", - "DATA_URL = 'https://iguazio-sample-data.s3.amazonaws.com/datasets/iris_dataset.csv'" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "task = NewTask(name=\"pandas-profiling-report\", \n", - " handler=pandas_profiling_report, \n", - " inputs={\"data\": DATA_URL})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### run locally" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2020-10-15 19:21:41,030 [warning] warning!, server (0.5.1) and client (0.5.2) ver dont match\n", - "> 2020-10-15 19:21:41,031 [info] starting run pandas-profiling-report uid=0894aed4f2854d96b776e25bdcaff80e -> http://mlrun-api:8080\n", - "> 2020-10-15 19:21:41,062 [warning] warning!, server (0.5.1) and client (0.5.2) ver dont match\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "86c3397cc7384565815af90bc5a6d10b", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=0.0, description='Summarize dataset', max=19.0, style=ProgressStyle(descrip…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "e7dece2ab7184c909611cf0aed3ef474", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=0.0, description='Generate report structure', max=1.0, style=ProgressStyle(…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": 
"stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "7153ac93afcd4e77a4b5a312766af995", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=0.0, description='Render HTML', max=1.0, style=ProgressStyle(description_wi…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Oct 15 19:21:41 | completed | pandas-profiling-report | v3io_user=nicks, kind=handler, owner=nicks, host=nicks-jupyter-76668bdd46-g9sxf | data | | | Pandas Profiling Report
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 0894aed4f2854d96b776e25bdcaff80e --project default , !mlrun logs 0894aed4f2854d96b776e25bdcaff80e --project default\n", - "> 2020-10-15 19:21:52,944 [info] run executed, status=completed\n" - ] - } - ], - "source": [ - "run = run_local(task)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### run remotely" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "# Create MLRun image (only needs to be run once)\n", - "fn.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2020-10-15 19:23:17,199 [info] starting run pandas-profiling-report uid=0ab5c8dbff95471da6018c1a7afd3b22 -> http://mlrun-api:8080\n", - "> 2020-10-15 19:23:17,303 [info] Job is running in the background, pod: pandas-profiling-report-xr48m\n", - "Summarize dataset: 100%|██████████| 19/19 [00:05<00:00, 3.78it/s, Completed] \n", - "Generate report structure: 100%|██████████| 1/1 [00:02<00:00, 2.22s/it]\n", - "> 2020-10-15 19:23:33,779 [info] run executed, status=completed\n", - "Render HTML: 100%|██████████| 1/1 [00:00<00:00, 2.07it/s]\n", - "final state: succeeded\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Oct 15 19:23:25 | completed | pandas-profiling-report | v3io_user=nicks, kind=job, owner=nicks, host=pandas-profiling-report-xr48m | data | | | Pandas Profiling Report
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 0ab5c8dbff95471da6018c1a7afd3b22 --project default , !mlrun logs 0ab5c8dbff95471da6018c1a7afd3b22 --project default\n", - "> 2020-10-15 19:23:36,481 [info] run executed, status=completed\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.run(task, inputs={\"data\": DATA_URL})" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/pandas_profiling_report/pandas_profiling_report.py b/pandas_profiling_report/pandas_profiling_report.py deleted file mode 100644 index c3d3d4d32..000000000 --- a/pandas_profiling_report/pandas_profiling_report.py +++ /dev/null @@ -1,41 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -# -# Generated by nuclio.export.NuclioExporter - -import pandas as pd -import pandas_profiling - -from mlrun.execution import MLClientCtx -from mlrun.datastore import DataItem - - -def pandas_profiling_report( - context: MLClientCtx, - data: DataItem, -) -> None: - """Create a Pandas Profiling Report for a dataset. - :param context: the function context - :param data: Dataset to create report for - """ - - df = data.as_df() - - profile = df.profile_report(title="Pandas Profiling Report") - - context.log_artifact( - "Pandas Profiling Report", - body=profile.to_html(), - local_path="pandas_profiling_report.html", - ) diff --git a/project_runner/function.yaml b/project_runner/function.yaml deleted file mode 100644 index 21a2f7346..000000000 --- a/project_runner/function.yaml +++ /dev/null @@ -1,53 +0,0 @@ -kind: remote -metadata: - name: project-runner - tag: '' - hash: b7888996aa9a7833972928fa06fa238f674099b3 - project: '' - labels: - author: orz - categories: - - utils -spec: - command: '' - args: [] - image: '' - entry_points: - init_context: - name: init_context - doc: '' - parameters: - - name: context - outputs: [] - lineno: 8 - handler: - name: handler - doc: "Imports the latest project version and runs the \nspecified workflow" - parameters: - - name: context - - name: event - outputs: [] - lineno: 11 - description: Nuclio based - Cron scheduler for running your MLRun projects - min_replicas: 1 - max_replicas: 1 - env: [] - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - annotations: - nuclio.io/generated_by: function generated from 02-07-2020 by admin - labels: {} - name: project-runner - spec: - build: - baseImage: mlrun/mlrun - commands: [] - functionSourceCode: 
IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKZnJvbSBtbHJ1biBpbXBvcnQgbG9hZF9wcm9qZWN0CmZyb20gbWxydW4gaW1wb3J0IG1sY29uZgppbXBvcnQganNvbgppbXBvcnQgb3MKCmRlZiBpbml0X2NvbnRleHQoY29udGV4dCk6CiAgICBzZXRhdHRyKGNvbnRleHQsICdodWJfdXJsJywgb3MuZ2V0ZW52KCdodWJfdXJsJywgTm9uZSkpCgpkZWYgaGFuZGxlcihjb250ZXh0LCBldmVudCk6CiAgICAiIiJJbXBvcnRzIHRoZSBsYXRlc3QgcHJvamVjdCB2ZXJzaW9uIGFuZCBydW5zIHRoZSAKICAgIHNwZWNpZmllZCB3b3JrZmxvdwogICAgIiIiCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKCdQdWxsaW5nIHByb2plY3QgYW5kIHdvcmtmbG93IGRldGFpbHMnKQogICAgaWYgaXNpbnN0YW5jZShldmVudC5ib2R5LCBkaWN0KToKICAgICAgICBkZXRhaWxzID0gZXZlbnQuYm9keQogICAgZWxzZToKICAgICAgICBkZXRhaWxzID0ganNvbi5sb2FkcyhldmVudC5ib2R5KQogICAgY29udGV4dC5sb2dnZXIuaW5mbyhkZXRhaWxzKQogICAgcHJvamVjdF91cmwgPSBkZXRhaWxzWydwcm9qZWN0X3VybCddCiAgICB3b3JrZmxvdyA9IGRldGFpbHNbJ3dvcmtmbG93J10KICAgIGFydGlmYWN0X3BhdGggPSBkZXRhaWxzLmdldCgnYXJ0aWZhY3RfcGF0aCcsIG9zLmVudmlyb24uZ2V0KCdhcnRpZmFjdF9wYXRoJywgTm9uZSkpCiAgICBodWJfdXJsID0gZGV0YWlscy5nZXQoJ2h1Yl91cmwnLCBjb250ZXh0Lmh1Yl91cmwpCgogICAgaWYgaHViX3VybDoKICAgICAgICBtbGNvbmYuaHViX3VybCA9IGh1Yl91cmwKCiAgICBwcm9qZWN0PSBsb2FkX3Byb2plY3Qob3MucGF0aC5hYnNwYXRoKCcuL2xvYWRlZF9wcm9qZWN0JyksIHVybD1wcm9qZWN0X3VybCkKICAgIHByb2plY3QucnVuKG5hbWU9d29ya2Zsb3csCiAgICAgICAgICAgICAgICBhcmd1bWVudHM9e30sCiAgICAgICAgICAgICAgICBhcnRpZmFjdF9wYXRoPWFydGlmYWN0X3BhdGgsCiAgICAgICAgICAgICAgICB3YXRjaD1GYWxzZSkKCg== - noBaseImagesPull: true - env: [] - handler: project_runner:handler - runtime: python:3.9 - volumes: [] - source: '' diff --git a/project_runner/project_runner.ipynb b/project_runner/project_runner.ipynb deleted file mode 100644 index 04bebea12..000000000 --- a/project_runner/project_runner.ipynb +++ /dev/null @@ -1,340 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project runner\n", - "Imports the latest project version and runs the specified workflow" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "import 
nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 61, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting spec.build.baseImage to 'mlrun/mlrun'\n" - ] - } - ], - "source": [ - "%nuclio config spec.build.baseImage = \"mlrun/mlrun\"" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: start-code" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import load_project\n", - "from mlrun import mlconf\n", - "import json\n", - "import os" - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "def init_context(context):\n", - " setattr(context, 'hub_url', os.getenv('hub_url', None))" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [], - "source": [ - "def handler(context, event):\n", - " \"\"\"Imports the latest project version and runs the \n", - " specified workflow\n", - " \"\"\"\n", - " context.logger.info('Pulling project and workflow details')\n", - " if isinstance(event.body, dict):\n", - " details = event.body\n", - " else:\n", - " details = json.loads(event.body)\n", - " context.logger.info(details)\n", - " project_url = details['project_url']\n", - " workflow = details['workflow']\n", - " artifact_path = details.get('artifact_path', os.environ.get('artifact_path', None))\n", - " hub_url = details.get('hub_url', context.hub_url)\n", - "\n", - " if hub_url:\n", - " mlconf.hub_url = hub_url\n", - "\n", - " project= load_project(os.path.abspath('./loaded_project'), url=project_url)\n", - " project.run(name=workflow,\n", - " arguments={},\n", - " artifact_path=artifact_path,\n", - " watch=False)\n" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - 
"cell_type": "code", - "execution_count": 63, - "metadata": {}, - "outputs": [], - "source": [ - "import json\n", - "runner_event = {'project_url': '/User/demo-network-operations/project.yaml',\n", - " 'workflow': 'main',\n", - " 'hub_url': '/User/functions/{name}/function.yaml',\n", - " 'artifact_path': '/User/functions/project_runner/artifacts/'}" - ] - }, - { - "cell_type": "code", - "execution_count": 46, - "metadata": { - "tags": [] - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Python> 2020-07-01 14:36:45,368 [info] \n", - "Python> 2020-07-01 14:36:45,369 [info] {'project_url': '/User/demo-network-operations/project.yaml', 'workflow': 'main', 'hub_url': '/User/functions/{name}/function.yaml', 'artifact_path': '/User/functions/project_runner/artifacts/'}\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"JsonArray\" based on the value \"['cpu_utilization', 'throughput', 'packet_loss', 'latency']\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"JsonArray\" based on the value \"['mean', 'sum', 'std', 'var', 'min', 'max', 'median']\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"Integer\" based on the value \"20\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"Float\" based 
on the value \"0.3\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"JsonArray\" based on the value \"[1, 0]\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"Integer\" based on the value \"-1\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"Float\" based on the value \"0.1\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n", - "/conda/lib/python3.6/site-packages/kfp/components/_data_passing.py:168: UserWarning: Missing type name was inferred as \"Float\" based on the value \"0.75\".\n", - " warnings.warn('Missing type name was inferred as \"{}\" based on the value \"{}\".'.format(type_name, str(value)))\n" - ] - }, - { - "data": { - "text/html": [ - "Experiment link here" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "Run link here" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-07-01 14:36:46,646 Pipeline run id=cf85ec1b-2df7-403c-b7b4-b9bdb8fcf92f, check UI or DB for progress\n" - ] - } - ], - "source": [ - "init_context(context)\n", - "event = nuclio.Event(body=json.dumps(runner_event))\n", - "out = handler(context, event)\n", - "out" - ] - }, - { - "cell_type": "markdown", - 
"metadata": {}, - "source": [ - "# Deployment" - ] - }, - { - "cell_type": "code", - "execution_count": 48, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import code_to_function, mount_v3io\n", - "from nuclio.triggers import CronTrigger" - ] - }, - { - "cell_type": "code", - "execution_count": 89, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-07-02 09:30:36,014 function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 89, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Saving the function for import via hub://project_runner\n", - "fn = code_to_function(name='project-runner',\n", - " kind='nuclio')\n", - "fn.spec.description = 'Nuclio based - Cron scheduler for running your MLRun projects'\n", - "fn.metadata.categories = [\"utils\"]\n", - "fn.metadata.labels = {'author': 'orz'}\n", - "fn.spec.maxReplicas = 1\n", - "fn.export('function.yaml')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### How to call from your project?\n", - "> **After** importing the function" - ] - }, - { - "cell_type": "code", - "execution_count": 90, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 90, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "cron_string = '* * 1 * *' # Regular cron string as in https://pypi.org/project/croniter/\n", - "\n", - "# Set defaults\n", - "fn.set_envs({'artifact_path': '/User/functions/project_runner/artifacts/',\n", - " 'hub_url': '/User/functions/{name}/function.yaml'})\n", - "\n", - "# Set project and workflow event\n", - "runner_event = {'project_url': '/User/demo-network-operations/project.yaml',\n", - " 'workflow': 'main'}\n", - "\n", - "# Add as a trigger\n", - "fn.add_trigger('cron', \n", - " CronTrigger(schedule=cron_string,\n", - " 
body=json.dumps(runner_event),\n", - " headers={'X-Nuclio-Target': 'project-runner'}))\n", - "\n", - "# Add mount for access to the different directories\n", - "fn.apply(mount_v3io())" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-07-02 09:31:16,905 deploy started\n", - "[nuclio] 2020-07-02 09:31:19,021 (info) Build complete\n" - ] - } - ], - "source": [ - "fn.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python [conda env:root] *", - "language": "python", - "name": "conda-root-py" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/rnn_serving/function.yaml b/rnn_serving/function.yaml deleted file mode 100644 index 0078e4028..000000000 --- a/rnn_serving/function.yaml +++ /dev/null @@ -1,46 +0,0 @@ -kind: serving -metadata: - name: rnn-serving - tag: '' - hash: 548cd27edfdc49aed0b069d94bd049435d484722 - project: '' - labels: - author: Daniel - categories: - - model-serving - - machine-learning -spec: - command: '' - args: [] - image: mlrun/ml-models - description: deploy an rnn based stock analysis model server. 
- min_replicas: 1 - max_replicas: 4 - env: [] - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - name: rnn-serving - labels: {} - annotations: - nuclio.io/generated_by: function generated from /User/test/functions/rnn_serving/rnn_serving.py - spec: - runtime: python:3.9 - handler: rnn_serving:handler - env: [] - volumes: [] - build: - commands: [] - noBaseImagesPull: true - functionSourceCode: aW1wb3J0IG1scnVuCmltcG9ydCBudW1weSBhcyBucApmcm9tIHRlbnNvcmZsb3cgaW1wb3J0IGtlcmFzCmltcG9ydCBqc29uCgoKY2xhc3MgUk5OX01vZGVsX1NlcnZpbmcobWxydW4uc2VydmluZy5WMk1vZGVsU2VydmVyKToKICAgIGRlZiBsb2FkKHNlbGYpOgogICAgICAgICIiImxvYWQgYW5kIGluaXRpYWxpemUgdGhlIG1vZGVsIGFuZC9vciBvdGhlciBlbGVtZW50cyIiIgogICAgICAgIG1vZGVsX2ZpbGUsIGV4dHJhX2RhdGEgPSBzZWxmLmdldF9tb2RlbChzdWZmaXg9Ii5oNSIpCiAgICAgICAgc2VsZi5tb2RlbCA9IGtlcmFzLm1vZGVscy5sb2FkX21vZGVsKG1vZGVsX2ZpbGUpCgogICAgZGVmIHByZWRpY3Qoc2VsZiwgYm9keSk6CiAgICAgICAgdHJ5OgogICAgICAgICAgICAiIiJHZW5lcmF0ZSBtb2RlbCBwcmVkaWN0aW9ucyBmcm9tIHNhbXBsZS4iIiIKICAgICAgICAgICAgZmVhdHMgPSBucC5hc2FycmF5KGJvZHlbJ2lucHV0cyddKQogICAgICAgICAgICByZXN1bHQgPSBzZWxmLm1vZGVsLnByZWRpY3QoZmVhdHMpCiAgICAgICAgICAgIHJlc3VsdCA9IGpzb24uZHVtcHMocmVzdWx0LnRvbGlzdCgpKQogICAgICAgICAgICByZXR1cm4gcmVzdWx0CiAgICAgICAgZXhjZXB0IEV4Y2VwdGlvbiBhcyBlOgogICAgICAgICAgICByYWlzZSBFeGNlcHRpb24oIkZhaWxlZCB0byBwcmVkaWN0ICVzIiAlIGUpCmZyb20gbWxydW4ucnVudGltZXMgaW1wb3J0IG51Y2xpb19pbml0X2hvb2sKZGVmIGluaXRfY29udGV4dChjb250ZXh0KToKICAgIG51Y2xpb19pbml0X2hvb2soY29udGV4dCwgZ2xvYmFscygpLCAnc2VydmluZ192MicpCgpkZWYgaGFuZGxlcihjb250ZXh0LCBldmVudCk6CiAgICByZXR1cm4gY29udGV4dC5tbHJ1bl9oYW5kbGVyKGNvbnRleHQsIGV2ZW50KQo= - source: '' - function_kind: serving_v2 - build: - commands: [] - code_origin: https://github.com/daniels290813/functions.git#97b63199864dd95681bca5af86835d177bf9d67b:/User/test/functions/rnn_serving/rnn_serving.py - origin_filename: /User/test/functions/rnn_serving/rnn_serving.py - secret_sources: [] - mount_applied: false - affinity: null -verbose: false diff --git 
a/rnn_serving/item.yaml b/rnn_serving/item.yaml deleted file mode 100644 index 5cc7b9367..000000000 --- a/rnn_serving/item.yaml +++ /dev/null @@ -1,25 +0,0 @@ -apiVersion: v1 -categories: -- model-serving -- machine-learning -description: deploy an rnn based stock analysis model server. -doc: '' -example: rnn_serving.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: Daniel -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: rnn-serving -platformVersion: 3.5.0 -spec: - filename: rnn_serving.py - handler: handler - image: mlrun/ml-models - kind: serving - requirements: null -url: '' -version: 1.1.0 diff --git a/rnn_serving/requirements.txt b/rnn_serving/requirements.txt deleted file mode 100644 index ff480e35d..000000000 --- a/rnn_serving/requirements.txt +++ /dev/null @@ -1,2 +0,0 @@ -tensorflow==2.8.2 -wget \ No newline at end of file diff --git a/rnn_serving/rnn_serving.ipynb b/rnn_serving/rnn_serving.ipynb deleted file mode 100644 index dbdf3b874..000000000 --- a/rnn_serving/rnn_serving.ipynb +++ /dev/null @@ -1,285 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# **RNN Serving**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The following section we create a new model serving function which wraps our class , and specify model and other resources.
\n", - "Deploying the serving function provides an HTTP endpoint that can handle requests in real time.
\n", - "This function is part of the [stock-analysis demo](https://github.com/mlrun/demos/tree/master/stock-analysis).
\n", - "To see how the model is trained or how the data-set is generated, check out code folder in the demo repository." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Steps**\n", - "\n", - "1. [Setup function parameters](#Setup-function-parameters)\n", - "2. [Importing the function](#Importing-the-function)\n", - "3. [Testing the function locally](#Testing-the-function-locally)\n", - "4. [Testing the function remotely](#Testing-the-function-remotely)" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import warnings\n", - "warnings.filterwarnings(\"ignore\")" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "# Following packages are required, make sure to install\n", - "# !pip install pip install torch==1.6.0\n", - "# !pip install tensorflow\n", - "# !pip install keras" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Setup function parameters**" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "# Setting up models path\n", - "rnn_model_path = 'https://s3.wasabisys.com/iguazio/models/function-marketplace-models/rnn_serving/rnn_model.h5'\n", - "data_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/rnn_serving/stocks_data.pkl'" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Importing the function**" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-17 10:43:46,363 [info] loaded project function-marketplace from MLRun DB\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import mlrun\n", - 
"mlrun.set_environment(project='function-marketplace')\n", - "\n", - "# Importing the function from the hub\n", - "fn = mlrun.import_function(\"hub://rnn_serving\")\n", - "fn.apply(mlrun.auto_mount())\n", - "\n", - "# Manually specifying needed packages \n", - "fn.spec.build.commands = ['pip install torch==1.6.0', 'pip install tensorflow', 'pip install keras']\n", - "\n", - "# Adding the model \n", - "fn.add_model(key='rnn_model', model_path=rnn_model_path ,class_name='RNN_Model_Serving')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Testing the function locally**" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-17 10:43:54,256 [info] model rnn_model was loaded\n", - "> 2021-10-17 10:43:54,257 [info] Initializing endpoint records\n", - "> 2021-10-17 10:43:54,276 [info] Loaded ['rnn_model']\n" - ] - } - ], - "source": [ - "# When mocking, class has to be present\n", - "from rnn_serving import *\n", - "\n", - "# Mocking function\n", - "server = fn.to_mock_server()" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "# Getting the data\n", - "import cloudpickle as cp\n", - "from urllib.request import urlopen\n", - "\n", - "rnn_data = cp.load(urlopen(data_path))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "model used in this example take inputs with the shape `(None, None, 11)`.
\n", - "where the first dimension is the number of instances, the second dimension is the number of timestamps
\n", - "and the last dimension is the number of features the dataset has.
\n", - "our testing dataset has `(1,10,11)` means one instance to predict, with sequence length of 10, each step has 11 features." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'id': '1bf6a3dc4d204e6e8bfd5834f5d691f1',\n", - " 'model_name': 'rnn_model',\n", - " 'outputs': '[[0.43563252687454224]]'}" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import requests\n", - "\n", - "# KFServing protocol event\n", - "event_data = {\"inputs\": rnn_data}\n", - "\n", - "response = server.test(path='/v2/models/rnn_model/predict',body=event_data)\n", - "response" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Testing the function remotely**" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-17 10:43:57,192 [info] Starting remote function deploy\n", - "2021-10-17 10:43:57 (info) Deploying function\n", - "2021-10-17 10:43:57 (info) Building\n", - "2021-10-17 10:43:57 (info) Staging files and preparing base images\n", - "2021-10-17 10:43:57 (info) Building processor image\n", - "2021-10-17 10:43:58 (info) Build complete\n", - "2021-10-17 10:44:10 (info) Function deploy complete\n", - "> 2021-10-17 10:44:11,677 [info] successfully deployed function: {'internal_invocation_urls': ['nuclio-function-marketplace-rnn-serving.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['default-tenant.app.dev39.lab.iguazeng.com:30255']}\n" - ] - } - ], - "source": [ - "address = fn.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'id': '1bf6a3dc4d204e6e8bfd5834f5d691f1',\n", - " 'model_name': 'rnn_model',\n", - " 'outputs': '[[0.43563252687454224]]'}" - ] - }, - "execution_count": 10, - 
"metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import json\n", - "import requests\n", - "\n", - "# using requests to predict\n", - "response = requests.put(address+\"/v2/models/rnn_model/predict\", json = json.dumps(event_data))\n", - "json.loads(response.text)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "[Back to the top](#RNN-Serving)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/rnn_serving/rnn_serving.py b/rnn_serving/rnn_serving.py deleted file mode 100644 index d7e783d7a..000000000 --- a/rnn_serving/rnn_serving.py +++ /dev/null @@ -1,35 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import mlrun -import numpy as np -from tensorflow import keras -import json - - -class RNN_Model_Serving(mlrun.serving.V2ModelServer): - def load(self): - """load and initialize the model and/or other elements""" - model_file, extra_data = self.get_model(suffix=".h5") - self.model = keras.models.load_model(model_file) - - def predict(self, body): - try: - """Generate model predictions from sample.""" - feats = np.asarray(body['inputs']) - result = self.model.predict(feats) - result = json.dumps(result.tolist()) - return result - except Exception as e: - raise Exception("Failed to predict %s" % e) \ No newline at end of file diff --git a/rnn_serving/test_rnn_serving.py b/rnn_serving/test_rnn_serving.py deleted file mode 100644 index fb2f49974..000000000 --- a/rnn_serving/test_rnn_serving.py +++ /dev/null @@ -1,74 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import os -import wget -from mlrun import import_function -from os import path -from rnn_serving import * - -DATASET = np.array([[6.9955170e-01, 6.9952875e-01, 2.7922913e-02, 2.7853036e-02, - 6.9955170e-01, 7.0086759e-01, 7.0118028e-01, 7.0142627e-01, - 2.7922913e-02, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 6.9998503e-01, 1.6527303e-03, 2.7853036e-02, - 7.0000792e-01, 7.0085293e-01, 7.0118028e-01, 7.0203447e-01, - 1.6527303e-03, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0025057e-01, 1.6904050e-04, 2.7853036e-02, - 7.0027345e-01, 7.0014298e-01, 7.0190376e-01, 7.0128226e-01, - 1.6904050e-04, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0144778e-01, 1.6904050e-04, 2.7853036e-02, - 7.0147055e-01, 7.0178574e-01, 7.0236105e-01, 7.0295709e-01, - 7.3906886e-03, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0324355e-01, 1.6904050e-04, 2.7853036e-02, - 7.0326620e-01, 7.0308524e-01, 7.0490342e-01, 7.0427048e-01, - 2.4815742e-03, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0324355e-01, 1.6904050e-04, 2.7853036e-02, - 7.0191067e-01, 7.0173001e-01, 7.0354480e-01, 7.0291305e-01, - 2.9976186e-03, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0324355e-01, 1.6904050e-04, 2.7853036e-02, - 7.0166123e-01, 7.0148063e-01, 7.0284635e-01, 7.0249581e-01, - 2.7904075e-03, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0324355e-01, 1.6904050e-04, 2.7853036e-02, - 7.0133996e-01, 7.0143080e-01, 7.0297277e-01, 7.0250750e-01, - 4.1491759e-04, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0324355e-01, 1.6904050e-04, 2.7853036e-02, - 7.0150572e-01, 7.0251614e-01, 7.0281982e-01, 7.0370042e-01, - 2.1256472e-03, 0.0000000e+00, 0.0000000e+00], - [6.9955170e-01, 7.0324355e-01, 1.6904050e-04, 2.7853036e-02, - 7.0272487e-01, 7.0258951e-01, 7.0429617e-01, 7.0376801e-01, - 1.4207334e-03, 0.0000000e+00, 0.0000000e+00]]).reshape(1, 10, 11).tolist() - - -def download_pretrained_model(model_path): - # Run this to download the pre-trained model to your 
`models` directory - model_location = 'https://s3.wasabisys.com/iguazio/models/rnn/rnn_model.h5' - saved_models_directory = model_path - # Create paths - os.makedirs(saved_models_directory, exist_ok=1) - model_filepath = os.path.join(saved_models_directory, os.path.basename(model_location)) - wget.download(model_location, model_filepath) - - -def test_rnn_serving(): - model_path = os.path.join(os.path.abspath('./'), 'models') - model = model_path + '/rnn_model.h5' - if not path.exists(model): - download_pretrained_model(model_path) - - fn = import_function('function.yaml') - fn.add_model('rnn_model', model_path=model, class_name='RNN_Model_Serving') - # create an emulator (mock server) from the function configuration) - server = fn.to_mock_server() - resp = server.test("/v2/models/rnn_model/infer", {"inputs": DATASET}) - assert (resp['outputs'] == '[[0.453309565782547]]') diff --git a/slack_notify/README.md b/slack_notify/README.md deleted file mode 100644 index 9bde32995..000000000 --- a/slack_notify/README.md +++ /dev/null @@ -1 +0,0 @@ -# Send Notification to Slack \ No newline at end of file diff --git a/slack_notify/function.yaml b/slack_notify/function.yaml deleted file mode 100644 index 95af087c0..000000000 --- a/slack_notify/function.yaml +++ /dev/null @@ -1,48 +0,0 @@ -kind: job -metadata: - name: slack-notify - tag: '' - hash: 3de7e78ed9b7928af192badf988055086431fb58 - project: '' - labels: - author: mdl - categories: - - utils -spec: - command: '' - args: [] - image: python:3.6-jessie - env: [] - default_handler: slack_notify - entry_points: - slack_notify: - name: slack_notify - doc: Summarize a table - parameters: - - name: context - type: MLClientCtx - doc: the function context - default: '' - - name: webhook_url - type: str - doc: 'Slack incoming webhook URL. Please read: https://api.slack.com/messaging/webhooks' - default: URL - - name: slack_blocks - type: List[str] - doc: Message blocks list. 
NOT IMPLEMENTED YET - default: [] - - name: notification_text - type: str - doc: Notification text - default: Notification - outputs: - - default: '' - lineno: 14 - description: Send Slack notification - build: - functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHdhcm5pbmdzCgp3YXJuaW5ncy5zaW1wbGVmaWx0ZXIoYWN0aW9uPSJpZ25vcmUiLCBjYXRlZ29yeT1GdXR1cmVXYXJuaW5nKQoKaW1wb3J0IG9zCmltcG9ydCBqc29uCmltcG9ydCByZXF1ZXN0cwpmcm9tIG1scnVuLmV4ZWN1dGlvbiBpbXBvcnQgTUxDbGllbnRDdHgKZnJvbSB0eXBpbmcgaW1wb3J0IExpc3QKCgpkZWYgc2xhY2tfbm90aWZ5KAogICAgY29udGV4dDogTUxDbGllbnRDdHgsCiAgICB3ZWJob29rX3VybDogc3RyID0gIlVSTCIsCiAgICBzbGFja19ibG9ja3M6IExpc3Rbc3RyXSA9IFtdLAogICAgbm90aWZpY2F0aW9uX3RleHQ6IHN0ciA9ICJOb3RpZmljYXRpb24iLAopIC0+IE5vbmU6CiAgICAiIiJTdW1tYXJpemUgYSB0YWJsZQogICAgOnBhcmFtIGNvbnRleHQ6ICAgICAgICAgdGhlIGZ1bmN0aW9uIGNvbnRleHQKICAgIDpwYXJhbSB3ZWJob29rX3VybDogICAgIFNsYWNrIGluY29taW5nIHdlYmhvb2sgVVJMLiBQbGVhc2UgcmVhZDogaHR0cHM6Ly9hcGkuc2xhY2suY29tL21lc3NhZ2luZy93ZWJob29rcwogICAgOnBhcmFtIG5vdGlmaWNhdGlvbl90ZXh0OiAgICAgICAgICAgIE5vdGlmaWNhdGlvbiB0ZXh0CiAgICA6cGFyYW0gc2xhY2tfYmxvY2tzOiAgICAgICAgICBNZXNzYWdlIGJsb2NrcyBsaXN0LiBOT1QgSU1QTEVNRU5URUQgWUVUCiAgICAiIiIKCiAgICBkYXRhID0geyJ0ZXh0Ijogbm90aWZpY2F0aW9uX3RleHR9CiAgICBwcmludCgiPT09PSIsIHdlYmhvb2tfdXJsKQogICAgcmVzcG9uc2UgPSByZXF1ZXN0cy5wb3N0KAogICAgICAgIHdlYmhvb2tfdXJsLCBkYXRhPWpzb24uZHVtcHMoZGF0YSksIGhlYWRlcnM9eyJDb250ZW50LVR5cGUiOiAiYXBwbGljYXRpb24vanNvbiJ9CiAgICApCgogICAgcHJpbnQoIlJlc3BvbnNlOiAiICsgc3RyKHJlc3BvbnNlLnRleHQpKQogICAgcHJpbnQoIlJlc3BvbnNlIGNvZGU6ICIgKyBzdHIocmVzcG9uc2Uuc3RhdHVzX2NvZGUpKQo= - commands: - - python -m pip install requests - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/slack_notify/slack_notify.py - affinity: null -verbose: false diff --git a/slack_notify/item.yaml b/slack_notify/item.yaml deleted file mode 100644 index 6bdfd2c83..000000000 --- a/slack_notify/item.yaml +++ /dev/null @@ -1,25 +0,0 @@ 
-apiVersion: v1 -categories: -- utils -description: Send Slack notification -doc: '' -example: slack_notify.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: mdl -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: slack-notify -platformVersion: 3.5.0 -spec: - filename: slack_notify.py - handler: slack_notify - image: python:3.6-jessie - kind: job - requirements: - - requests -url: '' -version: 1.1.0 diff --git a/slack_notify/slack_notify.ipynb b/slack_notify/slack_notify.ipynb deleted file mode 100644 index 8119bb8cf..000000000 --- a/slack_notify/slack_notify.ipynb +++ /dev/null @@ -1,293 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'job'\n", - "%nuclio: setting spec.image to 'python:3.6-jessie'\n" - ] - } - ], - "source": [ - "%nuclio config kind = \"job\"\n", - "%nuclio config spec.image = \"python:3.6-jessie\"" - ] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [], - "source": [ - "%%nuclio cmd -c \n", - "pip install requests" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [], - "source": [ - "import warnings\n", - "warnings.simplefilter(action='ignore', category=FutureWarning)" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import json\n", - "import requests\n", - "from mlrun.execution import MLClientCtx\n", - "from typing import List" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [], - "source": [ - "def slack_notify(\n", - " context: MLClientCtx,\n", - " webhook_url: str = \"URL\",\n", - " 
slack_blocks: List[str] = [],\n", - " notification_text: str = \"Notification\"\n", - ") -> None:\n", - " \"\"\"Summarize a table\n", - " :param context: the function context\n", - " :param webhook_url: Slack incoming webhook URL. Please read: https://api.slack.com/messaging/webhooks\n", - " :param notification_text: Notification text\n", - " :param slack_blocks: Message blocks list. NOT IMPLEMENTED YET\n", - " \"\"\"\n", - " \n", - " data = {\n", - " 'text': notification_text\n", - " }\n", - " print(\"====\",webhook_url)\n", - " response = requests.post(webhook_url, data=json.dumps(\n", - " data), headers={'Content-Type': 'application/json'})\n", - "\n", - " print('Response: ' + str(response.text))\n", - " print('Response code: ' + str(response.status_code))" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### mlconfig" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import mlconf\n", - "import os\n", - "\n", - "mlconf.dbpath = 'http://mlrun-api:8080'\n", - "mlconf.artifact_path = mlconf.artifact_path or f'{os.environ[\"HOME\"]}/artifacts'" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### save" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import code_to_function\n", - "\n", - "# create job function object from notebook code\n", - "fn = code_to_function(\"slack_notify\")\n", - "# add metadata (for templates and reuse)\n", - "fn.spec.default_handler = \"slack_notify\"\n", - "fn.spec.description = \"Send Slack notification\"\n", - "fn.metadata.categories = [\"ops\"]\n", - "fn.metadata.labels = {\"author\": \"mdl\"}\n", - "fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - 
"## tests" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import import_function\n", - "func = import_function(\"hub://slack_notify\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import NewTask, run_local\n", - "\n", - "\n", - "#Slack incoming webhook URL. Please read: https://api.slack.com/messaging/webhooks\n", - "task_params = {\n", - " \"webhook_url\" : \"https://hooks.slack.com/services/xxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxx\",\n", - " \"notification_text\" : \"Test Notification\"\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "task = NewTask(\n", - " name=\"tasks slack notify\", \n", - " params = task_params,\n", - " handler=slack_notify)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### run local where artifact path is fixed " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run = run_local(task, artifact_path=mlconf.artifact_path)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### run remote where artifact path includes the run id" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "func.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "func.run(task, params=task_params, workdir=mlconf.artifact_path)" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "function: slack-notify\n", - "Send Slack notification\n", - "default handler: slack_notify\n", - "entry points:\n", - " slack_notify: Summarize a table\n", - " 
context(MLClientCtx) - the function context\n", - " webhook_url(str) - Slack incoming webhook URL. Please read: https://api.slack.com/messaging/webhooks, default=URL\n", - " slack_blocks(List[str]) - Message blocks list. NOT IMPLEMENTED YET\n", - " notification_text(str) - Notification text, default=Notification\n" - ] - } - ], - "source": [ - "func.doc()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/slack_notify/slack_notify.py b/slack_notify/slack_notify.py deleted file mode 100644 index 3208ffee1..000000000 --- a/slack_notify/slack_notify.py +++ /dev/null @@ -1,48 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -# Generated by nuclio.export.NuclioExporter - -import warnings - -warnings.simplefilter(action="ignore", category=FutureWarning) - -import os -import json -import requests -from mlrun.execution import MLClientCtx -from typing import List - - -def slack_notify( - context: MLClientCtx, - webhook_url: str = "URL", - slack_blocks: List[str] = [], - notification_text: str = "Notification", -) -> None: - """Summarize a table - :param context: the function context - :param webhook_url: Slack incoming webhook URL. Please read: https://api.slack.com/messaging/webhooks - :param notification_text: Notification text - :param slack_blocks: Message blocks list. NOT IMPLEMENTED YET - """ - - data = {"text": notification_text} - print("====", webhook_url) - response = requests.post( - webhook_url, data=json.dumps(data), headers={"Content-Type": "application/json"} - ) - - print("Response: " + str(response.text)) - print("Response code: " + str(response.status_code)) diff --git a/snowflake_dask/README.md b/snowflake_dask/README.md deleted file mode 100644 index 70fa3c927..000000000 --- a/snowflake_dask/README.md +++ /dev/null @@ -1,38 +0,0 @@ -# **Data Preperation Function** - -## `Snowflake_dask` - -![](img/snowflake-dask.png) - -This function query the data from a snowflake database and process the results -in parallel in a Dask cluster. -It will publish the dask dataframe in the cluster for other process to use. -It can also write the results dataframe to parquet files. 
- -```markdown - -:param context: the function context -:param dask_client: dask cluster function name -:param connection_info: Snowflake database connection info (this will be in a secret later) -:param query: query to for Snowflake -:param parquet_out_dir: directory path for the output parquet files (default None, not write out) -:param publish_name: name of the dask dataframe to publish to the dask cluster (default None, not publish) -``` - -To use the function, you will need to either have the password or key pair authentication to Snowflake configured. - -To get the password, or generate key pair in Snowflake and configure Snowflake for key pair authentication, please follow Snowflake [documentation](https://docs.snowflake.com/en/user-guide/key-pair-auth.html) here. - -After obtained password or key pair, please set up the project secrets in your Iguazio cluster. - -If you are using password, you only need to add ```sfPassword``` secret to the project settings. - -If you are using the key pair authentication, you will need to add both ```pkPath``` and ```pkPassword``` to the project settings. - - where: - - ```pkPath``` is the file path to your private key file in the cluster, for example ```/User/rsa_key.p8``` - -```pkPassword``` is your private key encryption password. Please see the screenshot below for your reference. - -![Secrets Screenshot](img/iguazio-project-secrets.png) diff --git a/snowflake_dask/config-template.yaml b/snowflake_dask/config-template.yaml deleted file mode 100644 index fb46ac2e6..000000000 --- a/snowflake_dask/config-template.yaml +++ /dev/null @@ -1,5 +0,0 @@ -user: "..." -password: "..." -warehouse: "..." -account: "..." 
-application: "Iguazio" \ No newline at end of file diff --git a/snowflake_dask/function.yaml b/snowflake_dask/function.yaml deleted file mode 100644 index c9cc8d746..000000000 --- a/snowflake_dask/function.yaml +++ /dev/null @@ -1,81 +0,0 @@ -kind: job -metadata: - name: snowflake-dask - tag: '' - hash: a002c7743b4a7471c7befe00f5497de050ebe902 - project: snowflake-dask - labels: - author: xingsheng - categories: - - data-prep - credentials: - access_key: ec09bfc8-1cb4-466d-9049-852081973ce3 -spec: - command: '' - args: [] - image: .mlrun/func-snowflake-dask-snowflake-dask:latest - build: - functionSourceCode: IiIiU25vd2ZsYWtlIERhc2sgLSBJbmdlc3QgU25hb3dmbGFrZSBkYXRhIHdpdGggRGFzayIiIgppbXBvcnQgd2FybmluZ3MKaW1wb3J0IG1scnVuCmZyb20gbWxydW4uZXhlY3V0aW9uIGltcG9ydCBNTENsaWVudEN0eAppbXBvcnQgc25vd2ZsYWtlLmNvbm5lY3RvciBhcyBzbm93CmZyb20gZGFzay5kaXN0cmlidXRlZCBpbXBvcnQgQ2xpZW50CmZyb20gZGFzay5kYXRhZnJhbWUgaW1wb3J0IGZyb21fZGVsYXllZApmcm9tIGRhc2sgaW1wb3J0IGRlbGF5ZWQKZnJvbSBkYXNrIGltcG9ydCBkYXRhZnJhbWUgYXMgZGQKZnJvbSBjcnlwdG9ncmFwaHkuaGF6bWF0LmJhY2tlbmRzIGltcG9ydCBkZWZhdWx0X2JhY2tlbmQKZnJvbSBjcnlwdG9ncmFwaHkuaGF6bWF0LnByaW1pdGl2ZXMgaW1wb3J0IHNlcmlhbGl6YXRpb24KCndhcm5pbmdzLmZpbHRlcndhcm5pbmdzKCJpZ25vcmUiKQoKQGRlbGF5ZWQKZGVmIGxvYWQoYmF0Y2gpOgoKICAgICIiIkEgZGVsYXllZCBsb2FkIG9uZSBiYXRjaC4iIiIKCiAgICB0cnk6CiAgICAgICAgcHJpbnQoIkJBVENISU5HIikKICAgICAgICBkZl8gPSBiYXRjaC50b19wYW5kYXMoKQogICAgICAgIHJldHVybiBkZl8KICAgIGV4Y2VwdCBFeGNlcHRpb24gYXMgZToKICAgICAgICBwcmludChmIkZhaWxlZCBvbiB7YmF0Y2h9IGZvciB7ZX0iKQogICAgICAgIHJhaXNlCgpkZWYgbG9hZF9yZXN1bHRzKGNvbnRleHQ6IE1MQ2xpZW50Q3R4LAogICAgICAgICAgICAgICAgIGRhc2tfY2xpZW50OiBzdHIsCiAgICAgICAgICAgICAgICAgY29ubmVjdGlvbl9pbmZvOiBzdHIsCiAgICAgICAgICAgICAgICAgcXVlcnk6IHN0ciwKICAgICAgICAgICAgICAgICBwYXJxdWV0X291dF9kaXIgPSBOb25lLAogICAgICAgICAgICAgICAgIHB1Ymxpc2hfbmFtZSA9IE5vbmUKICAgICAgICAgICAgICAgICkgLT4gTm9uZToKCiAgICAiIiJTbm93Zmxha2UgRGFzayAtIEluZ2VzdCBTbmFvd2ZsYWtlIGRhdGEgd2l0aCBEYXNrCgogICAgOnBhcmFtIGNvbnRleHQ6ICAgICAgICAgICB0aGUgZnVuY3Rpb24gY29udGV4dA
ogICAgOnBhcmFtIGRhc2tfY2xpZW50OiAgICAgICBkYXNrIGNsdXN0ZXIgZnVuY3Rpb24gbmFtZQogICAgOnBhcmFtIGNvbm5lY3Rpb25faW5mbzogICBTbm93Zmxha2UgZGF0YWJhc2UgY29ubmVjdGlvbiBpbmZvICh0aGlzIHdpbGwgYmUgaW4gYSBzZWNyZXQgbGF0ZXIpCiAgICA6cGFyYW0gcXVlcnk6ICAgICAgICAgICAgIHF1ZXJ5IHRvIGZvciBTbm93Zmxha2UKICAgIDpwYXJhbSBwYXJxdWV0X291dF9kaXI6ICAgZGlyZWN0b3J5IHBhdGggZm9yIHRoZSBvdXRwdXQgcGFycXVldCBmaWxlcwogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAoZGVmYXVsdCBOb25lLCBub3Qgd3JpdGUgb3V0KQogICAgOnBhcmFtIHB1Ymxpc2hfbmFtZTogICAgICBuYW1lIG9mIHRoZSBkYXNrIGRhdGFmcmFtZSB0byBwdWJsaXNoIHRvIHRoZSBkYXNrIGNsdXN0ZXIKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKGRlZmF1bHQgTm9uZSwgbm90IHB1Ymxpc2gpCgogICAgIiIiCiAgICBjb250ZXh0ID0gbWxydW4uZ2V0X29yX2NyZWF0ZV9jdHgoJ3NuYXdmbGFrZS1kYXNrLWNsdXN0ZXInKQogICAgc2ZfcGFzc3dvcmQgPSBjb250ZXh0LmdldF9zZWNyZXQoJ3NmUGFzc3dvcmQnKQogICAgcGtfcGF0aCA9ICBjb250ZXh0LmdldF9zZWNyZXQoJ3BrUGF0aCcpCiAgICBwa19wYXNzd29yZCA9ICBjb250ZXh0LmdldF9zZWNyZXQoJ3BrUGFzc3dvcmQnKQoKICAgIGlmIHBrX3BhdGggYW5kIHBrX3Bhc3N3b3JkOgogICAgICAgIHdpdGggb3Blbihwa19wYXRoLCAicmIiKSBhcyBrZXk6CiAgICAgICAgICAgIHBfa2V5PSBzZXJpYWxpemF0aW9uLmxvYWRfcGVtX3ByaXZhdGVfa2V5KAogICAgICAgICAgICAgICAga2V5LnJlYWQoKSwKICAgICAgICAgICAgICAgIHBhc3N3b3JkPXN0cihwa19wYXNzd29yZCkuZW5jb2RlKCksCiAgICAgICAgICAgICAgICBiYWNrZW5kPWRlZmF1bHRfYmFja2VuZCgpCiAgICAgICAgICAgICkKICAgICAgICBwa2IgPSBwX2tleS5wcml2YXRlX2J5dGVzKAogICAgICAgICAgICBlbmNvZGluZz1zZXJpYWxpemF0aW9uLkVuY29kaW5nLkRFUiwKICAgICAgICAgICAgZm9ybWF0PXNlcmlhbGl6YXRpb24uUHJpdmF0ZUZvcm1hdC5QS0NTOAogICAgICAgICAgICAsZW5jcnlwdGlvbl9hbGdvcml0aG09c2VyaWFsaXphdGlvbi5Ob0VuY3J5cHRpb24oKQogICAgICAgICkKICAgICAgICBjb25uZWN0aW9uX2luZm8ucG9wKCdwYXNzd29yZCcsICdObyBwYXNzd29yZCBmb3VuZCcpCiAgICAgICAgY29ubmVjdGlvbl9pbmZvWydwcml2YXRlX2tleSddID0gcGtiCiAgICBlbGlmIHNmX3Bhc3N3b3JkOgogICAgICAgIGNvbm5lY3Rpb25faW5mb1sncGFzc3dvcmQnXSA9IHNmX3Bhc3N3b3JkCiAgICBlbHNlOgogICAgICAgIHJhaXNlIEV4Y2VwdGlvbigiXG5QbGVhc2Ugc2V0IHVwIHRoZSBzZWNyZXQgZm9yIFNub3dmbGFrZSBpbiB5b3VyIHByb2plY3QhXG4iKQoKICAgICMgc2V0dXAgZGFzayBjbGllbnQgZnJvbSB0aGUgTUxSdW4gZG
FzayBjbHVzdGVyIGZ1bmN0aW9uCiAgICBpZiBkYXNrX2NsaWVudDoKICAgICAgICBjbGllbnQgPSBtbHJ1bi5pbXBvcnRfZnVuY3Rpb24oZGFza19jbGllbnQpLmNsaWVudAogICAgICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZidFeGlzdGluZyBkYXNrIGNsaWVudCA9PT0gPj4+IHtjbGllbnR9XG4nKQogICAgZWxzZToKICAgICAgICBjbGllbnQgPSBDbGllbnQoKQogICAgICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZidcbk5ld2x5IGNyZWF0ZWQgZGFzayBjbGllbnQgPT09ID4+PiB7Y2xpZW50fVxuJykKCiAgICBjb25uID0gc25vdy5jb25uZWN0KCoqY29ubmVjdGlvbl9pbmZvKQogICAgY3VyID0gY29ubi5jdXJzb3IoKQogICAgY3VyLmV4ZWN1dGUocXVlcnkpCiAgICBiYXRjaGVzID0gY3VyLmdldF9yZXN1bHRfYmF0Y2hlcygpCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKGYnYmF0Y2hlcyBsZW4gPT09IHtsZW4oYmF0Y2hlcyl9XG4nKQoKICAgIGRmcyA9IFtdCiAgICBmb3IgYmF0Y2ggaW4gYmF0Y2hlczoKICAgICAgICBpZiBiYXRjaC5yb3djb3VudCA+IDA6CiAgICAgICAgICAgIGRmID0gbG9hZChiYXRjaCkKICAgICAgICAgICAgZGZzLmFwcGVuZChkZikKICAgIGRkZiA9IGZyb21fZGVsYXllZChkZnMpCgogICAgIyBtYXRlcmlhbGl6ZSB0aGUgcXVlcnkgcmVzdWx0cyBzZXQgZm9yIHNvbWUgc2FtcGxlIGNvbXB1dGUKCiAgICBkZGZfZGVzY3JpYmUgPSBkZGYuZGVzY3JpYmUoKS5jb21wdXRlKCkKCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKGYncXVlcnkgID09PSA+Pj4ge3F1ZXJ5fVxuJykKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZidkZGYgID09PSA+Pj4ge2RkZn1cbicpCiAgICBjb250ZXh0LmxvZ19yZXN1bHQoJ251bWJlciBvZiByb3dzJywgbGVuKGRkZi5pbmRleCkpCiAgICBjb250ZXh0LmxvZ19kYXRhc2V0KCJkZGZfZGVzY3JpYmUiLCBkZj1kZGZfZGVzY3JpYmUpCgogICAgaWYgcHVibGlzaF9uYW1lOgogICAgICAgIGNvbnRleHQubG9nX3Jlc3VsdCgnZGF0YV9zZXRfbmFtZScsIHB1Ymxpc2hfbmFtZSkKICAgICAgICBpZiBub3QgY2xpZW50Lmxpc3RfZGF0YXNldHMoKToKICAgICAgICAgICAgZGRmLnBlcnNpc3QobmFtZSA9IHB1Ymxpc2hfbmFtZSkKICAgICAgICAgICAgY2xpZW50LnB1Ymxpc2hfZGF0YXNldChwdWJsaXNoX25hbWU9ZGRmKQoKICAgIGlmIHBhcnF1ZXRfb3V0X2RpcjoKICAgICAgICBkZC50b19wYXJxdWV0KGRmPWRkZiwgcGF0aD1wYXJxdWV0X291dF9kaXIpCiAgICAgICAgY29udGV4dC5sb2dfcmVzdWx0KCdwYXJxdWV0IGRpcmVjdG9yeScsIHBhcnF1ZXRfb3V0X2RpcikK - base_image: mlrun/mlrun - commands: - - python -m pip install bokeh snowflake-connector-python[pandas] mlrun~=0.9.1 - code_origin: https://github.com/xsqian/functions.git#6b31040e2ad762602f335b0589823a1c61a09975:snowflake_dask.py - 
origin_filename: snowflake_dask.py - entry_points: - load: - name: load - doc: A delayed load one batch. - parameters: - - name: batch - default: '' - outputs: - - default: '' - lineno: 15 - load_results: - name: load_results - doc: Snowflake Dask - Ingest Snaowflake data with Dask - parameters: - - name: context - type: MLClientCtx - doc: the function context - default: '' - - name: dask_client - type: str - doc: dask cluster function name - default: '' - - name: connection_info - type: str - doc: Snowflake database connection info (this will be in a secret later) - default: '' - - name: query - type: str - doc: query to for Snowflake - default: '' - - name: parquet_out_dir - doc: directory path for the output parquet files (default None, not write - out) - default: null - - name: publish_name - doc: name of the dask dataframe to publish to the dask cluster (default None, - not publish) - default: null - outputs: - - default: '' - lineno: 28 - description: Snowflake Dask - Ingest snowflake data in parallel with Dask cluster - default_handler: load_results - disable_auto_mount: false - env: - - name: V3IO_API - value: '' - - name: V3IO_USERNAME - value: '' - - name: V3IO_ACCESS_KEY - value: '' - - name: V3IO_FRAMESD - value: '' - priority_class_name: igz-workload-medium - preemption_mode: prevent - affinity: null - tolerations: null -verbose: false diff --git a/snowflake_dask/img/iguazio-project-secrets.png b/snowflake_dask/img/iguazio-project-secrets.png deleted file mode 100644 index 29f48aa338e3639981cadb6c31cac31cfb96d2bf..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 105122 zcmZ^L2RxhI+do=5P&!bfrFLtzW>Kq(qDHkvtSAyI_TH;iYE+G=t+h+6NX&?qS`o8G z#H>}d_l*40=l4AC`~IH#`+Snz$$hSK?sHw&xz07dN2IQ{8tp}vi)3VEwCax^>5-A0 z9VH_>bN)OP>6e?M5mRJjx${f0`PGDOIh>VQ?y`{PN6?H-W_Lncs&D*g8 zw=cSS=|x0D>zN0&Hgz_A7%#ov=GZ{o&VBG6M}e7ty$rGcAi9lqx(lG|{&ek*+@K+@vfZ*0lVlCH=}FBZ~^#Q*qkvU8-*E2P(>Y>I#Go*m6T^Uw8}^S>J^JyTX! 
zCw)G%bOnJN-E5rPg-r(-NnH)wJ~wnX)OsRg>Es~%(#pvKB<$_r{F{XAfwv6l(gEcD zlEd2p;^-#hEywv+3mMY&@1I3DIsR(mZZF4asHMxH?BojKkQBZve3w)HA_oV@16M1s zjNT*FzsX78E{v1&C=D@+1=L3k>hv2FD;xr+~qhqe-HFOfB$(-khkssj^ya}_q0e86#0Ee zn|yivod5F|yGBapiIbSvAk>Y&{wDx5pu6b_zG zv?oWajy@#IMIv`gn-NtEG4VO44blk@?w09%etzzN{QvV5TvK#! zTSrZ}rWkMN`6~U3Q@_9RzoTFQMcsb4hn3?xjpg(&LL&N~w%SB}7R)$JCW!(eVIh)E zyM3-6M`MDgh@On1@A5j8O_a9NcIY39MB=TYG{gE)z(>Qh9<@aq=>V!4hf_on!he20 z>68vg0^yrtLIe3(Vs7%wojMU6a#}b@@{>d6 z*|Jkbe$BASDJiGaH}Oh8#f-~oQz4_s3358|BQxc z_o^%N-d@~MneNBXDu-bs84p{L8?uOmw@G(yR_cfwJjoI@zUi~R77vB;N{!1rE=8TZ zwei#e{yAsG+G`OE7iw7fa$x%kL=|diXHCUbw+DTNS#7mDgP#qOtceFT+slIzZG=x1 zu7}?Nu2W6q!-L;D#54tacfSw==0p8Fe`uD~=-Lf_W{$ysr2CU4+0W10WD7l)(NHz+ z2)r4H23lBHjH#aq3AVDbs+p7P@S6+q%ZzKwYyfJn1X)y=e5X9q^WJBRQQcLq^7v#_ zGir`a+Cyf$`DJ`+U5~@BAg}_3gY>P6OMlpf&Jx8E5wTu<@LG1XSq)|C&7fc8UOW0< zr5(E-9qbccR{HK3B|GfyPVGer^$vXki7_8{M}eNlVpRWJG#Ukpkh^Yik{W97#?Dz; zR%Li}+v=fwJ$%+1`Nm(D0MUI~`nqT(Ic8~#=qG8S126^RCpc&=^V-2l+WSgBgLs<2 zJ)RBudde+EO!T6gRd>ljd}gMw(2Q}r^uliPs+_EKZ7o1-zGH9lWOnFVwcV}Zac2;8 z#aNQBJyCtY6nMe7+Fk)W_bTD=E#)~kyByWngv9|cCUtNr*L6=d7{4G%VI6sb~8ks zRv%WlKC?HAThXF5p(XsJuI@gP1Y(qf_E-68*ME?^PKE{T>SO9WSicf>mIsA2vmK@z zcs7&kZzd6r_TKCP>#S;zAGk(Gv&t!D`p=h5#}6kaKKGgLV6yu0s%SdE;br&o!R8Dd zx*f>gsM*cpoZ6L!nu@Df6pXW8Ib`$z5&q!XxblSyDcew~ELURm~39 z;Etcp8|x^aBG;VY*lVrV9rEVFQsmsspo-5D@;^qGw=cmtpiEY9mx+=9d3W#o?cgOD zcL<*Y9_fjPETf#>5c#1mJZq4iSYMy@5cKOXh~z!x;-QV{=D-_LPaF5Q@#{xk6UY3T zaWECkys#$Lv%G`R>~PP%(nB-UeQ{L%FvIuZeMM`$-`?Wl5I{dIsau{ftI)g`Xhcky zYfURl+FikqlnRHfMXH@lLk~{bloRzsMuK03N|=G4+oF1NR_Y*n!Pqop4SuxR3H>BP zT+F1~RzRqIkXOtQN|+F-U4FReXChK$SoDdiNOX2ZO5yz4=Ffz^juh*xUlT!xeZ!lr z_G>kB)f3)8VV3)fB~K)(C1c0u}G*M_ZaH&TEScH$6k(1YeP$+O2zl8uJRnR(zQyZ!KZv135utpo9Klg8bM) za+*ZMaQ_V_fV<>{R|qgRwf$L-{4DgGnQP{lGd7FIl<)0Gff-|!{YYU5zE$cJ6H;}d zCHyOdvrERPk{#Q!@$6fps~zL^@shHay{@3RfbFb2ecM&ssUMq5A;=p7(x_#Uo- 
zBMpWA8Be}BG_^MtozGZk)Zlz@YX*S}G&8RAeqGH!Y!>vIx)S$xM*F@hi0-*r{Ky?@!R3|^otY-iH{`JI4R@3YR8gHS-7j8~{DE%Tmz-U!uG6tM(##8C#PfpMeZki!qKgVisjn;XjS&fhGOc>ja)V%b@F>3$vNd$$&{=T$t6`31qLuB;_;i zJYakG=bn2lS+PW?%_hoDI1PG{>azhfn%XaF`}@6`UoWm^F%%lDV?%uiG8(PF9Iotp zRJ1%XzTHSL+P#kD0J^BF)J{q|7li6N^0pP8b3OnZ64tuL%U*z%QopGq_OiJ1oQSKLvBplBPs@v5bZeC80D)} zCBW*ZP!Zr9(zv`>3$H&N02g`Qhf`RRD&)hv$Dg52)XI`HUJd%$Gd(WoygTsDa&~O1 z$9?GszUsVFnu6DRRzA{>&<*qRTYs~;aW1QVPH7mJVySpflxwwqTA{;9CZ`BNK1yV^ z&fv_DsZL%xaei&$)4~3Y!U13Hkeq}Wew^mH;i%B~#{|hJ^0XAsID9DtRZ$8UBt}Db zzImxyzo!~hfp8qRbEJ)H9KLQ1f5p7i?}91r>C=+s;(?T@>ssRc4jmt(g#Y z`1QnVnc$@$oNVgzUOSjRPUlVFTqqbhy?w&>=8SV~#%!H}$!(F7C5h^TBPX@aQHiBl z_ZzZU@Z%fAB`ZLKO;L$ig}x*D=v{xU=DEdDSVHXO>JiGXrH#jo2jh86?cFybMSmP+ zH5c4J_&(E&LD&uVH$D`DUFtI*D=Nt1Z_5~IOTucW$zWEhJ+V|pPf>O5t3yg)><0nA zJ-f&XN$P8oHipAOTeN?8FZBE|6z51f`gM@f?nRP0KKzH1zFMdxPRR_Wp|XJ}6bQMb z)nd()%X0k_gZ5PEk>h(=$AjYK{*BS0vPpiT&vcC74M6^A)>Yk?zaig&RuMdP+;L zeb?{$Ade3qEaesCt+~jJ&t8zEF4k&~3*e-}2Tk!TXzsAEG~}Ho!{<%x-PTcG6&=3F zyiobE#4g_szMi_s_`M66tqtXn3}=P)Qb=m!kFpws3oIdQi$qhziZsQK4Mo-BK;TCe^DVg1G8Ky`A< zi(VWE0U#?aik$|F6^u5%UqQ+`#;wZLJu2h>?pY@J9fC-;PxZ!k89P<&qFCok_&>pc9BE3tH8% z;d|qAbHkY!7X<<4@v4_m;`V)y?&#CRprJFdb^Wc*@O#rFSNgVu{o;tfPt^_lRnS7W zX@8Ziaz({-Wi`n|!qo&OK;I52K!CV+b8@E`>D;px9udf&h%>gFe&XiPERM8Hi4QmF zCreStYbbmE1v!AEA$&fS=YuBcFM;Gky{eXkTu;4!tAW$vVjRL%VoWfvk+4YGN&|n={aX2vR%>*8p}v{C%A68r1v7d|;qf_Aykrq-8tl+!w&A412y}hB zFRo~P^>*(?Vdq2Ug2#mLom|xz7Ma^h*3t_NCjj1?{NC?Q4GgC)wAhhb+emAC5eWJ! 
z{Jw`*(|7#NkEKruoX_R%Q|Ej2Pa)+v(L(P>T&PhmRwa51p;qbrPyH8m>4B(UMxmdF z!S^I+r}9Kb_#3CixIIr~Hf*zF^_@eCO9>e_t$^Zx`c-5hHq?*cuHxtJS##mRI^nLG#eB(Sv{pKOZ;~Xh@!b86`$edI-?k`2dLjKbFq|xX4>X*A^!RE< z(dMz)6&oXiFsF*AwI(vBB|j87SSSLcJOI>SF!c2V!-gNYTnK!?`2K*EyXHmPOLzy3 zzSC7=$%`L&D0lqjRNL>o^6fT8YPptIHwFhQaGQ0phB4GgC+?6TS{_4XKI80Dj21Wz%>}^e(Nr{*6geOvMJ#A;H zs%0=5q}KXCzxs;u6w#2MjV%dQ;AQ^;`wILNz8?Qf#S@THxcf;-N*&WangKlHKx-di zc-zJd2bp+EzS0inf$8-bHw*k}5lBe|HIt?_x);uj&y3QB@>*@lU>20LpB5t_%ZTMg zH%1dUr26?i9&F^f;&Xc$FeXOL8ttG4^s8mj@_s zNc8vjYB016=hSvzH)_L*{#u#Gw8wpzGggsYyMpuLSnHC+_zvhjpL`4Zfv;WYUA|xT z%Gl1dSwZ?m?hev{lQ`rFGjMZYj1(8}=O7y#bRB z!%B6E0(W{K5F1{se_QNIh#ICZNW1huJ|M3${0W zMuAp%>i^=$L8MIHBcl6sPDb$=B_TFt91Cvazb76h9->TNkolC&5&459ewR#exA&a>g3w zqZc>uEmTGif3?~Vn5Xto*bNEtt8J#{Or4U|2v%HV>`ku_{Fb3oH)_7}`AT$|NaJS) ziO2JEopb#l(0IX9tqz-~I%31_CdhOaC1m<@!#96A1{}HQ_Eo(SAIU{VfNT0rR}T!u2|qT3UbOt)pasH8Tp`8A(;}+uML$l3F5y@o z8dB?dntV5JWvirpEHo5vwMmeZ;rfZ3Jq8s5cukEI{blz=zbGAe57lV?NjO!=xL#i( ziuz`g7<=f&9p&^*pEf&>UA`DAIvTInf_6{PeifIcH;{AaHZ_0R^LEw*E6|@GbWr8W z!6w%Mj|HkVoR{Nu%Y~%auwZJ(C)XysJbp|Kc-N|1ADD&g9R(F64n?)JQmqc1$hBr~{g(Z^*&lRNi8KeB2yEF}4$ z9tx%Gr|kP=)n{qGcgzD~WGB^J5^J z^>hQ4qrx%t)%2U_LnCl_x>JvQWFJ}DzC=1xZ?;*Twm*H|a;=U3dHfa6M>(vAr(^}_ z`4=cE0rkxH zG&U(_H(an$H3kQ3@u{4?{{|~gpFLJm7|eN{GsFMqdLZQiFJ0W{5lg@>di_Wyy7~GIw2$ImG}3E4~WOdVoAn%XliE_;Z6N)2KhjwYgnC#jKI~ zD^!Gdmp;|Qdvnk$PMyEtFJtktvD<&p$8;0HF8go3Idh-22)Oc{XiKd&7TW@H)+Uv5 zdMH?{xsw*~3;(W2#a;<96lrQL$x-Xz>vZ7Ir8>*oWhBDKKJdUlFPZMAV6&>Q>zadf z3UYd&pHkNjE83k8Y}pE=iU@2%>@&3ZF`{X2wJ# zbLP0_(Qa@xwp+IAZOX6J5F>sul1O{vK^@X{ZJ$;21rLE);SCGLs8!;!=)MeQ%h z_(rPkf17{BsmBo=CPAR5fMy@kF^Ya6&hC|^4~L5yv@+t|8x_lv1r{4 zx~1!QR$D&bXWtAZ+P{90@!#f_pNk@}%rWuaXlFZLD~Qwi+C`F9e5^0dkFl#6ZZ}Vp zVlArfdFp*McICeZX?jZXo(_}~Zt2SSCi;IGqoM#kX`-+psYh+GBb3<&F2x8juC}fA z&dWp6eL;%uPP^9rm)a{zUtK@v(fL&1CQlA5Jz#T8m?G~{6Dc1$XXq?cNy}Q>PKht6 z`rhgiH74b%aH`6DQRL6j@@H}R%T?eJoJxvP&6E_)27i$hV2@#I6LJMYqj<}MCId-n zXp26XX`8c~^!J^of{)i%_)`?DKXUV}x%A+l^*rEqo9{}lKtG9V)ne(C7zSpl(fK&B 
zi0&)nl4^{*QR`0@`w^BCvdK+D62(IuOj!W}E-nnSq$G$?h}c^}vzjL{LC{nQ`*v(K z(13G65XPsjHNGGWBUMWBeq0&o7qwda1nNgB3lu#sayNNa6g2UYx{L+a&|8t;3jwA#gAPM%JTw)~TV3|{T34%dol9Mpp58X-;9Kq9{gVGE&LYSnxgV*yl%*;mO7 zJb5ZTSM`-xT0u6kpp5ohxb8fW;;KQs3Qu2~>_Od$l=Z6P7@Z^@*m$SWVk*H{nY{^$ z;QA{GxsIsO=@r4hEACBwx2Gr^W-D)GunT*=ZtCN3t|Hlxp83TE_P$O+uYwn;%J}#z z)Pa;_eFUg7k1R1mX*Uvlb7>m3A2{t5HfD~Ll|6?U0a(O811*#!>q7>MhwA)_pH*u= zU*j{%h&g^)Rb|U6vLaIzsh>QD%+g}5ezivMN;c&f{Cw4lI*jc#HAW)XEM7rsNj&YI zX!{T;Z-}K@ov~^o$^i5QfpgF-@1FplqcEa+{VwccUgL`(5uMb#*c`mFj>7AlB3c@DkiM9 zrEyH#)Jn0c6~@QD2LoRe>%Z4`FzTe00>XH7J}S)fTA#Qi4-kxyEt#{4>5{b`M-pzr z*V_qMdV*oF9v4)u3I<%e`;v9nwg{+{y22iF!IOfSe52L)PWwSrWf>(-m3Um-uyc51 zp%lh7xVwraxM3GlDMA!OBVLKm26$3nrE9Bw{<(eI!Pt7&q|0) zfOhSpy@Q{G9@DtBih|XiN+ES32kQ~W+v^f5v7^&~Rnm^bz(h_rRl+GA1oVS@EJ+@C zY(D~!3Y`Kf2#=j#!%35-flB+MRKz_~T)7K;%|^@x;=S?$*!%XmkKpm>VEkv(uk~>n{Dpco%PTOgTSoFF>M=#WT%rg;%qS0 z3!!Ys9yb@nP%5Pz?lWT{60D&4ZG3qzd$?}rzI%g?pIN?6+KMLt>HfO9*bmV(y8e>O z>eno@sHLw>7IXtU_ci3qY`=hrYf9_ZykNZb$!_S0Hg(Jq_*~%emxax7DXzJGj7TD> zs^6hU+pY$No8DVj9VWIlPq?Ri-5c92hxBJHez=bI-<+eb#!~bK)y!Y2_IE|kj4PD* zm!P8zU?wv|IFB)x9-;odeX~{6p2K8gz$Q95y!{oBueRdnokfN3+p=?K7l+{R6JffT zql0ZXI?cB3l?G+2%~y+#g4(Qn-S%LQaaw~L{pjk9e;_Eofke+nA7cH4jDpyHd#=|2 zO}t9wWy@F$M7F|ug4q!FP9SQ{(*Il=2w$ym?SqE4wx`l(^_c>@IP(zd%3426dRObo z2ys%=spHYOqY|>F=6*G<)&J;87*|%{_6x>~M>*J7b?u9kRShxFxvf#m+D~@3XDy+S zp?8!Fl1`(XWes-?!)yUWb40*$!$3i|tCA%3t|7kB;Y|bV3v}@rxOUH$(P?I6ePCl-(zVJu}RQEp7r5o6-vLEAl1TmQl?tZb{9!ZEtXqv7*C8OK3h zrfZc=ag<_h@6x!5fvZ8&PgpP|(E|`QKgp6;(v{f$;8!qUJvhL-*9Y&26ZMQi!T?*X ztEd3vF3tx#QC}9_K=g6yY++t)ED`KoZS#v=@nP9)J$KM4UKPL8akZd+Ng77%nC3_` z#q`pl-!K=qQ6ak}!E(=Vs%;pc9L52CjXKY|&}TpCzteoHC!CPe(p~|qDYGUL zYDlUppA~cn@>xMUy$ORL`!bgdfPu!Dfy+QgAm(`$$7o|$$`gk?A-oJL-qrYPEQ!H* zNBputT#6P%NzxB3{V7_;G>EZ*(8#h2J0?U1kytzQ#Z6UiyUISbKC4&B&uSC|C!?SPgL4R>0s`h&9rM6^jb=56qVf%8AwPKZ zn}MERe6%b>$ub^xwb(NI)(@04ckGxygGvplqLRMR5;xB=xqpxjh~0jWZw!UJStgA7 zQG*S$yE_Rd{(D+7)OK=G#WSp zU-EM%%!A*?F^8w_dMXcc3iQC#8M%vJ)4DwBVrao3v9cK 
zQo_0r@=sTFnfo0qG_xa(CgtxwN;998G9!k(yC_9Tm=09ka@!^DGrQ}?Xz)7j8>JNTRUEujsi9jYr3Bz zd_S)z)@rS9c2CiPwhMLVGYDOPp+d^S#*EF&be?u^3vYv#uIWRJ^9`wWM z*_;QhiBH}WN&PfRJZD-`?YM6nyUBrRfKWkHWaRs^LYKEXr>ICcFz+!5G*rOU_1o*| zy=FAqW!&H`jI20yEAPv4&qgXb!FDiABdOCh6MEZ7^U2tps4&E*Ib3TMB{b>1gb;2A zV`9?8uL)eP0+WA8O5u6xs3JUh#cTDVll1PZ31M@ay$Msyr)4^$qp%U``Bz?ot*_&| zAoyCYzHxO@Kw}bzofu)?WHNOvtzUb~AaK*Q0PL2AX|$j29Q$Eh-K$rwONP~m;7S8Y z8^N0rNT{hjeQke|PvP)13w3){x@yjGToxZyGDssteFb1#dIsYMO5Y9a?$T}QgeD7{ zIg0e>+FCMMxj9|j5K@Or%6dAMWz|v{mDO^yvDuB5yr;VK&P~o2R4MJ@bY{v}HA{l} zt4S_-V*BpWNaQMv(p9J(j4dc4Fcrgx^Pi+n3(X(O94<_bA@i}RHI|llT57VCsjaLP z3RVn19Np;>d$~|)J}V${$+HYvVNLEx9hm%FprcFj2(|qyjd+FnxsOAre4EaL&2$sb zFFTv7Y3#LKMowv{(X}`capx68^+eFOm_;!98N1K1nG*t2f|x|`ExHXWopCga8ip7c zTNC5@5NdqYCOPw8XdrmcuN?u+31i-EbGt^*-VxSVdoW4EC)$qRr^*Nu8>4j*(a4YN z;Tp;gcr@^&-lTxs)6WsOKRKCaU#&fHYkK$<4Bz}kffc<8-y+oPuW)vgYrR@6v&TaUqhGHVMvF6s&uh_jd^bc` z!0_10UK#fX`)VJ(mKUve-Db88nTwJRe8e(*@u*>m)Gu?0`NKD}l34dmqj6t^X>ZSo z?l{upjiG)9=qUisVK{4OhU^Y?AGTACr@V>^&Vcp6BnCNRDBq2-f6emGVdgnwqN<(_ zRs)o;KNdgA*$b)9axOr2A%oDH?iD7gng}FWK@#O~sp$Y{8^D`P8MQ1>0LBF}+QdLe z$PkyA@|2GtF{A83ikfqm*qrMD%%ELQJ7R+PfPINE%4bI3c3gNOrg$NM(SExQ5`Rgg z+;d*KphH9<#*NuapR`XTn&qcaUX1D&{CW3hxPS?hWcUyid!bUZEC>zNDJXcY5xp|u zYDYaUl5paeq(Z8VIwdW1vI4c3SvuO_} zG|v_8o!pHc%fwgGRyx8mwf*Cs*~!LtF{SPdT3?vJf9->5eYtL^DFUygE+ z+QQ6YQ0c@;aEi+C@wCRlKB~tAnw}J81TXn$De(+*Qqu*yCWv0({~4$+{|oVh?eR(f zB(Q`$-pf5c>e_)S-=lI7O#Y%9q|84_)OXnCOsTa~u^9Y<(7Ij@cxO4ylqxle$>^>l zX6P)F4V=uhSYOD5_frf5I0X4RC8(Xf02eNR*dC?`+ifG;n6=oGirV?29r!c}Rl_-@ZD5Swp;?ekK-4iMCLL8QZg6yzP5TFf9B_Fi%B z`h6F@;J_JnAoKDKSi~Id-&$S!x+NZpTGoABXZL(DtC`wU3hS+QyCLN?51zVH! 
zJu^@|KNTR}*v^)%0YNgaWN6gMQmw(*YBNG`sIdgl6fgy1`l(BCKqM?*n*FDMHLIy; zePidf=QgNue(A2~!{2Dui0~#cqQ5xHcnyFOmSaem`Q?zT6TumSU+HLC=x>MzMS@kByvZB9^!+s+|KoRPd;$V=)60A`?gWq~|l-(D&+ zT`tTN@oKPBwOAkas}kFA@3_IfYsF)qNg)CoYFlci8TsL3Ke!hQlW=6aquj9eQPyPF zdNz=RMaNcl1A~d4cYX#i?4*=u&Gmfmuhh`21jJ!kv#_C%i5dqU7g}hMect@;j+zh% zy%($I!8@i)=9vh7ISNU~@de$ETzWe%y}WQ2z0wfX9nYIeiY%CQQtu0~d9z7@Lq~>J zye7B}OvLV@$1B+X&bH1zBPGuwOfY%ZxUvZ4e~G5)^0o2{SLb^MHwMJ-Cp^QK!e;=o zCS=&o@{+;+NZQq){Y%y9!%|K`Wi7YM0=DY08hk5skB%x8?pQxbTLHUPU0{K5_?K1P zMH3jn=y;k+pJD41LB1^ChNI5jsgbWL822BM)7?`udkP81Pk5(x7!F47ZiykxM*=gR zO8T`%XJ7<`9GKq=S%yerO5m=Ywkcw5yS?f{B|E)OK9q$AEP#Egu1MJH$LoJbKC!Mc zz3cRTX}m*Y1VoSo=2?mJbT^(%Pkz2j)bCSo` zyvD`kDp2-rtRNT4Dt?h*hTOtEV)I-(Ahgk9jNl?iDdp-nAXm-{5Rkiv#BiI)6P%uk zHB0{Aw!Qu8cN{`b8v-^mv&Du!WgT*o$<=had(3+|ES$YE zp|o6`rUalMIbgX92DwSPgbg=zaj<3w8X!5cC_PSge{OBA2k2lPW=M)?>U zz{3~6gR9M7K@*qO>;QU_@H~U~8dnv5S)K;vdxDAXaSn5v60404^AA~#F{zYq1nbvK z(y)4IAig=N!kJ4E`$POB0lr8O<5P(k5lrCKqDJpzav8rA%Q8CnjOe)f@Myb}ZJ|@1 zNnjCSKJeiwx;CKBmhtIRd>!JWONC(EAoq_t7_cl?tlDj4#FjA^wJ~o$CJep2zt~&W zTrU;NpOq=&qw0;SHYcusCoUhxB!sPUQhZCyFDm_hbLC|iBHD>@!hv+^!^o~4y8|NN zhGSPEHwWk}E6=WcJ@+#-=EesDl50?vMtge-blG@P|K^f}pJVk;!5$nwIV*sVDa#FU zRLKm1Js9J)EyqQnA{CcE@LU$VV7(T$_jfHq_da=A$QV|}&T^*rFDIrmUom<0hAAM+ zl;Ecv4!+<^THt96{b@;})%D90IZ-3*9zzoWOU;63TJcjG#K2wj=JGR<*XN?EY53BD zzl@$+^Zy~54><^?!ZqI9MN!iO?&h1m8A6UCWDv?T%?_gcX4_y!*qbu#gKA=E^Romtzsw=SDXazPM06Xd1n|kNmY}gG>F4jDtk9DcEEC>X0Fqh zVAizz#wifmUQs8=4@oe6u(z!#Ox<-OuC02Y`+kRmDJj4Bq?6|1=tVzI7(~KP;Aeo! 
zOiIlLR7tqTB3yZ5Usuqrg4;+P8a4Xprr8nRl9>2rH9(_nkM)-1(XxOD4fNdvaSSgg z=gw4W57*}^tS3gz*ChdCm5_W8C_})y+?F4Xm5Fbt+A*#9P zaFDYX-m#prWiph4dZ%AMyGh+_Do5By&oW^R)8&1j;APbH%siQfFk$xrMHJ=2$`a~U zR*m0SsnR3rOJ#4bM-3dc#kX4m{yu{f^65eg)egE|u!S-DdUBqPsjK*JL)`S9z8t`K zj<{wjg&cEtB#km#uXnSW{`zedaQ%3c zSo=icC>hz(qDz&WWO}LYq*M9QN`8oe2y3MuIQuKU$a<}GlOV-4Zk!e}B+$2wWSSrB z)Dg&m(mctYl96{dG7!RuU8FSU~#oyNFN54m<&)1!-pq*)?lJKK|>Fjn^@;u2y%%ikDuOmo!JiPC>fsw{G8zP`0?6p zuJO&u@+COOPGz3J*d`H0c<)E}aYX0gn%b*y$7-$q?wZKJg=c~Lfr;Ed*R<+at;Nqs zmJOL%XUQ+64}iYwSdO1akI&FbnI_~tkS#kI`3F~ez_fnOR6gw{DIh-?93f?(==nnx zU$&iWfFrwC_Vr=L?Syw@i=i((ee{LFJZ~8J;ZuD)25ems?Ms_NBUS`)$15z?CPKkb z7m!lKM1WssMo=w}=2TbZ;%eQRe)3am^2;LaV9=W|$759bD6vJbz<#(95gWvC=P9D; zmBA1gVC*#+RLmA+0Zfjx6yx^|06;Tne>o~O{&3uF^K)R?xVpNyM3kt>phhwCvIt4#~ zj*OR59H;=)NTa*Msg$#>b`v`_+XoO?Z+wFR9zvdvU#q=+Op_K8fa9i-n)EcSdGkBO z#7sQ2Qb4Y%-k_Q6oLk=aqIwg`&gfs+FO7k~k1EY8GTPmZDXTx4LXRem7l_ly)|_lmtgU+I4~kiLbyU2PJ-Vmj z1B@qC)wGYg${JTieHI57#uPZciDXmbVJP&X&TxljncPw)r4tHUcidMl<&odN)za#< zl`6Eih}WF(T>}|_7h%;+dnmf_P`in`_JtLh8%#Z+KJT%`S37KZr`n0LyVhm=kt5e@ zVV`4jc8C0k?`LA^Kn1=FzdWAo3Fc)_`1%EG7ah&U`Z*3wrfG2*)Cch&%)o{_5K@WD zkC*|%p;G_ZirBtUG|syq!dU5+ zN>p}DNivDF=Gsmhd;=mw%0frau3qo#O~ynB>L0`Wt!0-%O$3pE6t$>1khoMOiFS z6+B^uo;%*Y>Jts3pDWnzj+Hh6yx3tGY-#K?hA$W7V;|yPee+mTxgf|eECE^@4%OF+ zx@CCTC)4J>VFx5I%SYCWlwRb{jC`D2*#N;=@4kqe$7cH1SPB*ocjmg;G1hcCDM;T4 zs`1d*4#3CkMS{*6x4JZd&5cy&`Pc&(E>!}kfQ~(%D2t5|-Keoc=LKuof82c_$526D zj_`NJcAA`r5^D~=*LY9p>v#6{m5@7AjHyyr~*287J% za7Y$j^#KSQ>k@m?{xW{f+tltq$F$?)6FkO>JNsuoIQFB|3UY|~waUK-p`kXk{gusH znn*1yqsAy?az~R%#A^nQfZ(2!N`Pjc+l&R9mrZ2;!;;rm))9revNgA~_>>Kg61sH8 z0fpL~n@Uf@b)gD!r8h!8QM17kZe24Czxgj{%f&lW`HJSz$k56Pg}>SR-^b(ul*bd- z&dXpPwskAA40mAUnAJk(!&qn8LVTr@6kIxuuEgk4b zp+n93P7f~MmzRHw|KI0TL;3~i1txkSAH9vAmwm5W2Ijne@S^NZRf+Dm9m326cGLYU z8!lT3aVcr(b&r%s4=-~zUifpL>dms@7l$uOMJ$cdZ)fq8B`;KClU8i~nvW{~$yERC zegEgHqV3g)q-b4o>$`GXpNEI?G|u?s2kI<7D?%=Lt)7@}exMqe&xK%j9*_$GZ#4S7 
zKK{_4@A#dj#&2h-^B2}MWzjwo16q$&iD$VVx3mHIV;aAwHud3`o8$tlN5a-ai@G>@$Fd)AE8Z^c*$)MC6r!=gSs zFEdFc4lH!b*1HVREpG>dqC>w&6mhFFAC38~8GoW$yorCHHG!*rA`cnuNy=*M%}%y@ ztlQX2BlQP_?Jxf2sgMq=DJA#uKJC1+6$kSxgSw2s7Z%U!=Xi)%R_G<0ty z5SWuE9s^ru8(lhkehY8LZMnzWPA$vS5t{g_TxXz$Wz&WbX4!xX_m%OxeLB*#pF8{6 zEi33t)!RAP`r4;{)sMBX&;888tSpT(_zHd++Dk|btcYc3Z3o`)!%|oJ93Ab*bQg~5 zHaY~RrW?SMo)4Iha?$!>TJ+D%rhks%eo8t=fm8njQTo^VB%OaUkTGIo^2D@W=tpS; z_~`RJ`e{^=?>{#C2v-JhVDM-_G#F9R)}U}d4b=x-2Rc&;(0pdEGP?`Emhg<7g9yC( zq=v_)OuEB=so6hLV!Iq-7-RX`C{aU=Ys~@*Mr34dCYb2vAdXQ=UPUwSqQ=q}Dk*V| zc({6|#s?wef!*0!5!Qc07XQ++`~4K>G}WROH7LGh``7#BapziR{Nzw5IC^Bw#?uaE zzrB0fnGh1yNw&guZ99R|sG?6PkQ*L)VWG&0q+bl%^t-41n-0aPi1Q>_vZ?bEi^}(4ogN21;OR5pRJ{n?_>D_y_p? z`_hei;}m;)cFm2hdI&J3eC(M&t;+8W&6Zrt1|!HEEuN9#b9;f7PW&ybc9JT?=j1Bdg;Px`$0Wr$Yu0rYuV`e}%(=%6 zRGlRO&<3+itq)xF{m7x9M4Ftnw=chIgd(=CBdMy_m>M58v#eO-+HKubQge>ZBCEMuQRvwRDq_&+Yk z?usL>09n8)YX!2k#Z!vOsGD+^vlzAjk3S3j%397ESX%R*TLPzmmY*5PJn>2+++1eW z>11FFC`P03Hh#X>3h`_E(A4A_es84waz^HNi24L;#TJ7`w&VY)^y*8NE%h|7kuQ2b zexN!WPPjcI`g*=}RQ75lv*55;-*6u25~PNjGJd4~l?SQi;n^5!VlG(dZp%Z|GqJ%` zD>QAD?6AWI1yj}W#C|(W*!Aa5h-wY2xu-lsM#DEGVP0G8ti`eYosMSyyuHa`I)$Yc z_y$~@$VI-g%kQQJ@D8<~P2t{C2g+Z<>+Ds*oOb#-fV{#Jqjd4og-i!8xY70>Nb5gS zZRSHMd7pHRagIsi=5YyPAZ8_gcY=ToZ{SJc9{Yy};#FDM)+4hF){> zMYI3VuBg4aR$bo$Y%p#@e2{i4vlXu1XgLRFL1mg411EodXL-QJK*m{p1fy7X!{!S7 z-lN4Y;DwrPdokE(bOT*vwOz1A|^30Bd-p2Su6-W@Zq?Ku&ol*EZg^Pl16fBjOt#I{Wa*aaQCWJqV$ zr)iyJ`cu99jsD!4U>r(Qh@f?_ebk)_FGE)V)xZbCsbivCw*$bovOxpyYj7Dc-uGgl zyP;Vy0Nm;R_Olh~LTSDSt-Sx`ULD0dx8b_$7L7Peggbt-xKO-4!X(A|w!EI>ERT$a zKF4bMsIGV&)FgN8Xk+fV-9)KYuC(Z1yw(3it_a@y>GvMLeGWL3cHR0R%V;z|+Iuf_ z3&25QZmj?q1Ss;r5OyP z5qw)GrVQIwic>f$R>GXZ-w~v7iEJi7(b~B=>2-2*!}sdqroiZCi(icAR>SqiLR)N% zT$5Yz=Wbl8VW-|dIzYv703oghYoF!wmj=zx5{hhZ);+xB*vy!xZvF zuj_gEc`k70IQ8~{sys~KgxM)*_u8E@Ih}u_Bl1_+G>Su(5_i|YbV>GyILEKo;cG?v zJu5}~+k=)apJTd(txd9hW9#2=-&aI(e#b*K-7vIp+8TQR@J>#COhNsox=M@hJ7t43 za3~~TD%+X8wa$9yW^d0O_`Vjx>g8?zZp;Q68tzj(e`y!P0^C&(n76brvHVU= 
zpQlqjENY2Rd&|lr#ORI^OPlMA8jh;zBRjue*lysph3Y+qWz2 zC6E!c0+xvqilHlWsQnLL(2q!Msx!ff?J;8o(#)6N-`0(-fJR%5;I}uQ%iI_{l8j@8 zQeO0J7)u6=(l>6m1_lr9_Dm^vK#2^z8HE@Vjw3t0I63vku#@CWcg#_G zsd@(Db&6o=V}y*%o9srAb%gK5Wm83KZvPmR$xE|>Y{iu;@;u^&|`~Mu&`Dm*h+Y>Ll42- zdtlszQ5uVa6Q0*42@lvI`M&5O+m{Kd-dzrICxc)+14#@y)J_Cp2{C|Or2oV6hKdrT zETr*$iT>05#!BS!Iw9P zGK8e-1n&(=F!HnQ@zz9J&FqhNhFA5HokzT@>V=EoBWwE z8}lr)>b_$$yqPPBK!@?x81R*SIkc?=-5@uW5n>+z(_ZU`rxJMZwmd@B`?vkf3 zB4O>C#{SsNB*i}uStXQZer{{p)5pluv61Qz%JWa-jL*i9=cbL9G~8ad2MU^|Z+^bq%r*JQyIjm@Lzn1fO0*@rw`ICJ zeIl36@cxj+09}7-%}WyLJV!zi;q7WXg~o3(6_6-XV(=clpW`PupK+I;cu#$fInRBqc7An5((vvMJj=A0$SYlrZ_ zE|484JeWMiU2i$CEg^F)A;k^Z{l@gH2?&s%hU9yUJ@ZUZ?D<>l1La zl>2(+UR+rHGC;g2kg5k=zMQIirLP68IMEv3sCLl=uB{oU)_*zq*cDw8PhHPtgFv0s zks1Jn%l?ZR8(YSpQh!;cY4|i-kOy9nS=WBYQy>xm23QKttVx+NbsgcB#ou?@kJ4&BI<{*Qq>WloQTx0$+vA67y@0+C)ugp9 zl}jsjILRw<|434el2|zR1rN4T_uVFR{Pgp=*>)ad!3n7{#A5I+{EhXSXnR9$ z#2SShgV<4qMOX9h^2c==NY{U-tD1s05zBN9vy`+_X}Wf}rMTm?<^#7kPg0;Keabws zRyRd}utCV3wt1g!3lm4QO7_wA%(ukP#FzeoTvYTqUv5q7n+p>I(g2Ah4Ij17VedCK z{r%HgUfDB!@MuZpVepljiq^0)FErR4v|_q6V7pU6tpvJpEyaP!a}L&|wisBauw1wj ztPH1c2cFs00OWR9WmaqV4T0_%luzvoZz@z*?u~6;m42Ii@~i^&+Q-cyHG}t{S;IQ* zTf3Do{l%JEa;51P-@SvMh2G1U=hOeJ0tM!%G3p9>nhhiwYLm)rE7ZiSYn1L_`>3S#Lm@svdM7$_U|m*HpflIiDZQ929aaDvwl$_PbtHLHDIy0;%^OHst6YdO zesUlrYB>Ni@p@$~l8I>E8G}|vow|iYrPrvu8tc8mEeqC0uKDu^+?(B=m-6a~K{=oTqVB8c9 z>g31q-ZuJ)oRZqiYI>YW0o08eyFMQht^DsIUSK=n=})f7yJd3-!$G5y>VDN;08IDP z0XZ0_nqs86&NoR2Lxp&w?1kO2uwnq_byL_8(Q{A4&?7fR-Ac0rz?u|C=|#lXB7q`0 z4oJR@WP%V)GnBB#B=UEYMb26weYD-cuhm=oZ70_4j*Ptb)oZa#eW0_oPn}s{ z)&&K9sW(*=SNld#{+6XGe>W3Q|6~}&Mr_1|rm*IRy|?GD-NGr~0*lGs{13_C{|?KH;yC_@(-%^4Bb4;7|GR-|5>6%g zb|P{p4We)(ke|K6cTL5h;oZPvJ?ZZk0>lR0e+sq!I%{A&)u-A$)Z3i*<6pl2&NkT- zaef;(Vx?tP4KVD&qm|bX?f=w7qf$~I1BLM{q5cLy)HgP3x{ajC0f7>>nDG#2=kaQT8&Yduz~wv`hXR!Di`f;(DOj3KTU~_bDar zSP|=SG`E{!(f4g_``4Fq`c)jAu#NSQfTrKtzlOOhPh{;V-00#5F~{*1aI*BvCFU5p z{Vn&MjGmWRTk%n#Q|#8b^p%t5z@%a30FIM9K@BV0PMpZXss754AvZ(>1dfXO@a@|v 
zYm1KR+E_v3#U8k14zvGr!0*qWhINJbRO^dJWyf3zeIKCpmL4krrVWgfi2{$vT(57R zL{)3JmCs{R??0-?5L903uAPF6zYol>T5rNGF(f@$rU@q!Tz1C;lx$CFud1w^=xqL^ zr-nmdIZ$yMj9onccw&}yNHFcXD0NHB?%UFQKGt_}%)^5zVM* zCByJOSAJ|iVx$V}$pkF9@HRG1piZh!&IoFvPaWQf_PB2nG%u$foHxqoia5@2{$AO9 zmgDal@V(_=1>fc5GN5mRblM#kk14p?oqGP@yVR{PwDGX2hK5E`Lp7CB`TU%#)S(Db z*GOMjYadb8`W@QOJTqHsY$oC?J-Q|CJgCHneF-%OP(Xviqo!*EuSMu;j^Fva9Y+i- zZTq~LEYC8R%;w;!r~Ds9n#`!Xr+z6z3VSxxmxuNCxc5cZak&{up~vT3J4+}XtTr`S z(7zb{n+MXjtY<_wc*k}&H!IbrE%}FRU+59U0(1QBEvEWj{jC&wy;yJxTq7v(;!-B| zh<^AU_(GVANDavv+{b?u16#qHWZFSip_|<6x90uvye>hTxoSnXpn!4 zdy|w*gAnF|a8!%%RBg#^dvkR_3F(wqjG99GAlZ3-4XMAnOjvOPrBGZkxM1a8T#X(K z+ZR^{NaeSHfdqyS%ic+`Xz(W1^G)CYSZ-&cxMfD&uCmXxP03`;bDxv zOkF7}_df8U@*hc0YWC5-yKqzSuU;E|=bQKSnf6+XM&5pV9;?f!mRmeERYEcH$2)9r z@FiR#zaLKCuzb8RXGn|S@d0aLVVk{Wu2=Mwu-ktB?qh2rQK=@n2B#}|Yve^kln5v5b ztlp(Bv>EfsHvro#&(6G{#?2|yUk!|RYq4f=VRXV??Sgf`O$ohx7O&*Dzb*`kJaevV zI=v{_9+sJN4!(z3qD57C0Z+X)W^$?tNgXHl@DNyLSgBF-eA<{w=GN*`LS8sZI5QgkbU)%&;!lU=m)4f%mBU)6V)t6|PBafbuPxu?C zcSY%p&m@l~NQQR48ndI9?rsFAuvj8N)PsyU%ESy5iYytyd3*PvbmJ)^1ns|UZ&sG4 zrVJ>YA1AHhrG&h1yRbBXV{MHQ6`!R%ri?w3B{7Eq?U?>aZX}t;I+?%B=E08T&(pOk zn;)biWrm$96nTn6ye`)!juIC9c!c2Q^FAMbxeOyd!@=X7sU6RT#ci2SQYVw?qxTwr zYpj+J_#{;)p>yoowN3LgJ`BxF0}v5sr1q16_y^}CY}G;n(?Pz%`EQX}1?s}BI+gBM zpt%E|y|aK}Q{ZIgE-Tgb>vxAJ`m0Ny?S~ixACI1b{jnbBA0MDc6Gg5*4M(tz_s>7O zZ~OZGxh@bm%6xz0L^0}?Spb=ViyW3O07Xst?0g1q&HL^Cw%enVyGd)vRc8u034?oh z9V<5Yp&o+7gb2$lGBh8%AV37@IYUAvo$?3g%=|8mJcgzl416oI*HJ~)%hg#XjW4(h z%>wZ;!7dl|y;T5#*G#JFSVf3|0_AkiqxY4v->`zOIJOKg&5LvV+^JCi(-1C`+LOsT zdTY5_4ViPT&6(z^p2Ndn>K6>nCk-C;`jnBi9Awr#Nf`mO#*T7CgiiJguou|5Gf;n* zpRZsyFwZl~4So6o2XxHswdI6vzT7{UX|fbBK)44J!qgv%^x|<PsAdG#3Q& zlfOr{Xk5uG96EjYkw|shdaWzH4l8_|uk^DPZC%+i9YSgm5hE_o@m&^PU>%afOadeC?FEA%E2Z4T@L6=;6NnC7t4-Lo5BbOQw%}d88{-I6Qepte^ z#oEIBw9w^FUR}HSl-J_pnFCQdRnT=t^TCa%oci84`j2)EbX>1?AgN#j{FxN zhOv7DFtR)jVUcAI{22fIj4@&Oxj-{|divq4Pv@M=&K!7R+vnO~z|PUFKq8H$bU)yD zZoB^#dSP(OcA?qJaiG}EWwqRqSX+zACikADJ|2XgG&@Rx%;#R)gGco|*RGGClKYs{ 
zX`anp0gYyE9z&roF)3TgZ=cOzIdE-t;!mKMv~iGMs21a?U_T#Onnp;j3Zg+YL? zixr}cS2HF^hh{cUyk)jmUzr5n#2q8*GKB9=OEsRrWk{P2bg%sIj&PI*z0}XFLT9g6 zyPMRhB^WI}SOCpFyomW!JEOqpba=_?h2!t6+o^aXHL69G=QJfYSR(XcZx+8a% z%RgVqos(>CN%i}YGT!{-6}YIz_^?kzw1-62?)p`gp@b6T*paKx+2_%0{0)NGR%a&D zL`8o~h@S0F1{!ktA*suS3j2gQPze1lOQucGc!|l>7utpG8J!IyR&?P=#{j?fc&dnX z+=T{(MTL71Nx88fJWS}0d`Iv32eFH#EIVi%%NMp+0=huWuiVBp{X8;qP?7J<#4ZET zNhK3q1Xjsw5{~}t)z*UoeLasmZYynx*S60n{_J;sB@!Wk>-b@UoEycsh3){Xy~}ws z+hhxoJQ-JAk)s+Bgf2rk^obWXYrC2PgD4_|AYo&bn2{K&<%H_$D$8HHfnC|8x^b(i zGEcbKae^3I@8a|Oe!H*3IKV#an>|GZwi5Z}>zJkc;;Z>i%UPWLd`%~N;V=O=Ifm#L zvm!Bl9G?t?#dVeY>7-+Y0uWW~73kb5?otS_eeN7u*tpYtP?=2j<%utsb&SM7g3SBf zPZt&Q=q%MPfQi8@M(*nM zx?}w#dKr!GCg&l^+5E*iaUY@JmJn`0{rDvvxlP4AOU7)Q>>&7^g!`(d^tU7RfE0H^ z#?+^-0W)?rBL}>6IV!L$tXX?4&Op;qmm*$*{a1r^r?LMs}v1bG9C%^4p{?MgvSusD%JYOzZoy{m{0K z^OB!ApE-mOl^Ox;rrD}bo)|}>fq#ZA(WNJ(QB(L zErx>QE(x7yb~5HyKq;@>x3h7L5)Fp8((xKnm`p*?3=6l%9GqzXbRj*_+1i-(+V^J& z&ei$6KgPVyHxfMHM!llCzqeZ6qh4pcyuzmk{1#nY8j`rFXQJkQ5Vk(sRW+c2@Kk@b zIfWdkL(0!ZHPN3&V>&Xn6Yv?X(#}SHHt)r}V+hNt=y1_GP4KzO^Cc8v?`=8tt&-e7 z@mW$a65dG+yE-@YI;`~RpVeP+UK_1fW92Pd{`Jlf` z6U1U8%I~1E$C?RrrkhKl4y}D7(PI8?=Xa0LTq8L3v!hr64j?v1HXtQWW8H1@5dFE~ z+R?S~G6744W@U*Q=jYKYPE!tPmIaOtI+z#`o{GS=3)>r;0!lgKzzKyf`O(qx-EF~P z16jfN#0?SA4pMM(8wOP)vorIg7H>3gsxnQ)HUT8lyz=e~K7*P*PQ8Yoe;XeucQvCi zUUu(y!z|pY_u;GwxpUs{!Fh0BBDn|lhp+YT@eFAb-(iIt96j@tv9_9r9|m)D=d{=Ges`&&N5Ac>G1(>hJVk=;E6l^ zOfgACvK6tMbF?(HX3E?@P(y@^Px3k?O4={C#unYje%_=C1yMVsZJb_227E~}@m7_7 zv!M;_#S_+lewGC&ef7Gew_scxsywEk@Y>d|G^@W=uKCCnG1dfk-}=O@H(5ScsB%r? 
zKqDb?eZe`UxG*tg{F77K47x(a{)NHwbi2Y{zs6wg{d4?YQExTfqC_?p_0?FTboeVj zf-ch2}FSsbAZxc;tFxA@O%ez?p~GcRSU4o2vtbt`|sRr=!b9=7TNKzHZXfZ^2Q+%LvUHGXk~*4I-Uz)5}UZxrrR> z)lUzERQJ!l{mxkxH9yM68Il)e=nnD3us2o!&=vk)dONf>6d6v#d}a~Vibx*{(AA<+ zoz}IxrhY(<>b29>Zj~IZ>FX(|&#KZypN{c{m9k3*o?P3$Cayw73o62ubV9qNcfc;% zeKA_ECs%o)w*dZzN_7A#Tj@Tu^GK(p5@EXwrVvKkcOwppzEdcViZ~?F%)s-S2%1ICnl@_Z#htdkj8r z{wx+DqQ8E50V`R{;Gul^i0!Y5T$b_S-H%cJ0}rc{MTXcp4QE`}XIZgd1}@5dUkhmZ z&Z*jT*?|v{c^D+MeX<(r3t82vn%;Lu-;U#)%A-}>7bY>}Msy(iZUtO58W>xSDGCB| zC{C41jGIkAD$s@DJbobN(o<>w!iYO5;>p#r#YkWC**T`_B}3n{^lhm|nq}o^8lfni z*~1+x#S-0>1L!FCsBD%~wTehT-bOXaf!Wsj(TaZVV%*IBs)1?`49-G|jHx^%_MaZq zzP4E?aig-RIDZoSQ&5DDF{Ca!zNKXzrs^WY)?Mk@Kb+@vpFQJ}(rx}qOR~?_u_b_^ zWx@M!+b$u#;wqSeF-Z739#>n2v1|C=&~mw+@o2$4C+U;jHkg%mp?(C6h~_=>!Pmve zQrK;K+ftQ^kpdEta;RO}c_jL3sq+4qSh?;y3<=#QTAh3VEMK>rMt7d`qJw`?O`pMO zk9_D&-dk?l+4{hy(>}8OayF9$>ao>_gwlrAV_gIL=mp8byFDe#?U6$@s6Rl{A5kvh zCXSq5l9gX|VxXM!oj(E`H9;%9lT(H%O(r`M5E5Q%v5zmeI~g35_f(#iZbbtdN~`8?#2j0t$vX|fj zL*w_0ZMf8Wt%imgk2sALGoDq+?%vyRc5K$ibWDh0x_1K1m;&09(re$*3BS`EFJ|@f z%W;qY9@uP0ZOICi^Q$K05@FXoFJ3$+=fgFNL-#YB{4N-FM!gnUmmAhgU> z0P>XMECU+BtODmgJ6>v|WAu^jzdZx+84xz)^)}5#ykyPm>yc6A!f14HB2YaO7t(dS z>0E%cE%>}^)qqil2*uaZJVs$_16$e|-Z74C9bun)g3e-=-&S$1;}daQz{gAUx=$@T z;&tzUNf!2yv`qtz7_8a6(+MFuFz?qT$rYgD@Dg2AZ5cF}h^@KQtJGbu#{)QL?lSk7 z|9;BY*X|-&Po2nzE+~JCzwgRs-0zUZMEapzh-CYP)gjca9R?bcV!OOE|?Y4l}eMjjx7i^Y8i|qRt)^MaMU_ z#Lavb8QK62OeRYJ15us6MX#ABHVu)zSQjR0%P=fK-N$XMwsybTYXo+UHN(fKM8+tPM+4RyGGI+@HSMMB0GQW5)2Bmm;b+I|7Y*w1Y zYhX(P0ja1^G>3`bRrC4YXj?5f{F02Qcyy$xxC9T47_}h?&`G`a-GbI`BX@w?7q{Bb>NX& zQ4s@;v#uQNDUu2Gt>5=Lja#sN`Vl!-tckNToz7sN*+Zr zcU{k&ieRxiz?Nx5ES-3@$N4B>4R#=YJ^HACr_u|KW3Pu?roTp4eNb&ZeTQt+)r84? 
zZ-hSO=-n{Xm#K{LD;ascI$*Fw862|Mzvim${rx-H+(lf0`06c_06v*(M8%!YpYEPb zVKQRbgmA20>9)0!I52xj5RI!4>jr2G74!O{4dhHo8aFFm&4~*)&N}=^g4U}Dr(P7k ze$oDrarD>MG2SG!LxW;hnp?E**%`};D3dQS&8(u9mex|!K{x#DPM^y>YAQusWX`GX zZNPw0l@>J2vH9%8+GJw!L0=oW@# z=3STGjOLI_tK6r-Le)88o5K+T6E$#AaZRCKd6_||-7Vr%<|ue{XQk_v{OZ>*Gj`qt!kt9V&=u1+nL`^4;Kmc8*Ybxn0_`RZdhkO^{EgF zibpfufB4L+1-ns36JHpYrjmDRn8~k91-}za@2mIIZ6uf z*XBor`L_M8?BMG|beP&Ba#3N;I~NnI1>cC9J`4@h86AgtU4BN$09UM@vi|a)Y~4A0 zcIKCJ-N(Hp0gFROdw;%oO`$D3!w%DQI$xedm%Bo799x!ElA(R?PoGu)EH~$VyLSTZ zOY@jx)Vq8ij(O2>b%~G5$K|#@4hEJretSYV_s(nWtANIY&%nz%x~?DesskxP6Loou zv}M%_lp$69ydv_ZVPe%My;(29mS#2rmmkLmG(V>|{NCGp#DTvfrR)u`&6Y&XgB?2b0lnx*;NVSb2Nm$TvR3Gm?wRAxiMj#8^*mGsZn>tUJcVj~NB8n|E$560mKr zTaj}@Pj0Q;CnsAqekyE#Z`Tnwe92~Z1;x`hQJTnkGJC@1g^50iUf?V_?Dz^OaV$ZW zwdj|Y*ME4{tH6V6dgr>T7*HEAo4x3}h}h`^4;UKZ8D7?$uRCsVdtD{#sY!7+?{m$> zOKqh*xBNBWGeAbI%XGALRB4EIMCuMqaIz1u%Y}Ic*?g}*nUVDZ=CMzSaB#>Jh9Vr% zk6?R`s$KWmRfi@RiO28`gfX$^(P0ffmqY%Xwc~p&datyX^$x)5mKP8P>PSI}sjM5) zw%_hkd#SWgWPoHwJ-<6{IOs$*UG1Zv+z+pP_rm16XB+EcZp5I0ige)BEuUUwxg>|i zH3m`j_LhQd+gc*8kwh<%o&02pM@zX9yOfFdva{C_Vlmp~ZiG6Q+Yx@?*n~A2tK{1r zOb(46T?%lN;vtdR)kv0ctSmVU!YRe7l=r)37Jl%C5ko$Eod}^}b64vkGTMao()&Yf z*JQ%R7qY|UCVUN;lRCKSmj|zpJ_lUoNftf;LHQqSeK($ZO%Yw*a^>iHFXH)*^C@xD zD+*`*S1OPH2Rz;+WjDl)pV5U<0RrqZq8h)`7|{HnYOfNSjb|9R5IghNQfYC6DPfbe zNdmfwPeBIBV$KGN^gFKMlyQO&@|0^_#frfunD<5OOx*cJBsreTSXNjUQ(+}1C(gMO zVD#Okt=#Ll!tb-kn<&1cvGrmh<=U4Jvt})<)AN+bh*=vaXf;Nv%bJS(aV0J?$zv-D z;cK?_`2KU4ctg1f!lB_!MH>|@IGj>m;gGQBS6e0YI=eR9avmb=po2uy*cnW>vgc+e z)}Ed2nW87tMziI$FJEWv4oQO%jx#kAZ=zQe2Vk^Aln0j8Fg-))&+k;cDXrO{v3wqd zUVLDq-Gh~%!qGK%m$|dc0SrMSGv@zSuF8C0E&jtZU01&vdPHS~_5W1Om`FcSNM?N+ zphgPo*HP%$O_i~FCW6To+pM}j9@wR~HlWUD!H)vt>s!KOh*g$Xcv8L@(d)*`bC~FN zgm4(&Ez#4BnRAlUoOI}q_7U9zP?`{suzFz&(2Lt-c$IXI$~|$xpIEa@tNqMcPN9}t z&vqy$T`MMe)KWZgI>~ioy8kB&!@kGx1)3`5JpNv%w3tI=E{9Vh!HJ|?Rr}qrfiAM9 zht$lL1M5d-%$=m#cYkN!bbL6hS&|aQ=GKTm;nMHin+7*XtRYJ>5+=F8F&Uy_i2~o& zfhqq+Q+cvC{HG#*`asD;Tc7-${{4OVyN~a`r;KUOR#%@k?Yvz{33^rhvtY~1MfDFP 
z_CI*+|9vMQy~A|p>qI#<9t6|_sG$#g|1_^SRq!0A$T-I-fX2sTCwA?Nw3CAV?B2}5 z68O)2^Y07(eL%{1pPVQxHi00gR#IJudUrqa6nJswR zZ>;}U+WPl@|9$X|?9S54N4iL1-~&qMlH>;mwYR>QRCOEWnQZPwhJz#ISj1aI&rKiv z1!VpjFn`uhZ<2nJXJj{MP*Ak=t;u@-GE#UT>Z>D?YRB^_t30+kw#|;LDJq`OHjhS2 zZNF6IQMh(co&9^%{e6JTNOey7SoN0_V9(#)%~Ya{J+h{5{sQS*oT@8TFM1% zier=OYJm!%83fPD-qMnV$23MT!p2HB^i7eL(fXw0uN5lT;=^gLF2kigib?U$asN~J z!M@8bGs&tJ_PI5C!T!Dyt19EKQ_)~~S}aJ*F~RJ?^3hGM=l=U)Ro2~BwzH+tQPL;t zO|$JHosT?z3>>ikRY&^!q<76HR%NQjAN;3xvc4*UzI#<{tiaOi6og9gS54zLQMEcP5K-e|o$@sf6DN=Bd87Em^Y7$Y9-8)rRv8c49T)O!G(>G+x>}sQk z`yQ6BK>~m@(Xoz5ub2Xxn&GU1RN8!tgBaW49@kvmG{J#B$05N*9t(7>p$2B&ds|d> zsovwp?4MPPe~m82z-+QRDh@p(2FdVA9PUa`M;A}7w;i-FrcqV|O^P*;)7Scu}v^>vJcHF7iJ;{jW~>XX~^& zKe^vOP#eK(zvcffXQly^Lx&V)(kYP#5Oc;uzA$He4lq96p5JzV>e8P#nms_^0IxUS z75w(sApU2lxJi0Cat9ZOMtQvQaYv(HRzTwpT{K%-vez3c--F7J&tb1lz50`nLpV{e zWy_i44&;*I1Zok6rQTyO)+zt%vx#`c_?28OK37f0F+`FQG%d8&bUQM3zTP3k_oz*E zG+>%qt6WZ7Sw$s2N0GQUzG0!S&(gI<4%EM3JYAo*xFvoRw>|+BRaEoR$Z9$a^lP+O zJZHEITXmGtof#AK`1b_zXCP*LfGw9~<%2P;(?hryDzI95+{xF9{Gq&<4(l&_Y&2IG_eJN|gW-aLZ-2odYK)GUF4|M|xgO6tk!f z(C9EfGJo>lgXbFsoVFX}p!@nY)aLxr&+Tv?Gj8d_60O8VM5fkE7HEZ)?ay+|qmq7k z&TS(%2Wz}&qnU_ql>%7VyZI~WrXDm}jDw%u714nweryM!QjGSGPJkdiGjtabu#dv} zigLuPCU1?HJQ6=$zyDeE(=KJKY<*A}jrj%CS z%-fEcb45yxnRw9d$A=jrQB3x4BVixA3_r&BFJ=3439}DXNRFRd`K(4-zT-h>I(3u? zl4A`Sh2q_LHbT?W0kbp&qei^fXUq)QZM2zUy`esQP8LOO-r7 zVU_>Z(wdcb3bT2Jz78CXtmHNqtch7vsq#0Q4e+BXmz|5!0m!W?|HXWMn{Oxw@yS=t z-DD+~i_UZ^!tA4^Z{I8EO2ICOsjt`{K59tHy5|v$~ z!cWyVrz&|2YPD&`%X#Sm6G;|Ady>8^s=9eQas8Y21cj6yZ4q0eFfaNdZ6x)Hnbd-g zw=}++0SMSkX^ilMc+e%J`NVwPx2by#C`LM*wB%IzZt)g@~ z&y*y)^CTo*{Z}xHj=d<@tir(+yAm@g%)A&&C*|F1qVu{*@6u)9)oLk~uZZd1`?f+g<>&H2nCR@SNXIGc+7$cDtM?mi>pWty774m* zh9qBEp*Pi6i{qgYyagfIJd^sfUHk6y7jF{d&?NP!^9QkGp|B+KEuYb-+To2NOnWR! 
z##|pi$HuV9C9Jab1#3b^txa`}CRc}Qm%G3gR=GC8e)m?dI4m``#=&?h>P(D#C#yGtn(ks+`-bVL^AK2;ms$>~zckAK7n5z>6_Y&kT~+!Ztk zBYT6awu-jveN;Pq{%*$Z2&ya3D+-OQyz-`1L4-hZJysR8ngJ2M5^P4rxeQ4uLZ%NcU5ZpMa4PG;) z5)3cm_f|42GAw{se(G?ug{ssFgn1-vHInpV$aEj#E%PKf8x&i1@%)!ak>G(tbMNs~ zrOlW}H-A5(meS$M7s7kC=;HxOJgnu)^(Hb3`gDWHiGj+j(Q${kw?~xDa&|AeT=K1K zBi;B$0MVjCiw8lG6qSB$v`&V#zBGW8-w`F;hP+!^yWM0Ldr{$ES9wk&R#7Whm}E!X zQDt8RD29&6;>@`v|4Ljvj$bd1Re`3zWBLl4T24m0KZ;MJJyuyLQ@8&@017*CT)xez zU*qDFWxnS2eDRYXomdKozC*-v8P(^tCyD}9OyCz!2B}67KH9F*(+pBIX|HSK%zo+A z7zyT=ywO&bxWm0=dU>1G@W@WiHb>Wi2QmpJg1r*TUlE6jq@!;(Aqn9(|Ra&H!zX|@0Yt_DJUYXpj39jkTiC9K z_a#lU7P4QuES+4VA|0pW&TiFs-C8LbV4ckWqp`LMbRyT=+r{n5vJ@^1JcxjY$Ov&& zcD+NQK!!T*wYmtPraP90C0UerH3~0!=TyP#6a8dTyu!8dB9llnwXJ=tf^)Me$g)h= zeM`zd;e!U^8QQ7=ErlPWa$t9MMKE{kzo$s-9=Jh|zg89%UM5xEs|K5E8ZXoHE51lE z8}$NPQf$=AY4XlrOoATP7^&RdfQ`nU9~DFLvPWh@DQFYajcvc>4d zL3x&oGgan8kI(EQ6b`sq!-XjWJa%{7f2g1#tW35b-(fv&?9=ZWI(kWt!D`)9Rm4{0|VAAc51COK07`KGI)n zBhV*200X}pgsW?a4?QjVNyq0gHNZ_9VoCcA05@CUoU1UI8LgL7uYON4*4B8ByrrLJ zQe&bgjpo4{=rr8JE>{|zng`hs0Ui$D$$m!kR#}Iu@)pg=m`V%)VQIapB`|dt{Q7mc})p z6aG*=Q|%TRw?>Gg#GQ4zH(d2J@Lcfo|JuQx-NDUL7Z4}*!|vb`;&@5vFd0AKfV|w} z_ouIX`=ABY)K_Ek;#h&{(m2t{y~w;~zjJ_LxH`Gt2rj?EjpD`-TYb$dQPtf_h&DVy zE?ay27PAp@H|UnmS%R$|iKlX?>Vib(9*J(AJ_n}e`=7p{ysDTx!+H#-`0XJH=0b$WVh zv`)OXIHeVjP}e>Q2Ja{L>oP*($w8m%d1HzNlQ(6_34~{tgZelUttf!jo-l2+L_CMJWgvoildjHv))ymG+oBCq&6ONGhiNtrc63F%0IGDV(9juJ{%cEgS;v#-HkiqyPR?&lpjJIYV6-<}foBQ^{17}rd)#BDqf4}bR`G)i zJ(Zp*E~c1S*}}JM1_W(Q`{`e1Pb6OA^XEoz7Q^j^Z*z;uH`C8dr9}H!WS%RjmT0RB z1AS=SIJlV-R#x01uFl;YpHwitNP7iZFIi5_DU1}N+^Mhpe9${c(2F&onwhEbL*!1B zl|4|Bup^%+&i9-s?)A1WFP%6jflAK;I5Fl^bMTYR4xclyM0JNm;Mo-5su>L zM~S8Vbhb172J>&>&6RHdg7<+DYK%qEoHL_3BkqHEQ$3@WTwMDN3X*ka)re1DO(+GL z?~3S`pz^Y=E+01=veyWaSNQM=jC+d={DwcYfgrE8?tY@(}Q;>*j;J z;Iq49?jzJj>aDj*EILjrNi1n&jKpEIiT2BD*)F9lns)*Z58pRz!O%+^`f}mAw>;o-<9Q^wTP#K_2pB1bvsZA*MyF-n*&VZt@37DCIe`~? 
zrmeLF4%t{q%ZPgj4r}s6G$K11z+un8VPb91utdW*(l`nup`F>g!67m<6(-5B<*j(; z?>EUE^l9#91wosP%*-d?lhmIzbQ91GWI*t(e_>LOI#7Z+PyU^@ac5yNp-@!)0LQ**N;xsqR-fU0KQ&K z8{Ud1FkU( zgggK(*Ng5{gr@Eff1h8&k6j(EYOX)>!GAGfi2{`+Ul0ECv&Y>c-*4|NUH+s(JUtn# z1x|HoH|bk4%>|Q!hGjGs&$j2`ZY5$<*fxhPBMtzmnHLA8<1mvZH^7zXV7RhUTyY_Q{(b{u&w04$0x} zPF~EaeVwD=vYe;~XS_}FXmi_@?HSJXX}8SMpu|psW6PO>Le#@JeP)7jJCxSWP)DdT z8U`XOG^)~}5w>UdJ3ox;{lb=H-_dEm)lSHlKq7tGdQo8wX;0-V4vB>#LB6TIEw&FYu|Qed`5nM{{3dIF*lC#h(iQ0cu#47#{|K)md{N^XzQ|)W3nf zKSc@1|DKKV=_U&t1Bxi*gyRik(-wIj~@L{U^K>xGj-BFHOH-IYP|I1*~}6Y5Ex5k7<`uU zQXFBk$a9)TInQ~{WiR<7Z&9O-e*^7^h(ruevY=bdt{OJ$kZ zOqA~JgEqb)$%7~vq`Knzg4EYKGALg_L%dqIK$()O@Z`e?AKsA__17y_HNqF>pThzX zub9*>K#YEpn9vDl!J87Z1%!=BodQ>$+f#8}_eqknjc?6rJmVw8__yghZHM@2P)rWG zO&PIy%74)}k6CR3q2spYjvvA*&-cO6wG83PP(W=qJcNB<<5BqQnlo|tP5m~mc=A;V zDN2me<+DFZVturMA9V7t3m2z0t|neBGn}6ACZng=;iZ(h|3Dw%-<;~w$DzQ18=0_h zx$WVy`@QYy{p-t}0L#`7Hx+Dymeo8K{5+;dB~RzuZmCsU4fL1q=)H9gIZX;y%aAr2 z6kWV`z=+*bNlI*|>A+%jFwN}JlT5;W^XcXL_3_?uxw#}`?_R@~nzzB;>JVX}2vNyj zLunEsrTwcJ0S<#C5pdI|HeD4~12lDN5Mj_#Y4sq#uC3->sk5aoEBZBiKeKM$^FKqH zq_`}0MUx!@mpe#g5~_!!(NXxI1?TSn!`D}bwY@EC7nc?(1&V8NcXui75};Ud3GOaM zinbIfP~6>vdvPl+!QHhufiKc4vRdmc)UV)|U=KbDy zPT0fabvuu%D*9Fx{HvW16OYy7rRI9UyWN~)yviCNn481BQjoxmmBqjx;;N45jBawf zn-i-*a1UTcrV!Hby%Y3Cx=xWXcib44qdu7*pIa8VE6>@0?4hJpO+h_2(1HO7~ zeWXceHSu|ZT(`~}-*5EA@;E=+=;FIcQVU`eE;s_$^JK`{9;vNYh6;Y{8b!#!|LF{E z>unwWa8=*Vb<3bt%6eP1ttjZsp+)HSOoh`}!OhO7$jn#`o4O~h{d4abRL5nYbqbaa z9B1|2afW6|j^TLyD<)Q(Pue&lP<-qhS-QU&RGZ2Mu6<7tIa{oF&RL9KwZJsURe5A$ zVBmMV)mw~mZOH0#z2tsmoXlgNzIK|_{*d?Mgl;iujpgMElO@tM`eIlIro16JcdN}T z^Ow+6ghd;Nu|5dSrGrS7e5WG0Z+BgfYeUX!gP$SZ=)INR z*LMCK)WG|XE*VmRU2V;dF}&{IY7(*y+!-U^68<9y@d9h*OaWf4Gvj* z>@z+H&uc}13!DPq-}@Hyy+3j5`U{y-_zWJlmo7D32Kl+nS`L~P&^&TDBBT7-?5BcX zQL#=G!TZCSUV_6||A!pa__bK;3an?S@g)^)x|zHk-S!7FWzpvw>E6d|rl*(==L0mU zVXT13HH>3r;af@R7$UkEdpwXyZxmb1UDYDiR#olEFvnPw+-nRnm1-j^6q4#yZKNZK zJtRJ-&F~@&gzmk93}&7smoKHdb%TOY^GI~GL#L5^Br-o{l>P72Z_kHVwrVGpoN?$0 
zxotlwi$FBADhz-mH>azba~{TuFjRefJKs!t1H-;UYr4!IO0c;eH|`MhECgc+TjCjg zq`SKiW=-w7b6fvNLtyBsMfWjGO3?JZf}zivWLfVuHTGM8Ib07`JWz$z3&#=E>H!)` zWa7Z^I$d*K@x4YN*&P?=)v;5l&Bg$1>2ykPo}&}-8jU`e!;@)tJBkuVTqJ9FG+HXr zX6bBnKUrR)bzr27pKo@rP{nLH>m|(hXZ>CmtcLpVgPH`B4c)ekR4mH)N2Zwhv5}JS zX$M?B33jug?`49HQlw*+GV)cHWXP)VOvzkN_j-ljCBuagx)g4F9(2&PHfLBRER&Yx z_8#G(6VEV?8q&2+Jz?L{*5{z~22=fgr;8Wi4Zre#{XA z7U>&HjMRO4mj!$TIqjk2Nl&bF14b4z;}#jOd+wQMz~OvNpF`{QviH`*#Td^uGi}fr zTsRJGY*_|Je0MDE4T+?{%t|^_Bvax~6FaFUq}KC6hV93oiw+3V#l{HF3MZ=i;ypxd zPEcwdsyb)B!EEnI#IPEVPx(G_pZbQ~(~gp&8a(uyp5R>FbvTj8fo#efr6(pP)>A4# zzKc9$JrJd^31$NCNtl(t91@C`mWFwDv}Ceb7tAffQ$+`|3CDScUF2>30nVA{qcFae z7{K`!#f*fQjNM?TLc?}H1jI}^T0G(~4HWDNBn)St`>P{rW#4aKYagpf#Cd}08xwbwP$Whez4>8p@^cg}ei%MS%v4nl=W#WMP)}OH0A=%}+E{&HOx0dG`$?MyJ&jE$iVP($CDRE5YE+rBIr^u8 zjDX6H&7$``xtE>8xlh;7nr8{zA%gY8Ev{o@a&6_t9_a*fSPy1~& zT>+{Jhnp+?$&5NeD%}Qg61|vWx}JX7p-S93DQ#N z^m1C_u{JICuERd2Romu*z*WBw&SFnmjBleCtg0U%Z@S>osQCSl8V5{Rp%}4Mzi7~5>)`UMQ~c`e&yI@t_d+HYt$I&F$U8x|p-J?Ki`h5A68^=U&KC~? 
zkDde=ubx^9d1v*R?iR_5>QeA4?l3t$-547RyifAZhPBZ`jTTZ^jf1R8a@!^$hfAH# ze_g-QV+Ljx+H)hXNAG(k`_=&`HQF?Ox;!R3{mW${fdPyH+HH|jo;{DvlTS(sq>v-> zLkYX^^_6i$ip`6z%nryyCsL3w3RG!!170vdi?~8jT!6%I(0DnJP}r2L(n;*zpwq*s z5&i%cP}SLlw(;GgbQKR?UcBrwxb9p$^=35iGK^k(xcapoM!QG1TBcsT9rq9*#0VuX zAvZJ_U*Wb zu?_~KqtTWb(|d=qNJ_WcqB;r*JqP^2t-4CNZZ7qZBrrmvpl&BuAOcRQPN@;PsDluB zrB-0wn{61zmdeB782m)HGgQVvvP$|IXHu(S%38DxSE}*Xyxr&MTDFx9om?UB+~&>9 z36Id^VaqzK(VX#ys*Lj9?+6ivgFR!bzw)?!EGV<7Kl}Eaab080xHp2&D(R$i0$B~~ zBA*uMkz~?Mf&IQBRn}t}%Xq!$_!!sqaUlYVG&YN+e-SlJyq#UA($z*UXFL9(9`G3M zT}WTH#NB}a1Ygu6p{D+FK7ay`K;xqYHf%G41^!B`8dw`z081AZ_Rw_M_q%j)J|ATd z$a{*+5A@fzuT<-=+M;3MkkC03%PPsfx1ySSck~G@&0NYxJvMI8UgKE3S$H^Y-Eq)I#6qBcIc?7a0CfGGYo>{TbOuttlY#^FuYZUELZe~oUMmx(oN2xi0& zPHSt#Locfy;Df0A{KN=9!w=z42ld>T1>18Anl5YwseAy05IT_+5*OV%&}h&EjhjE>>F+D+d z!FQ$SSQn*!kh?3DMZJ>Bx{Yd_aS{Wq2(UI36;LsUC7+69S6$4Uw*TI=)<~x5j!UHMmGLqdzs`; zg|0V&*>ED0-tc9Q5Sx(D(uz-))i<7Fi$`z#c)6R-XL1A;6WuGH-5&!?jIV9{m%1jF z-Ci{!76y+6Yv0B^I*fhDR%GJpj#nG+ecx_Kz-ftDCbM|E?2VqtEiRX>e;_~RZx01_ z#eHM>tc1JbP35|DrXPBbpSUa?C%IDsc}V!_?v=~N{x3Bb2ozOwTBhqrQ8c{5fT<~i z-#hn%;q!wcsRk80bUiyyP#+7g+Fd;3VLphoWMOu}#WEwYc~)8muQ1w-1coX5`{u3% zGsI=u11LM}pECw!A_t5S)1H!Pb;2@`_&2Le%!g6l2fQkZmGS%-GYdl4q6i{gCC3_l z%$n-#*5Hg}uqg_4?hfW>BwFo8-w3!kgC4(M7vt4JdYeNiu;sXK_=AGbHYC%vZ6yJK zDIBR)TB7%)Uv2dTSH)H>%6VZU$!X_}2H;jCR-^c|Gy>NV->12~EjCp1)rn-~eUvsr zrx|+EpaM9QWOuS>=O77Lq(4fN!BHXT)P$F}JPNv&Ra~ zc{;thUhxq8kPQv82fAcmycTXboezl>dT&Y=bOb@C*79?G^7|7BRdl+KZbysf zh7#W81_#UUF$Cq{psNI|e6tu!A)9?sSKBs88U5Mw*Zr5uG77xW5A(#2`x7rA3>^@O zjt4Onl!+|}tkVPZ`-Id0eS?oiFb+SXMM}@Q*y(8bj6&dhbP1v104msclS*-kT25rf zY9rs!JWG#&l@`w$FycZgpR>7SBmyIcWU4~|8qEXPz-aiBj&lD;AXaR z6dFW4`8^Yf3W=8JjVq}$fW}*OzqjitZ|$a2nhWm$^$d`zzJrtMS7h#yPoFVR z5RP-qRIAozJ(e%#XEp?K(P2-2eeJfq!LgTiU0Ga>%WJpQlIAgnAVx$ch++f-NBxn?D;;mCI{Xqs7+_%9X=M7K2ww{kYfcKA_A8 z4MStQSCyu4ubaOwkkHybes3}ab5j7H+wgq0dDepg?>vv1_$1+pp034b7Apl_uYpPC zw2HeA7M>VC*5Spi035q{3hjGVh}F71T$9;+jyjQLgsFcI_w}aC=E$-6a#TH4J09p* z-aH$Jy#j0y<|Xxxd33~fZCki=L&v!mSc+j-fUI2g{S^U>w}9|$l2%K 
zs6l(1`JtqL@7(8%*X`H!YX{3@_;9$TbGTqQlEcHx$X&Z|>`--PnpaUe+Qu$+E~K* zmPnYt;(9e>2r$3f{Gy!VL%W|DA2^Liv2NTF{sD!QHAKoc%A&<>*b-w9lNRo*<2i$% zWM#9iuk#u?-phAcmQ@%pmoL5xlgOP0pi5CCeil!L4KyS4NfcX+EXak&s8{(QRs_%O z;Q?xED^-3Y86F~PjCz0o*jN*U1m%Siyy%i_4-Ue-bcp6c4&}3ujFK{J7=<3!(^SRp zz_fA%s=DNOuj30=(HW@2i3P4Y2`&)Y&M_5nHcnBoy8v=UVmJ!9I>Q)IFI(7@*}^uC zhQM~YzVDPJUw^kiym%H-4+ugogGT7Gkus=;DQ#TC3rlZav_Ov>)Vk!QidEmyGjND5 zq+fs_P=itVqU+uFU`}4Z5O(dWv6d=`6c{%T?H!M+f_;2vcr{AnMvMRjcN+35P7Sq- zSr8l*{qgh4!aDgQHkLJuiUwFjz&S9-psNc*ek$wRTMd~CY;OPcMniA#`SI%+JBrz7 ze$Tlo0|<@rgc;8G0_Sdzk3<%P@yPZ$vmJskj@wnl7rZ5Ez0Ni_?lCe9eMnfJjC{k! zbQwRoV01x{SD3znPlZrzY-v~7{880Az_%e6-M-u}=|)|{T=^LX9G}EZ0myzT3;_uo8!uE&ke)WR3mkEwayR!_5fy_yl=BdJ$6cKE&l zu0v#Al<=x=&#JQ{P@S1U#I~BZc(CpaRd;G;8Td%q_~H*%;27bbVF@+2k7e|)75KcB z+-$>oq#Bx%+d;0~j^ZuZydc4+VCutm@=rA=a@e{vT0>2}1c_>4Ow>%YrHLfqHI{o{x39(^3!sm{mZRP##Lwlay3C=czQ$uHhc2EC;MQns`~z>)3_U_K zY@jiEQP5ItiJbCe`V)asc~+^m0raZi%TAj8E;t55PqQi6#aQ(%5$P_MZgtWM)OaBi(o zhaS1<8DIcrW@g(mljboP$>+mj?P0?ex0OzvPNOog&nX#yAvq`##>3c95QEL<^>Q~J z1p%YbORQ4jY?Q>|P77^g`RGbpoFUk?S)D7>rSE$3DT6V;;wU;(un<7hSJY! 
z3ak9aQKw%_lREF3;k<9s#qn*dc39vdCE`0T16|W`QY#{0QZ=*n0I1|$#yO|hL3KoRsfx@AOyb`kQvuyr)A%$=aBa(`n8UMRl>Q1S8k2* z_0cl-Jwi#pTn)cg&odBA36oEo(Mw!lj2F1aPc?bYQy1&v_2V037;E3q+di#$&Qe9= zy21!hjIG*toJDc`N2FGmI&R+?c%gP(KCYfg|1L zx_bD~fE6$?z|0lKq9A7rdm4+P1t3A*;!H%Q*ssy;|2Vs1c$nBF-%&sV+h>GN{|>IB zf|9~^qc3Eg{d*lSEvLPqri4)$i)3tb)C~_#`@)rn;3@eZ!a=QFi4Po_T5N~Z|z$urlfLZ}C3H%Lq|3EzMv>~_ZZ7W17?-q{Z2HKUy**3b=?tTg7lF6iaxjHc(;lky~{GUGl`;CJS3cPqsXUeN*FeZ+Zo^U0* zdmBR;tzZ6u^ZlPcH85deHd4`Oa)jrj=AvN1s9uYFHf?`u{YkV68UEdKkZAgzuEui^ zs#go(29W$eOZ0!T7l>d3MLQh(s2gWKFkwWq>`A`4G%B&|6DsKaGhCRKJ`hU9F@RS` zRV1a?#+G`V{G=4aw{Dtzbuo#v;h~yUCHNa?)xQ8OHz@E<7d|y1E;dn9KSvIw7OgMC zZ>ej1X0Bfu{~1e+MfNP1n^IQRjfUma(qN*>c9H^zEA1Jl)*tg9U_@XDVLQ?IBVesF zaH(;Gh~26cehaDj^`R&%{5vVa7Yn6Hph1vg0Ab0WTjB)Vl{ZJCu+@G?4vviOCMb~v-_AX_s+G)UcV02Wj zkV%W;|KLmh4QmeEkR^i>OE4}tNi>!t=viSDdPVC_)U3!@tu)qGEv7<3LW^2{|Pquaw*(S8!nsa}qFhcA({ix_?;+tbh;BXomnGecp?S z8-Gq7ijX6-WB^2Rr!p4-BWKcbuRp2hxutPVZuYv0^atU;+FE@SqCToi0oMU@e7GK^ ztd^>pf8x7Da9S;E^&0pDm+Cdj;EsNyc#D3*Qe<59S1{?{OY`4LMym?jx!AHTPIZ{W zRA%g4r$h)Wo%+$UyaYBKt*tQao4J&c#sQR->D9?h%rYf03n3*Np#TLgJX(8KgY$21AUp5S~)1^%2+M29%lrBCIik0b;$D#)RN!ANn;9+G4V4SZi zZJZ5maqW1YQKc;P^2hXNA-iD?jwDow_-NvmY~jb-p3Sh7LVTO+hh|Az>1`6nc#i4> zvXb5)sYh_O$sDEj-uxCS`Ilz)yus|(T(QOM%SO90yxEb&tSj=Eq|+=;0dW^pBoc!s zFBarlgzTRz8?=`0sx3z~Cf|~Mg6hLS4?zV7`y|m{0g>M>Q6>P2S+r;^#E# zlL!FiiG-Yf)NwM%K_05-YRW!x0>_mgZPAJG#hP0iJn3$coeQJ7i=IOQ2V^4$$`8KgE#gRstntszydIB4xOTOc zk)07?zd@V-^*}#H=q-U-$L@l-ED1v56?$tb4^4Wl`;W8u`Xl$<3GF*Hfo?saHi+OB z8S3l2cA2e>gHL&X>Wd5+dtd-OQHX9Fnn;TFD1E1G-Gl;<)%>X2EAIp1aEp~1n*~~H zez!hGu|H1mI0}566LAfz4H2#{Vad0$Jr{HJTk!7Xm#JcAyDGHopkweGpYkm;sd;H( zJJgHlKTnJ-SvXFPiWt=v6LDRl5kkF^Qm2mv{0vPhv2v$V$W(m@aLc1!cTB4evQq#J z-puf=i;&oB_vO61ON7L=9f?*o*-htyYOS>j0KiK2KkWRXwy>S=qJxa$mD}=is!9^y zp_DvCOusibm-cvRGm4w8XRM}DJ@T%y$YW=+S*j}nG+au==dk+~?7yTt{RWRQ=rT+1 zi@@(CMdb_yUgAe8_74~G?zzTW5U$KxUrE8tFvBj3Ho(`L@tm7RKt6FenSQ#sVEeTL z9PgapAvu(sj*__EbMX#^QLje6?a>bia4SpE%=0dgW|zeAkoxSJdL>&!AKbA}f^Q1TR4v+K4OI@L~L3K{jP)ius$9_?Bon$HEUZl|WDv%H6t 
z??Sec%-=6yK))Ar(V#3MF00_SfjV?J92*pKl%AyJZ*96> zrbJMtWVNj+N9}<)&q>psY-p-)sMNqCv-o zqp*(QwVYCnW^G8l_Rll$#U?jaAT^qeT;Gh<9Tq;chC<8P09JCVWsPpcv}01gp~ST}!K|njG*E56VK4%=h*$$u zGkXeUDW>QFFJj=TseLn4$v~Fxz#}_43kA5HuII%A3^WIRp9>ag-bK8D5WFX>N=H_D&}fNCe&mbg@!Vk!gQhQs18~ zJB|tYlXhyricN>%IcCT3Lo;<)o#Wq2DUR&ve85}Jmti3$FDx)~*&w{Hq+I}Nz*s1|7#hIs+x%{)X!=+U%pa*JsQ5z2 zyw4C>_sxHAcQg4QI__vH>nL{?CEBc<{=QAsY$aFcV5@7gDV~!a^nVfq`_ekbDe>d7 z#>PLTaH&tvk!>}^euN_ng!-I#YBuC^{++K=G(B#a#;{~h7C^FXUQcFL^H=aDYg+Bb z9QZ=Va%4fTG69za;b5hNsg%*Uwo7ZYdWUUO zKA(Oa)UW<*L_>9wy*ssQ^*%%Pjitw(acJ-BmwoNi=`vEro6&IpZQ_se?0hL?#ybN}_rW;K-uq!}f{_8}NrubUTX6A`( zR~nu2HgMv|lR-5*hv^O|`c6*!pgwWJy->H)iVUIuiiyx=eh+@W#yK@V6Pw}KgJSxv z^48l!C)Ja1$}P{>q+UOeZf@SRGxo34kj;ApLWIv!RX$UwrBm4-H`)Wu#N~J=*)goZIZc*7g{C^buWUDD8p zZ9*19pK4#A3kJ)5LF9@a14Ws{^yuYov$NMkW1;m)qgygSOIUTr2!(Cd(B00q)FCQ1 zKq6qEsfKnspoO!Vt6rMO&A6(iTo5x;$W*DS)ZMiMb`7<}$|@;yijnNTVp2LSA*X0p z+_Zpc9pd{N+d=yI0x|r`!vt~It&bfX$`Brf0k^H`*F#YD#Gy$#htj{hDRM@KbL#J; zz*SKu)wX-|dkbYIc&YXU!!DEh&1qy%dLF0AUsFpf03XDSC?4lkUiIAuG3t{dmz+5RyVyixlT(dVDXi*xb%?elC2 z(y&APD>s%^^e6j6r;la1&C*2rRdz`}{GJJnI*sZ|q%+I3Q!%^e(*|jT$;btT?xIZl z#d}T~6`MV3tU9EHuJ7{cZ<$FZH8cBzyH1K`&4=PN1qv-M#5eqsX;_cII_!^W%1e${ z2P^XBP1_VaBY!{T2A)e~t^}k*jSB5N*^0s=v%)yYe#K8#d1W*G$JeUcv2i= zKMr3GCL1!%|4> zUf4P8NmZO0r;fJnKa|oNKHH8?Gql4THhMeKlNIT z-ar$P^nEK4@>+ElJFzj^HkMN>#hctLGdppJ{OzI-W@{z365$601IzT>`yu{*!^9T4 z-JFdlfriai&B2C31-=4$+bjSIF*Mf6DigiN?K}BA#Z=ro{E@ho#mj)Z9+&v_`W44C zm&2(U4gFfn5W%O?ns?+}bt%7SS8c!iFU27AzAGLU2A+n9%_-dHvd46i|1q?AYe&0T zNb#0>B?T>+CWC#-*)~)mg;x;@k}n8cVb8Ogly;io%h`%zqg-PD>}KG91ebi)idd%J z_z-bn_Ty8VIq74dq5qLk%}RkiM&s3;{IKw{7Na54D;k9_jr%7@Ki0apPX?XGNOQ^+ z@0!gTBw!QLL|H$T?$_BG1qcyl2PURV5i<6-9G$GxNKnSd%oqbRu3(hIYWJ zq<0z;`V>u1Y^6ZFIJt(z+aLYPm~nu>Aj~MdAJT2m5^?!sV9Nz)0g&<{j1ZDd8fZG1 zyRad(v*t${c_<4Dikxzt-@$U#XFg#kBfDubqqQo}DlH zQsA8orHL(6k*`gFMA5%rhtm4Onv4={(J!33gGbYvq4_|WsL63V?P&KjSCg4e(4F z<4);dpLcEt64akvx2=TIg&Z)JwptfIYZhnfH&R8E=e`lU7(xLO$ROz>HW}jByV~#@ 
zqJ}d@%=%r%YN_XhQPX%Utkwp4)fEp!zrN-gl8H4pl|5?l0s(K{cVnrwTv-#=QPu|b z=L3{<^W>_Fa9qYfV;AA zn;nl5)w~slv#;_8&5|-0fD6!?S=3E6O80jgC!}sA1^HS@@$uO6?N0#%-u@2-PEEQ$ zTo31ZnYxLw3QOT($@RJ5dy;b;hWs(cVA3BD3*+L5R0quU!YI&Q zNVyEsuWCRv|EJt5g z&pN#g8wy+p@-|E#kj91z3|r``iX>?^#4S3}r1Hf!zo0TnU#%jvX8} z2ZEQ_|Jyszhs_a`7ccPR=yu_Dp7C+#f=l-dEfPnoX7i2wxjR4I!hAc>=+xeX<#}#+ z-ROUjHC&y?P(=swx!lrlf03qS$wKOPSKV?w{v65}Ow$acq^F0*T|~c25_4Tyl8^#% zqw3z3Fj{%1n>pUkMNST_SMpLreT)r-)ix0!>M`M`aDj+V(hc*K!b^&z!af=X4`X`N z(T2ojLTi+o|HB1Px>6}C9Yw&X-{i(F|NN7Pls{Jp)+j10G&0;B(W>**Dmf+ z-Erb4pjq3EpgVL-=dzi8x%okBs@UnQB-+E?r=DzDG+Dx~&zBsD^090Owne!&m8A zpEEya7z|e=lq%||hB`ZDDbE;xJf5Aj_m9~>ww68^N8=yVKuZ^Nb55w|^rF4t;^MM> zNNaZCeB|h2=Q4&^L=kXlkUW8FwktIo$k+4}jnL2kenGYRmYnsj_GNkdQj(#^kiMge zCer}oi^HuLM)l5s3;Nxb- z5%~rK`jDo8``+rdV;DR=M_}guY*a@@?8X!M;H4qCa!JC?cHdYT})IBN}>D@F0-d3yr; zMDU5tB3)N>9Px<^`)_HefutFtLy059D>hJM@$&#ch$XU*~Hfd-sp zDDn45Vh%JTw9UUjDw5i6=KDu=*8yvNXf*1x`^sbgh&ET?%b%?dk61OaMKyidu=-H!k(N zbvX2hW6!u(soTXu3-%W!5#D}N7Hp3>1Fp-GItNzKLt;~w{+juB}yXjY0 zv_Z}f0Z%73PssOmFUMt&(2jRN9Pc+jRhpq3!yNxTMEO7q-y0>wSDK-&gpZpjy`H-U zx4ErXfqw6ef!er6iPUVyQ_6ye!ujbM&IU?oxvA~Gf~7^uEHlb~&spDg-8UCT3Q1#D zZdE(YDb-HlvZs#~@~rT{abMn8qzN39CfoZEf0w1H3CQ{)b?06^#7bLB=DAD(C6=R z#4h=)WC~=v%&b|QFVh`rB}Ib24<%XDR0|Tbny6KMcZos7OWVvjXnDZR)Q#=xsA*1) zRk#XeBy;*!+v$gvXd5F$=v%7^twTBqjWZQW!ed5rrjY4QPuW{E%5m2)2Figf8@~Q6 zMqw87OpA~6`bLjo|CwrgC8;T8op!SuQ)eWUm_X%|D64=DN6j=(s$T<9Lh8jaytrsc zL^?Ba2;Z;soQe+eqW{W~?-r2}Md*tzRy1m_>1$y)d$)@A~^QJtWJkAeYRN`issi zkX>;#x;{Oz_5Mn}K)w3}t)gjeb*PKBDVQ-q&o1KY@<$|A800hlg#_>IYPWaN@y! 
z*R9q}1iqgGME=d#{rBU$P~atfYsGpJv4gCcM9vFy+)8ft9EcSD+4uiJJA4v_{=kfE zs!1pAV2x#oAozP_74AQ$>OaE&AmV)u1JHDRf_X?vN*X}M|L_vo_@tF52-PwT$5BO_ zN%x?ueP*8D`k=V3OjGZm)HOSviWap$y&8q%Q!CtuXDG=}D<_NIcFW#6EG#Tv&xC-> z#5;@LV-R?*hjW1)-sh(;fW2Nh!rMPocNDmVv{OfIZM^zMiFkva!IM3MK^A=miy82@ zw*eNgAS?p}u^<~W>)pb(kE510%zbx4RUv0%P$fB&}&!B%A`BINOHVLim&5bePpSNEk&fWFqlc=zwe1CJa5iO zRCbZIkd_uTQXI(KHub8iMGzG#Zh_BfQ~&^vHIh)8e=cAL!mG+8)gnHMsjoJQLDtgJ zdKYsBtX6`^(DewQtMt8h1N`$hMPFf84(TG(Cr9sCa)d@ETGZ|-fbv^`K`WXqSny;& zG7);C31>6!!$$x4V^FE(K4;VBUe$bixt}#+bAV4&{wICN2~fl()a)6woP16$W1gy$ z2U5JA!|p)*gPf2&pDUBXgclJB#UgwS+HAnQLwijjj_7f-8=@obHTwA?c z#TovO9s8F6Yp`_S1Etkc2MxYNf#ljvA8U&DH+{G+r$ySu%LEG=3cqrI+h{RBimh=u zR?_BrTad!-+u{|?S?%3tO8=C%f9pgC985aAqB)auzxm|7!QvofiD`mxPtD4at?WKA z7)TeB4BT=C?YX%wHGx@bga7?s{}`~SD=diJ`IMH%4`0V7!PaaY#6PV)3zx*-x3fD8 zii-Cz!7J+i28=n{pAbpv+ra$8oRgr;{d@zIHvgJFH0;ShXC7y!CHy+&!m;kM?oi7i zQVV}KZi>)pCv9*?Aw-nuBG{8GcUv$_J{uE!Mf2iXyQnRe)W+% zTh>|`Xs{p2(1xr(?h~G&nJMw+c-ZJcC{>^TL8wyT5A)*AdZa9B@x{3L;>fg?>bH-! zUTSnPMD&s_%Ol+G6uPE5>@Sw_Y;C3Ies6#)ZhPCFg^h`)AQiH+ce5tc!jxMHIx3Rb zIW9KvJw>Cc8P$U@w6k6P!JMx)1}qf}zjM_%CNBv)?+p`40}1yQr{RnqxN%J#978G( zIR+JEE4$o%5o5B$W5#RU!5G$l4N7dS{UT)HT8thUN*RMy3uweV(RVKhOctg2w-?3; z(ofVCTaO96~(h9Un!{d4Ii?ZC~+SE(%fs7Tin-fVF%?DHyJ^~ zE_-7qnV%{_@g5JDb&(NSB#_Y5G0H)q&pkSJG&hT^)6A7p@^TRZx!#3$ra!w!b_iMC zFR%EJhCL>&;CcVSGM~QdQ2KV1hid(fq`rLGsj&Kzx7q=F@(dQ&C!QCe_LyQd^d(% zb0!P+w-T56h~U8nPx)=zWOCjHt7Y96XK;d?dn|SfUCox;@Vsh%S7BK;LZmu2*!Gsl zwuUg_T+piD(9KXJp)S#zLZXg=j*cQV!_J~s6(4A@pD`bZ*s9Y``4My|R}PBJ33JIt z>E_o4BPRTQ#FsQiNeb*ZN(>3%seC-C2j9$H1rv|&9k+FpNFdYik(24p9%xAk_}z+c z&3lO)umo&~am7BX>U#u!2o^N`nJKwjNQ1#t7x5jYF>Cy!>3X)w9&YI~BR8>Wr1eVE zgh768254bQyRxjX;e*q~sJcwav-6sJtpyBsrF@GmRfkWcb$0lcTr0k`!EQ&&Ig2$r zR#(F?Q_R^RnqkDhvp_!p@P}qt9CUGS*wO0;7R}7ZG(+)Eybl0&0&e>WQ#mJ?b^Dej z+{RPXjA}w<^4EUY6VyRLdidR>1&@c+a>~NxTEuA4mT#H#!iwGYyA^ydw4Mg-7b~xe zhXCG}t%}7L{AAjsouEtbE33D*{Ff*AvK-X>rLcNvdeT=qgZ!Q`cdoIk9z|y~dtWN9 zRgsbA{XCacT+Pf!b?*GS+0HL&R2BME(7X6dS+tx_Qi`vL2eLs~oyZH*@F8Bm8Bhzg 
z(RIQWCNE<4w#6hq&_@od?+xdAG}DT7LTRn!!VQF?PQ>`G&N`LI|Lc{Ve$KYdbCl@J zwO1A@C7)CVM`iLF=F&EY=G8@-&qy{9&*_S64Gt}Cqf>hgd=bD`#3NdwVKElX#FpS% z0caAPQ=$7TE0+tvj@8kFvknCXROwD4qkYd~ApB_|G$Es&z@t(fpdr3eE+8k=96Fis z`tp1gy#fArf;(l+^VB`6e{n_QkF!J-A(SG8 zvJH#j6x)#dWd^8x-s-oQBuD5*J-)|*Fyq-rJP`WjdT|&=$PMm+PwkpM6Nw|O5!PskSAIh z1tfLx&2AYG?Q*b;9l~xSZRq9wh`!Lq6Yb@SQlII!X@XKd@8eron;Vre9~w8Id}duG z{e=tss!&@2UP=m-Z$Hzc3Xie(AT>p%+sJ2taGE<^Jkm#-t;p(nLL>*~>j0VC1i3zq z2z)7G$Y{Tc6P{@tOFAjotUWiH?@jy2X(EmxCo_}>2c^B#zbf)^T@haDD*CX4?{gnX zOf|vgxL|6cxkJt$sO=+Fp_mB)f;k1B6M&2zmo6c10TI$NhI_A#K?nE?YcJ=Y94hW zZ83^7VL{|^?;K{ON^2fOSY+n%>xM$JN;lzzz?-&VvyKI3#c~*a6 z77qU%Y+xuA>(fPEJZ6RhhKQGY_dWpbSW#D(y7lRJhtqZ6gMh;h`UzbV;(u-7oVYLo z1J*h322DNEMsYVhl9ce&6I_L~c-TayIHD{N;Srt5dNQ!q(N)~c>9;c00+ckg!)6wh zu4Oa0Dk|z9_jWk5%WP&+;1^mn-AMaYlC31}YPIW$X4->MYaw{4(&drN!zuI12y^>C zZ69mghNgq*!r_3`{Cj&GJ4N0uczG6SYvd&~R)F>^q85&75`XTz09Xze25@+ zT|ijF{;H!Hgx9e+;Ee~bkLX+L5r#rUggecd7dvD>cHyTkUedfG4FLGHb9g3!d`=-lz6g(V7q?ErgG>m z^}cx>xM6u*v7@nDU*PKR_2?=mr;t^@>{rz#R8ms%1CIvkJrf`tZi^~#xu+yB^(O7X zv-u(^$6i7%eGiEM&tKRLbieQ&u)EIrtj=7}3Z9*qZ&Amd`G0(UbwE^G_qLP>2q+TL zDkZISr?hlQqjYyOq=M2RAl*51cZkyIFm%TZ4BZXi@qOQWulN40-+%MZnX}K{Yp=cb zdY)&k(nr|D{PNs1x9v|bF>OuA!Q{|~iQ=U(5S)BVon3o+S8`NW3JyZXdwX?-ykp;> z!LgL5dt}z7{#*u5Cd$G6RxO+TiE(DU;@kZCpEv}A;RlQ?w~ulHI=s7M^RWuhhL35r zoEoJUU$^G;yt`Lway6phEO3s$J((`ckDH^y!`k!0QVHK|y~foh#ONM}jQ_i|N2+X+ z{Lbcz;z8I-yI<1cMfu&fa(!F(pKh;-%sTmc=g743>;HA3=WzB{QSh!PR5pEeetv%C zMq<=Qx&h2l!~`Ke{@%TGF}k~;g+4?I$ubc37R22_pC0%ampEmbQ@PI@$EI;=<7)YA zw0vJ9zwwk@_8?czxvrJkATF*#wak%EPfbla-!bgdP772xAAPZ1=1DM|hd+{s5xBo;S4nw2Gh(7$z=&CBmnb(Lh3u&>qD z3t<|!H=LlYQ7LRxB)>FiG1KFNek+^n!@4V-C{%$i5rn2=m)sBblUyK0w1P zzUwk6Bgqfi?tk>k4 zDcroLGo%z+T)ip9V}7#NY{u&}sc?Jc_WddHMb_ftlg6_jGS%)&b|kgl`PMVq^K@efq@@6~MK1Krzk|~nP7&xag-;^%hdrG2qP=^U^}I+G{o)Gp zp@?;+Zkty4Mgms;Ht>0+UcHoy#8v-@a(2)U?VCh}liZ%`%cC`V_kb&K@uAaW-W-;% z=s-4$5vfVRe8pRBMU2&?Qj6HkdtsO8CRmIL$U3~y@d-jb;ZZFm)WN7U$pLD4HL&aV 
zS65w_HFF9vKIFkqFCnCQJfhlaw?@O({2@tLKImPaQgxB+O!<>E#c=bv+E<7~@BdZ~$ePz)q4Q7=0-XT^r|u3!!&6f9*h z2cq;W3~RbbX^6-#qdj`(F<-0yvRRCQ)t@1hhiP&1;`n36*{UwIRPRlHQDfxc1M?Ti zcv~@)wbYB8mqFn5y(J^d9=P3(C&1!8!nvInxk=U5kHJ&jX!S~4RQHoxA1)IVx0wk# zw3ATnC#~P2p?)lSyjQNIemJ(mS_sD`6^a6IH#sYx$yez<&3;2EymGig&iR&ylXkb# z-OI}sU%!pw6F}2iwOW((Z)ch;=l#KDi3;5jG(RZQy~3y8*JhI1HU{D0#h8ER23t_w zYPc@NoC5Wm^sD1{n`}01lLSoe%LGm5h0lz&sd5$1NJ3Z=+i#OtR(!T{6jHO*qT~aD zBZaQeIgi`JxeerE5^LR;92+k++j4PlF1Zo24OHkDp-|iys|oA8@;xOI;kWpwPv4Ih zPrtf7<451*X{T)?80zB_6)x7^edaIzXH32axfhoHBF5L2xL0^npv(s+R{v`ISh&D| zuW|l(&aaD1FU1n=@`GrK2aDN{xUY# zc$fbH;56{cJ$F0X-X!Yd&YyB@&pna@^ZIePO!I&@i8r*IhxH;04Iz{U?kRV%T`7`S z{R_N@!eV1mQo6LvNX<`&Knx%080}>3rX-wBUG+|cD zkYZHIq^E&y-a=u^8?oz4=;^uH!I9vVvIOt4TBqGsEtAIHj>F;Wa98UE@xtb5I5ke` z_2D3{#bF*f!$_@1zXGFuo9FRe0&{-v^)GyM!p^eiGX^&$5eAo+PGi0zb}qJy8A3<1 zkg^|}zMx2bpJRF#1);{%b=g}_I1S^y^Sx%c+FaYbhZfrEYKG#%j-?NuxhB~=aP3^dtR3$BR4<8qVZDn&`-KqKR<s+MAcK_rP;~sR8lXxchpo zOc7G33cS4AQWoB>_jDLSD^4`zbT<*-0`js5B{wf;bT{4m101X2V^}D zsR9h|d?J}rS<;hBzR4X~@c0tlxn?@ZDR1t?rfKX3p-!q+JGDM4ZUM=ogyyvA@Sf3{ z{J>vXkO1l#YhI6taut$t`_x~!ee-_<_n?-FB(_AG%1J#^51Nr3PiATHd^%FfT>b|} zvY1a4^`U4r;X{>OV~pofNaljzIa}@_>vN+E_=$=6fafvgq6_!v6x@j+Y2A}Y>~lv?<;=kJ|OMu3mM4eG6{)BB6_es+$qFA zlJOLjN+|HA+8*uty2cmWPt!hLYxBM@3hUmBfN_=miHY>=gqtJWnvVfYIZw7(QiVQH zd~wUi^p+uh!|g#QvI z^VzDibEY6OiA^zSV2SItDwjc3flY~Kl6I{R%&Kg_RyTc+%c!ojeX}4S5t;y!X4Xc9 ztcG?KI01)$VqbjhlbOdu=CNMhb??Yov+Sd250EiczJ3)O<9a#zMNxr#FL+=(K0YbW zM-}YrH|F9w6XF}nNV8+!JiQFK>;Iv29BN42>1NI%uTPJ~CN|*jfDG)ILc4;1W*k1{ z(9uiRuyD-E-cpaoUW(+hnGx>Zjr@%&lgl}TDQ2AnfD9$YxA(^8t9QlUAX$B{vw~u| z?ozBAW!23Kg?2e`P0hIpWd)IXUrmvcg|P>yDPV(nq@|@bI0Jc&6w1t#;^M?TnmzM} z%xBtm9|)fGP$8sv}y8eArO7VUB6*Mn(c?@&}FW6 z>QOM1)M9rh(DDuV^$??Ija@>SiVlJC3YxrR7@6z>=rXWzn3{dWbS18WhKhxmIdwUb zF8Y3E)HBA=zAPq?$8m8{$d_I*JmQVS&oL=OVAmQ=`Pc!s5(Hn$Ka$u=I!H}7a>*}! 
zs%sOnK_IMGZuc{vr&YJrc2tdu^F`~1A|u`7!1ga5U`lekWp7>{!zXBuSyZ#Pf51xe zqT(x-tl{ST<*pLgT4;YA>j>7SUoeta^hhs@zc3@A*zp3vc)M(ddJvK&T<|uT%~_RD zY#q{s&iX~^^{q~MScr!S1YlwOu+&!*4JoE&Fj>sFv1BvQda@J(W?2%M)0%E)jF8RK zUGNc9BgII=rudQWdR3ngbe~+h6;{;2mJSf`g^RedyKvNX&g)NaW;S}e4Kji$+=FZz zu07KND?ZNqp0Yz142D{IEj6cxnCk|!Gyxs%PTQ%Qb7#1w!w9q7aa(0A&p+@U09$F| zG_IcJm)1^xsnL938s5@+P-pqv%x)141y9v!ZU+b1+Ez95DJn_0h_tj--@rhgroN}@ zv6jY__Zn-Pswge*Q&Gll$H8H<12Qu&^G9X)7wNdI)f$w?`wq5n_wX%r2gB_|VF9t- z>q~a++Oaph)vV~lj!uVT0my~)#0*qh3%fUoX`ag_+Lh}%E?652M~OfrhrX-SU1{y@ z<0Aw*UQ_X0IX;Cp2&K5V&TLoF=g6PiPhg}}(E4)`a~1WUed=P!7W8t-&d3k}X`6Hi zo?m(#O{d=i?NeH_hujck5~qjkH1*FV6uB+0@nlkI2r^#F4QAZlz~ySD3sd4H2*(e~ zH9_H1_@sZHtN%sZD6r0s_Hmi{b=^Zs8+J7Dy%wdgl^0Y?J@;y;eB+JseX^;P(+T%c17v*oM9gz$^q~r=%L9qwW)D*nKHug_g^iL=#0=fAyGNtxQuBlO@SIPb3 z(I}Tc-497xJ5}U$(AQ3cZ-IHIm($TtaQ4rK0Qb6rSFWm6*vW_PNoL_R!&PcW;t9AX z4pt;B3;h2Btdq3*m5lY7#ttC@VE_$w;rY?Oqg)wpu_nF8CY9%UVe!3r3z}`hzr$np zUbn+KsigWtjVt#P2)wOuy4UO?1^=gBJK`OCce5upO=HPt%h{aQBmz`@x!7&h|PlUPypX#YA2kwe1rg!BEy z(PxrMjNwTeHhkIgGaSc8*J+c#AuF_tBYA0A6uwkZGpX7L<4WjB(taVDU$Y;ZG+hzw zUVs1Z%lzlaelj_-f+HO5<;`kjQslqd4KBY?HT(@Eh%}70#T7S9Yei|_mBcBPoH&{n zJL!B^M^*4-7FY8(0~5h`ykXtE39)9Iq*hg&*E8FqNB8~>E}NkMK&q5f@+D!iG$r9! 
zz8qki)GPa!|2jo?P(S4X_w4pNb2(E zWRI1%b1~IX6)t^5$rAMET>N2H{~4U$Q$xmVJR?Pw8Q=q}rk5j;e=`aMOZ$ohD7>ks z+1k>!*0jE)iE8KXFaG0WM+VTg_-ZA&Qb)%JB=#=;_PW0Yu_w~bvd^h5JIFgNP>;d# zUna;Y8YULjoLyM_KOY(~N8Pe~h`M8&7)A;e<=$w`Y2?f|@-V~yxn+n%41t=&M2qz9 zsQ(n9;X1cV6y2BkuFkzW5q_}dk!4Oy8PIVo&P z_!vG}3o&#Kx_~d&J@qNMyFBO2e#d-JB0_2kcVALwG07#fXJ?-rd!fQ>mNudTEsr_; zNs}S^tuIh(*LE302S<2Xe%$;c)$))1B|>J-`0}waYT;&z^_}1FKklH;;+2CX9vBOv z-o${2?-s#z4^L`AYWI)HUI0*Cby0=WkH~?sSA5l8j0CIcC3QZ~$%h>Oc+nqA>PL#~ zszW?H+LdT+VnQ~(%u4S$#A#=Hje=~k9JbjzF)l(hL!J8iJ z9?Ch=@da3MccXKxeKicJk8#e!Fqx+x2NfpGZfCajl4x^O-jGerlv3owDj; zS2IoWOxvIB+Znd7cGM<@a^dGHjW6cC zPO;pDj}X;OJ4u0C(n*gac0)kFar$GD?Y1j557eUXvvaJ$YxwRskJMDuS8ZX0;CmH#7*`jf{2xHG&hl@E{(>NmMJcCW_CV3sR{Dkg(- z8kZ{pS5AnROVQ*O17nP1J}#84ttr;sG`F+F!9Gy&+oDZW#BVHeG4Bpw9UY7;F`Rh; zwx+Z4-3!e%g9?gyu~QcKB%}U~N9(($r+w=qJ0-`{tlazorvI{pKV3$FM!P=Ds5t1! zVFuvP33Ox6k`t$3CEFplAn9oGEczGqxd=5Sdn^(*(iY0!Rrp5S@g5kxN&O2KceDzkLe41*`tydn$d*1n{O6X_5 z+0FkU)Ui3Ly|ZM<9(-+8#WoRkxJGPP!;fP(5uMd|v>IB(a^v5%r<^>P0JpWuB)O}C zMZ)8#@*z~VU)JLGUOl8b9QYWMNNjOnOuz{~ znGA;PT;P(H*=qJVb$Wn($HlN3v$4H()cu%guB#E};9Z+6E%pm#=k5n(%`(ak7;CUp zI^m{pF>N#r&*86Gczht30NTZ+NqHGa>uNNYCPkaI>(W&!wfvWzSdCf`hDu#xhm8E$3)w`SM+j%?q?Ha7ao{6xNJY zBrOHd*Ae5spoO4T`PYLrm6MXV8?PP(xws-KalQ`T(#HQxmnpcUQEUk($un|OP!a2U z?+!N07N^^N&;gLA_se*SDxrqHjh8`71c*67hi7$u3QkI883mIia?Y1u#hfBU>q^C_ z{>@yKe34zrJ7d%(XzE^Tkg6wId#aCWlcDcfW@*m4<4#>)rFfE!*HZN+SP}Etl9cFi zrT>VDEL4}LlW$AobKZeX4{W&aeJLE@uMqb#-^4ZWnlK1ps4#A)u0 zQo(54FtTe-r{=$@7Y^@o^c$|gCrbFSzj`22XhePjTywA(e*un9KY+?RA+2ZD7&Nqb zKRVl5`j?no|G`t_phWUYl&3YPs&tr+!Q@Zupi5oK3sl}AlXBb64zQ-HRH0|*EY$u^ zSRz*7R3|jnvhaYw+^V-5&zO-xK+yr+Ro$yZ?S$k-z&lTz3_Vhy*>;u~fwd|TwDy)$ zsPHXE)KVrJk&X)zN1+_d70()6no4I^eJJAjK_oeaky(9d3bRfi+2EBFo`MnuF5~uB zaV@>zs(5?J75{p&K0b#CzE`y`0-6bS#HbU!yHy?OIjEy9T(RJ9iwUX_M<$NL*(rJn zgQr^)Rm>kWUe2o5brT}X%NRoYj-AhxbmRPM1)Fe3IuFwvJWbAXrGLSZ`)>siV?_rj z=WDO&(Pu#9Yg2ez3`nz?r zEx}ezAGt(7gBa3d*cDXsUAw+w{V(`o3C@46Ln-& z+wewQMuxtn&5(QeR>M%YF8y}Efxr8i&Ad*z95x*kK4B}DF;hnTwNovXoi>V@mE*g49;Iv 
zx*>cXA+5C;8Gb|Qtj?bHE-nh#4nP<)ChLb{(m5O1iUmB=XGl$5ZIN_UXEuGlB ztZ#x2TlLJm!g(eWZ3tCVzI$S!2=(A{XX1BT=Y|JLJX3m}|BNr>tzJM&N2h*no z*X%51ER3{nm$**)ix?3_QGNs(b@dFJd~9vn<@DTAYTWhbFNVQ~)*xHy3lgN`cDeOb z`(rI;Vd`}J2Pj%1er*t5-l!=wO9pj49f(W8U#YP{`9+8#Aw<)FDU&!sel5Li$mv9j zO(t(D4|~RZe2dXyDPXOp)GONe>BTRo^UdW9ujgetC4{?QW3;4hIInL0D&t3EcY1TW zWJDLMO7cv)Xmi;q>J~NCzI$o_o726FIN~mTJTLHIg;8BG{rp)DY6}B@{K>_xWvQ

|X@G74UH8e5}6cg|RRLUp(mvp=VnAt6lxDM_J9h=v1 zoi6twx9Hz7cF`AF)xs+#W#^GiCFOh`!sP^d=Lwwxu_frb?(kx++;8sBKgxMOraHSz zI4)R$;@`z!|9wEGz~ENjc}ExN`;Ksqem2mb=_r`Ur=QpoxZ00D#7P23C0dnET^qu; zdJZ4`TA%aqd}L{L#$(n$V{kUXm&tKQOgID; zu<8jJ;n3lcV$BNductrHvOxtMoReO}AB_eZ{K!vd$~i=X7i#YAIB(iaK)yJ17JF{} zaSH@2n+NOLw^zz)`8Y@C`B>`T0r;YKcp)+foAg!m_E*88*z^g>a1vRob2Gl{s^r-= z#f3Ed$EyY=sp)NpA8Fue%RtViK&HJE#o7~xi}BIGY~gl=Pe$Ssvw_Plx;a7DOV6se zHka^eec`f~)+W;CE)qeZh=$o_PHVL^&2PkJ*}NIG&d`U`L!r;4AIMjJHH+%y}Xj;?fBYh7>QG_4b+ zh6)S$L@J@TuU~wzzySS(kfl(Q3{|7jo6QVa?G=1fk_120rqy6(4EMp-B?i8M!;Idj zVW2Vq52$8671Ie!vmbzZI(LX_78*a2Zr~L(U5gE02+7kp|AB!)U&JE7q-K-uc|~*F z{I(PQho3*lVh}fZ9TG`20<=lo@b_rTNrvl!5vSXN(y@7A**9dU&PNMvs1Ftn>T_dR^Td4Nlh-Hj{xa=-;9o??Ciwl3pG@i$nlQKt%(e zea@ESE_8%Y-R%Kwl2bpYrSC-;z500J-Jj8<03MJ~BF?098XRS?d$xp4o>NpDJ(ds` zUr-dq^$?$6uCp+4^Z4nKB%x%XNBC=F*6%}3r49OwaJ{nAm)&KXn|hmTB;6{iEk-jz z^{n73oOq7k#urlrU8S&R9KMv<#K0GQhAlqI>dvL|$B@YuLH8D-* z^*o;r&+LI?g{C4~IBxYLy`X116?ut%$C80r=M2*0FDaR9%P3miCkqfb8hwYJ^Vu^6 zzl$_=I5A)WRXRr|5wS2FW_Fsf{#|ql3@NoDvdN|QRB_ za`MfFHcNl4s7FPNw}kQq|8gPq4?zwk^2radcG|r_1KmtL^IWbV+E^LB$@IB7tll}R zctv1LDrj2ox%fZiR8=@ZT~o8djR%OC>lqE{_=7XtQKN>;!HBPYouq3exk9{4_pmA0 z>*OIrJ^!1Hn;>_MVysePG{D`=xUkNg)0Jtz^+m~kGh(vrstYuL`w@X`;-Is-h}HfIRQtW1%wTwt>cK)l+tslJ zp9@0*eD{s~23bAhs>ihPdcb`&m;Ci7$<^c>1?ia8efTIMW=j6NtBY9hk@}_UC*lP4(E+(vy|ttFea0P=Kx|*7)FEUi3Q!`&c6gMA9}l z5-Sc8>%G0CXtJ#3TRFl&BZt8|PVlqWJ2KiWM~@KGUN6qkA8Qf$i+13T*Jd}m+)N4Z zCB|l+n27CuKI{C9Z)W_q_j-cmHo)_URJKH%s<_nIxhG6cUcfR{I5Ab!p@P4}$n66D zILyh$Zms}nSNe`T%n)p;KKfjrDYG@e`YW5K5*>kg?02lY_IHXa*({Z$DD;!WW=`^y zGTO>W*cyX0vErT`RuyiK3{|gR2BtXFCJ^zh7qD5M>|fbLm;hAG>ewpyd}5obNlovp zeX<*!^2`XK^C5v`UYPB!*=cSS+IK`^jKzddCD7!@=zeUd<%BGEClHa!8ZIvSAV8H) z3VbaS7>75N0~mkQ0(j#iuB8>t3(4EFSENVs2N-Ek|GU^5o7IaQSsxhJcgwDIYNGGx z=%fY8!;bI0xNRwAn7ta&Dz)X3V*m3ZGGdJeyYa-i?z{3j-+LH1Hb&}+#H6@M`52LOD=rH!e#mcGJepDYY`zqs2;TdP-+eiu>i6Uvi0sQylng(R#j=L+y@x)J$1d% z!(Z;ME{{fhPh*;Zz0tbzp%gOwoe(&UQ~mV1!D^;D+odJ7E=>iGN%*1XtWMkfE;Q$| 
z@?qI73j<{bo{61S({WBIh&2O-i`j(T*4H#C>QJ)Kfi_xV8@Bh(a)K!Lmqg;cdE=9P zx!U)YwuSBc&GVh2mzdWlxFa%<(-KO)u#ajVC@D>_MF+=4kdjC=5!*4K|P z=Ab~`U}S(v8+DPym@GRZLzM+hXl40KWb(!9`UPj5nvdv1K>SbuF8?4B`+&#ept?_B zT-dPW%(cX`Y-i3WWquc+B#!c=NuTcUQ!0023*BsC<8%>=+U5tTlTlMsgOfT^Ugm28 z4E(;#=4N&UVjE9?(sMJi{{su%M{0`0qb?LZ$n91echLFb8{x>tdsrNE*`sS+q;@P& zI?%gl1um!)vsNF|bkWO&ho^BQ4V&XsNAsdWY9PnF)?>$7J7nm6l_O9lY-Xpj5Bq(& zoCFwI4LZh;3k~E9+>*1&e`1^(ri6n{x5PDP?Hz#*N81PL_+(_S0Q%OOC34S`uuQMr zw!oYH)6;c;{+Gdhdo1>WOFeZRZ%ziS#v~4iqR)whsiIo7ZURa87&@hBn%)F}Vuwx#R?^zpRwPO78%HsNd&xi(Ga%acush6az%Nikgp#Q4xcL2Pm^ zKhG0{WW^=!Xf#zf5J{WOsJnXEnaOLO-2a7)@gF1JQw3nss^A22CLvMbbP)IUg6~=W z&M=zq&G=v_%NhITMf)*-%vjdFh|}n&x!BCvgB8A$!6M=WJ;TFr>k>S~$f(B`XK}=C zb#Ik4;Y;_;Z6ft{a6!SBXV1hoTSDm+VMP*n)J~;E&pKsMJfJ@`nv)2A>PCNoG5ujI zhIQvJMIFAYOC|Ha7Wef{%*O_d$lM3{=~E(G8a+3dtJMrw^Ev3zUe_->@8{2=2j!P% zpAWqG*~pVFl$?rsE40-oQ&37|C^eBFQHpY~q$KcQX6$1HgrC3OpKkG&x9>f=Hwkj@ z;K3-0>wRwuOEF{Up-r`*?H2a;N~QeNuAPeEfO3z6#9xZy68l3i^j2w95+xwwAJ~?8 z29UN=N!AWiJuc%lE_c`)p%*Xn_c0n_{`pF6!>$2(Pon=oyuS|Lyd9ZKMw|W7L@~K> zHRs~)|F|KHJVCP-sy*8a8V@rp<(P6XKDxR$KBxHe2t%kID~8?iGDLQ1}qWRW;Nosw!NokN>;8JaCq@ROhJaJ)N=!JZw!9uJ%81xc^~= z8CqDAy+gYO#sk9|PqRz3Ad73iq*;F5af_{hvhp+BbmgS-;yK&&Y_Pc8Z`ObZbFweQ z#FeYpd`0oRjTbk0nk&#q>i(}+r$4Op=M)2k6jE01qL9IqoWUMT_Ntj;rn)?@Eqb!j zzbIgTn@7KFzqw#xAcE0YFnSlO_C>g>TD(^#W< zZgCq)8&zShdA$Er8(?`a^7if02lxAFK=SR^u>tVsn|1}Dlo8|FZE|c?Z&$Ze<&3e_ zz|{EXNY-+0il3m6A5i5F2kjq^JunYIK2gCbl2LZ5mZ44NvlW`5Jrn%S2v9Asbq#0h z9k82&*D3zF*6z>R&#hBxyCh7i3q7V+dwR@ctGRlut@^qTT%Z4K4*r6A3&4KzrUtQq z60Zto5euMeInzb2>+(F9b|Zk};_n~x-@ov01eD&OK@puqYhKXcfT0!FOe5y^E}cV{{~vOr)+HC=NUyHLo*f=&;|W!c`#DpaS;1Dnu_{B% z$^H-<%Ly2jPDDW>u}#;kbr-$Gh;iiF+CR_DKR)Fr5<;xGavzu&#$j_vqIH)>el@YY z+6x8Al)mqlGaon434Z&mCV+Aj$ux)QvQ0ai>YQuTu#w7n9gdFIMvo_#`Tvg1fq0g| zHcd@G_yt^26vkBuXZ@7RS;O3h_m4B_&*?@m7_uI7#zLB_5$2mC_9URuVZ?M9Ntv+A zv)I}aEqxi&RFA?q3(Zc4a78hce_bTtq-Qc@UGw|gXU;CXnN$h0I?W#Wc!KDq$BaFU&gBRUi-3-0RabwrR++$WN=keGyLANq?r_OH?%~bB{*T(LOpAS~jrLM` 
z=cqPBwtN1<@hJQ3zaz^T4H%QwR{-CtqcT3d06e%q#h!1`kI415IX~{&Zt>E6V?Ts&vALilYN)NVegCiGTVP1; z&kYkYvn5}$cb?B5b#7QMj^OJV07 z%j8z9yNC}ywaKsk#^b_sUb*REW!$rI(P%5UoxbVfvW;H)SK9Y)?gKa-2JZRDd>OHwIT@5``S7qhhFpwVA^0)_1w$a~pO_?<;I~lAk%t z_>!|O(;4mhmxKPVpM#R^JF1qbsm4L`s5bV23gU-IS$v2EjJuWrIS(-xU03U$Y><=a&z70z63QLs5aN#Vxn?L?;!&V0K9 zYx}jC3B@T@SMUT?0pBc@zT1IS7=hAw?-herR4?f2e?oJjA&mi(8R{YT*N6uJB06!F zgW1n+-0`jtYJ|LK9?MoT(A)%t7tHOuKH^yg7M|(!aQxHj=9Yt}Q z8$C-#_o_LA?YQP2{`bTmkw&xq;({xm$`=77FBCp-?^W+84!0@De4;-Bh?yGoW-cxS zBr5=iNh#Y;Pka4oriMbY{#m;V*r`W%c+;q~ZsxkcWV4)vQg`C+cbX5*w5K@^q``3T zql-xw9rfGI=|s;){b3;S(ON@-%+M*~A|j6t)uj~ll+MSrp;-@lnp3J0e&~LA!vO`n z6uMH=N5aohF>(ezq%%AQcUw-1+AGU6i!QG>a_KCN9#i?%0;-{y*S(pYcU3NNw#QgS z-WlOSkZwG;>U?u9;nyLhjTb6qCn(&%1VIHYA^>_lHFh8IQ*NpDM$LNe8-%aH|9`lb zn}Sb4>5wOnDavp><%Uw-37{KQggF`b*6 zzuw3UUGa4dGd*X>^q*~Y=|4E3PxSZwX4J*PU1 zo#FX7BW;*BD7<2t4fXVmW9fxHIh>0o*My_mzzf7^m7EqQwP8Ijltq)utT%6So>z@r z&d_u>n#q!s(AuPz0G|+Y)0tS*>RTp_x%w3RTsRyEC?B3*UaGsFm8MFzyNs=O5Fktk zTYq4-%B*z+N^Nkv2`bpN?27`>wfAlAx-b~gzLs53@Ue=4@uZ_dOa0o}T6G3=zVzZr zd)sus?AZcF?v-%&O)u{addsw1(Mevx`#)pAW2U_gy3h?UYy3!9f1aaAm{bWhZLD#U z&2)MAs#(JatoVt>o$tcJ61S&|>>L&cZ)D{LbJ)#>M`v!YzO-u9hS;`%xLS^3=yBtM zH>=kI2bs&mpO0Vr&4_38-YaNhmRlyeO>bL`MX~i3JMkW1znC;vcWP!gwQ=afJvpRunl{H;McBytJMnX}?sNGKxUdhF!eGNr~5>d5mx@rkUFek{d z%iPY|BG8b}0l_6iCO+Lr_kqvB(C)<6!;H;k?a63O!jUrEXEwRZzl}`0PB8kG z%H)dH981`4U_^n#r=a4?w^pjeqiw$I1;^1mJsCBP?*+<0H^@~F*u~zgMS|i^VcOHf zp&- zt}0O_^?6-AFUU(3sbMFi;E&WgmT&dmvdVr<^f+<|r7lckA=2Haw0Rc*j{$AjTQj^* z6)eK#pf(B{3s?@t1%B-@Q&cP{MCfV}OS{b#YYmo`r{B^9IU-hFM_8FDX_(?%FT*Yi zKBCyGnCg47OGvq8)h&P;`t*-CAc1@?5m~D__%5KC9{cH*%{$lUYjw{#Y2>e+xZkZR zHQodp!(2?d?KGb-q{rj*bdHHC%~56}CHbbq1__4tQWT#`}K~e@8M< ze^tyH{8ur%hWJx4dvLf^pe$~&og_a#9O8H@1oBcbjPi|h@j1)&47{lFAL0-~YAPeX zEO+1;_;MI#AjURJebF_=E2Oe(Z!Ni4RI2%w6D8|Q6-5Z9du!prH=12cj~~Jlr~9^% z%|6`ZO0jp6J|mFLYAiOxBLGw^5p_x%^%BCYiM_lSo*M5MB^98XT;FCyUb%C1z2yw4 zK~Y<@m`b9ePE-dJljdiB5k~;_&yeHAj~=UCbCmxbjH4=GrnSBFWPYS|NJ}4YB<7Cx 
z9pOz^oUvYA6TCM9Y0fde;!bJWjt}G#MxvfHdRp2sEE>*8$+<%2#S(z7@{mYo)8};-`K9V{u7XbQdo0W&D#6d17g5npk~PsH*{IXhJT$qQnpDZ#E4f~3 zdk7fgU8T#~5^2I0Q4LV)d#;e-y9KoeZ{fw2{j{eUCrivFPL))N)&?a!85cakHft?A z$?P#g2DJtzHGr!$uDK9B-G+LqaSS7nls>%Wk*zj(68LAaqXUo7xwM{<4kN^Taij58oQoX+W`# z&3HO{l$&c~0A5VI>N^t+NP%Tf}2~h<+8CR*3BWUb#KBIvT}g6%FW3IWq})|QD!YMJL^G#f{(?;8j%4w`o81v zjIYO|!G<-GDZmq#6e)1F18VTaXg~R&Noo2xrQp8%5S@re?)Jr5Sf-j{UuRT;*KNl# zowyIf+_=+@6df1CHp5O2Z6gtITRy)$xFvXpzQ#gl4m4XB`!+#E5@%hcL)3@ixpS}k ziWf7|^KLeWSJfmqBtzi1p)cmfV&uz)kt{L$aj9Mk*k;Y%vWgc|V`*_^Dt)3eIy?}0 zwq{^o6h6^DH8)ad@5F$mfMD{zJ{FWvS^?6f^2A zib@%&T@JMApUKH4Kkusc6W)COM(m<;;%J)AiuW2Xt!ndCN2|&kPLT!lg5bqCV=uwO z;NXoNFP}_8%=)Vkxv_L;Du0-8D1g7}&pGYPaYM>>8b77oLb9Dlt1(vlJF3R2Umx#6 z5f}7Ck&k)bptVxJpp`&2P@z7Hi%H5nsk58Ooi}=HR?^#ayRy4mu!1Xe!T1wMU*9mJ z%t}m<&sL>V9MwE*)0kL~B+~bZxw*Kg;<(IXwCKFBL1?Yeg=HlefI@bbd*3hI?y>gC z|0Kx$aj)c{Mmk5!MxBDqA7dS}_dkkkkw!yP+o0@9nv~j|F}bcU3?^e$xAr$AT~aX6 z;Ne>LMKih%R4~B@0Uyh|l{DMm@b^nYDOIei9f#sgM&J2d{2K9w+AuqG;F@O2*dxWC zcI}75oJGeex_NJ3C!!e@pDuh|#llbKMuiorYL|zfNiaB2gX{$)Hw^CK#MF9qUypi5 zA!IREmTG1@gU19&qEwMJwLUExe)fLA_}` zobyejW+mIaUow${*@ShRgSq->gyP^KD}G7Ec#PuWxzZ0DXUm7G9i;F;1BnhXqLYE} zjAVgIESLp&kUI~@h6Yk^q=wFEeQ>JS`qjxv_mxo>PBvO7Z}#Be0U$5eFsCPe2hdJD zU2bAFVx?PMcP6C{+uq31Deiek^==d?RLoe~b_ETRdM-3@QZLR`h$kS)ZszG`U>OK+_M)#{xP0s(yI$MtQwQ`E+2W5){HW zB*(oSgHi20B|BZkuYqj`9i|)-mvZ5K;H9>$zfnFJh)P2>BT=wa*GT@CWFsT27X)HpW+iF7B_7MY^ADhN3w59E-p8G8b^im zPooW=$>t&-B+RTH30Ok-{KUuzX-5tVdL>%P-bXv8RW*~bSFTRJ?8zkoI!U#x3Gww% zBNbf2!r%%@0JSfLR% z;OWytZnlaLGFTY6s6>cz?&&Kuq)2-}>OR3w8$f+V>tCx^LCnCzIY|a%=oHNb=(p?r zPz^$pD7=A=9z@8%2CbGgUX;$TxHjG119`X5`-8F2Kr%TBblP0hFT2d592nKTy6$Z7 znRa~}sIUPBne)tm)0WJ}M2mnsgSel`4Ug7+Q8ogaMfcCa=(n)EXac1-hAqjk*DwL3b`Be+1eQJ-yERI5loGxu{H$339K6{fFrN2;AJ4EWco z3QVMY2C8R-02Q)Dtu*ZNl(U12+Na-t%2t~j&EB`^cavRA%1_UJAd z(4(WN4!=Zl>E5is{WGn~Qr)gQ)P5Ux?xuE*w)`I%xPXT?ov>!_u!CjNbDyW@CMEhD zMU9PE*E{YMs7Wz?^1(^g-+Y_k0~m48@vvI?jpU6%H{QXHMAo+O+z$);ug*=Qq1-gd 
zOnz{k*ep0f=o%BVo-vIgaT+l(OaMMXWR2)MpS>hH(6d@3Ni);UN_!yD~@XEcgV`hUw;m(w4F0Dhevv? zqf>A(5Zkk*&yHEFY@F?*UhK@&AcP*#5&PG=rhsTAaP5@PuPS^O-qw&pdp&TZ^jK#; z*1kpTEoSm`Lni-9Xku%NpnuDW<$IFH@141M14Oh3UQ0 zOo^}I3kiwS7o{s+1)H01foe<p-U&B7wLi`2uKm>9i&6(2_X~_>4eY&fq+PFq1Qn8a_;Y*d-NXe_fMYiu(S8< zHSerh^S*1%92S&Lcd@ueU)ZPj)KWDZ-lCJb=fwx~HFDT_cYHW90J&>bPTDRH+*7FE zE+>|K3sm?ue7w{ue1maBc#?(w}Uh{MIpoZDJ*iWu-y^+;C(Dp^(VMCy0hYE9u9 zk6@+*X2IJks1A2nhAvL7VjQak%Yb-?f7N=4)Lt@*V1LL8Dw64KdNpdBKbKe;G^~yq zu)hq(`s8C^BSuT0s~4Aogb zyW~sU@jF~Xo^$QX=etjJkEgCP3LQ{?t3=5IJeD+X)1~D*oEXIV=nQvrfcT>trS-zX zO{*Uuqq3huq$td^VB>~QOtJU{fvmyCuSDPAg>ofgd(4oL{GOB}GO0>ipb?5ZR5iLM zG|a;~0>@D08#e_A9iPJm9Ji{Zik&IZ+4Wg##^hU}uP=$N#oJQl7H=Xv_GCXj6ZJT} z*Qjy4(0s6jO`bC-I`rSjpSWZkwGYOlB36HKA^*}UzOs=NOYXavSpre%9jJXq~x~MhR^Sq_y;9{9Dnt zk3f(!rIbthN3V6~W;7mFIaytS2)vCje4RupbrwIY12_=I5yB4_4Xp} z?c2eY`&3#6nvY4<`#D2ABKhZcrYSDfX^=Jo>u||t&B|l**F)92hg5!rZ91ynW0K$x z`5f@}u&Be6-EUp5_GjVKyOHzuW2CJUIv)%-9D6zDT2jLgMLj#OlXCIANi(vhmV`p#!XK?_*E4%Ml$4cuYL}7{s}}sb7SoVy^T$kA{Ux5D7P-!fvdUEQIgTU>c?xyV zjR2DUzr-88juBkU@{Ao6w*7-5*mC6W;*EdBxj(edrZ=56weUu#e_6IXNg}4#iwDFk z7hd)`DP0;%-u0cdSVh*gZ$><9y?)N}VFKy`T|%At$4{V5MSzJoPNgYs6PP;q*tVDCLU^U?eP~!*EU#eH6A8IYfCPtKzcU?oURkG+sdjoT5{d0K3K_H*)3L$r)^w;=zg&% zHU!&Z+-(iC_1#&3;Z-H|FSs{LI~qQ3)o2J7D=vPczIKg5e%Kkt(Vs;v@bcx$g$JmK z`R83O-syR`pJZa=!vf@`uH{p;FsgSH^5?TgdQGVnu$?)xFR@tcwltM%7m3p<#bU&? 
zJv52ymx<3J*u|#~t$89QUJtm5Sz+UawyHW)37zqNI&amLW(%zRqkJMAh&vuL=O<4@ z4%Ym4gPo7x3wLadinFq&ZZaC5(0d55L`hu-)&hvU_~ znKo&!!J%^cr}2PK#>e0CK0z4!)sSnPd%m)jz;g#1YK<1b(4!p@BI$(sll-F=W54Ia z#m`Y4mWg2hqO}crU+A%KMFT*uR++dEc_JsUwdzmkT;z|!?HO8hOlc;q;H=Ym81PBsih&q1W_#A?NX#y`qjeUMXj|=@Wie6c*$(+g}XUh5Aq zC`OCbcGuo64tUd!_TU|ieXoc6F<{{<4cKx(zl`jev*dI#e|(El7{{yBu zZ@oN>c*i@|t)k`j@maDkdX?aUH#yV5j9ylWKldPS{EVwbZ*tkqjNQee5xN&e^TAR2 zzT!&1-|_$MecdO3W6pV)_bo~QyAp7C4W5{wG6pm*Mkj~^z(Rd#$y4|K=y18rlCgEX zu=ayKo=Dy$UzwSKP=WFaOV@4J^q?(gxst~Kc1`aLAWpQuHu>L;&p08gWyk_(nHV3k z;&1D_ohRBdInbAS$J&P)wtqc82JOdWf_}llzYQ%jdXYTZ{8q9>*{C<|`VCtDcbU2+nJjPb4H)!02oVkw@Rl<= ze@b)D1n_Xs$-9keB&VTrs_*1lR6h6WE*Xz3eIpq8pWgVpU+C_-1jsalv&w-Vix+in z)iT(xh=Z9&SVm}x)0+9&mP<8ayn4!c{IZ0-!e0pb z^(SAc$*(+Lv29ICE_#01cN6P4>U9bBkH`OM=Fj)(|7=i{4sih>Ge!sgJ=adXzqi}3}2EgPz6m06G>!e0r#9r>Sc5=vZ zgm|<8a_3aW)%q=`pWsgDWow+JdpM>}yZTJPPl~KQjw^FjHupicLw)j!$qGNl?U~pa z9a>pWZOt}ah+-DQ^MR~UFrPolssd4x3O@-xe`x5}wyS#em*Mxf@ec3Nxq7@W3eTOh ztaMDWD5HsiNS{Tpo`$5e^i)5uF{=-lBtJ*b31SSQP%We59S;_?p>0Pu{)X-^-yWPH z3t$noy#TgnhM9ium=oNkJQL0J_rrhr)|*3jWfUxxjY|AN8W~}){OEr9=P#H@m!Vs{ z$Pyx-B_^00gu8MFf=Ru%_v&N<{D;h$09l5Gg@x-g+`SUN_vu&UO}g)9z=I@<>X!aF z>q`1Vr$u{;#aQ) z&OIbq{|#f!Vs7#gALw_p1_)fCB2CC0cCerMAh(x$fY!LgUWO+|l^*9={;;Z`T`jlkTHOFv51@wS=3jhoHd7|L?w^WM0<#DuU*YnMj%N1>=!| zlWCm4MMhn{tmDeCPcILn!3B&`WGctn zLGRluFSQX2c1jU(PhMDceqpM#*y8t3I{17Ai?+@vB?#Vk`P-1ce3RKAla)Vb%@Dj) zz0F`=$+Bg@E@=NLoK5n%OgapG0;fKkG*B}DB0o-bY`r?60;p|no*yOd~W%lpcTgT-_eJC|3HehdTBCw`)A zP4`Z%c(M6_*k{x=xkd;Xp@355bFg*xomNkyIrKqIavYR@Vd2=RWh01@65u zYT(_1H1$_Q-}^%+59bNfE0BiedWndqr*`Jm+Zl371Fc$?7R44;A=kVlY`bd$+3aJ% zMzG6vT%g;`fc-Q*>bUj@jB)S7(nKcrJFJXHFYn*_#n}I`Oq4N)ZF!-2N2hLA&l#S> zN288w;w|K(#X0$xJ89JYz1LWEPKDtDh@+Qj^qYRt$lRRe0H^q?X1=FVJs}f2xDT3k zpT`@k_IP$i$Aka#oL`5?n09QLj}SwwPick$iT=)sbyQyUlw026uC3f6^fl-VIMwQx>C2eQY%+ zlptfiPmDY(wb=H*zD^W)*ba)M{>^?eqEzIv89d?c?@K>9RQGB2Foq7eW*YURiE&~7 zNCet742ECVS|~AXO-0x7IDyuip@E^3{yy<3k8m?yn_nfnTF#I0w(qvnBV#0e_QP0W 
z2rC?KWDcm0_CHWNZ-Vo}+4OGaP+}X7v$r<4a}^Itf^*wj-GV5Vfe=WvTpQmWv-OzA zens83SJ*hhsXL0Ms>N&y9hF|4apa%%TJBdM-PPg<;r0tCYf*|}GW!r-MP%C>F6il@ z_1v1xo7vB-@0iCfWfzHzLyJP8i>t#k)x=%&WIA~L{)FcN>J?u*&;@%vF^8Sqe9Lg6 z%sp<*cc*09EC9Z;++g%@tOsgu18(4rm3`S70qyX#Q?h~!RU5=56<%Z#uF&U|+w^&r zJUINavmM)+vNt1U<2zAt{4PZb!^?u{iCfk|wQE6o(W6KM+vO;m7^TaHm9I(jVwj8v zyxk=^d3&w2onPo_Nw3Vc)YvqSQiaA$k-C6VFPj*Tjx+YIvC+o@p zbUcrOj-fpsehmYL_I7K|ZE6iZ96CF7^M0G#Y>dh&58c9_F-zZas1K;??f24i$4hg= zc4n25nDz~Q65>?YJ^`n9^Iuyq-iOSF{S=j1lN9rDr=VzLiiq&Xwjg$TY7)=Md>VZ zLMKzwX6#Otqnv^dsh!lt6OCyC##9vs4Z+Fv9m&Aix)Krf880(tgm1$8Cy57p?I+~WFXXV*#Y7A1 zd2qh9f2RhG`vDBZ@7BKSYem}0CKA|AtfgjpK31Lz?;q^vb6Az+EQuGxAsCHIU7A$| z?R(EHPe#)-&K~GQ*3OSO>hds*ix-4~rte%bCRi&kz5=pMTaVc%yH+xmP8|;1tYxi~QPl%)g zA^L(C^q_knap9r92kN#?(t%J{eio}j)w%ryU)Mg8` zGa#zI4FY)>eHwadYf#pM7h$9xM2pG^gI&*< zb$E6d$7c2uxe{~a#C<(o_U;w*W?E|1Mky4`Rbi@9khUWrtXmcg5x&O-+T5r^QJ>`Zp?(WmM=bz_|#c$CZLU|a2nCxj7 zu`}s@yD-Z4J9$!t(v{4wyouE+2jjsyi{H0~D0Ze3h@K-+FWEnjXKgKfWCw9JI-fA3 zTKg35QC+=jQWS>$c7AkrvaL$!7cu3}@ocQAI2OH=&Mo#v&WvU#O`s|l3#%|G1iK+onq?cyG0 z-=A4-l;%mau{>J_cZVaqoqZ*dh?Z31iW1jt!4~4YbZQLiS{1@dwJ*G;@nAwUzGe4g zDJoj!*kicV4;*-eUBe(u#!9WV$HdQ%8OS))Q(p9+4>LW%`=YNObg~aFrnr3EY!cU8 zBc5-RdaliPVX}+CdE`qXSPF>}GjsDNoQQHoO899Iw>_+^e?YZQh(?Y9HIenw-4R_o zY$?Alpr6S!-;Nv2$%I-5bv<}mzR&S;cNo(ywH68K5jN)DgB8{)LiLQNg^-c9yOis&*LuK?7&Wm7+f!ycuPuwb}*F@72MrWlJN>8IYee27ronujO#ek zYdOAU*EAU{WMVNi19XvT5WZ!oS`%{m`ZJ?*_?TLezu&E zwmqI*w=|%>Qu2L-rN<#M(%z$d4((^11LBxk_2^mP-wU3ZN3X`C$Ke8c^>9BDV9(O% zw9QhpgLtx)v^vmfIBq4^_7aP=5)cqDN-0it^%e;%b3DNB=i>1h9b z&*m$N45C&yl5(kvRf3r^XVT;8_3%n}+d@vUbhD^SRYR$l2o?ci(kTN=sB|ryT?SUZ8ObUN@lY%M7fZkerFUYzg=wkt={GC%49XO z^_E3A^y^{u$%AH#rUMt)jx2C&(&a&&!}y3yS|3LyKX7Gx8$s*e&hXpf=dLaX-7&5` zkbI*m92U6N^P4c=R>*>ndMtpw=&4vSb+h)>S(uU# zeJ3MGuL?dSznU~uo{@5*%TWaP_8F^(*cI`ZZ@&8@seUNok{QHrIRR-gjZG9=c^2$Y z|1?n#KKTH%gA%Ls>bQymir){8FpkRE;tcSjFB*1^yQTES8C|KCAJ$u190;4{CW-g*BOBq)@`Bi6@aXVQsrsQ==3hfJ4)qWj zj#}ox*;_%0kb3hw2Y;j;=s!oMbppF|^j<6msvxE!ye&2gm%ku3>bi21A9_T~Q>VOZ 
zXcO#DOW!K+HAH*Tb3e~`5&Yb~_YF21jgK#>z58v4Cl49U%c3df?)uz}Y90CN+ViLB z#U%x>%W&t)1*6B%huO zET|a>P>=VN1m`_ybE3}+@eKu-Dkf7pa>VDANO9s~k%8M-q5OM8!YnB{ z>N72+?Ecf&I%a1nCKX+tCYTCl<(3&Q03PfOp>Z7Ew8HK9b5lN!lLI#k(rZ0}x>G(S ziFd-$3;{ehO-5}&9#u;|0Rcxlm~X{uB+kfm)^= z^`uSq;> zo)4}TAS9l66{3pV&~y5g6trM`^F&t5y-2Yt#Wgh%A>P4(?8e(AT4#!gyCC4frD$Z#anWrE>NTGjd?g zKmV$|Syf!UH~y%wO|W%=P6NbYQc79Ns}3_oQ8B*udT|cozs0}9yi)p!QE01J=D*T*LO4L)2 zZQ$B79sZK&(E(?xomp?%`ClmtlYOchndG1^-$b!1Exvq@7sn<9uo3vlJHeBqH|NnQ zQ3*vOTcSfloaRd_2NmGej{;r>b7hhzN-=4DafzA3mGu*)1cp-lDlf{MN2nL5>ri33 z!^>!0dF_!7->M{apElkHLii944QG8Zt-ag*C~@ATyr+B@JJvRvr?xi48^3WFwr@jP z+<&PJnRJLZ7SwyIXEn?2cNA9)dT2c~e_Oq8Ctu zPdIT$4nWMJbc@uWm zNI-Dpq+pNl$|H>bnF{g7-A1+eleP3_E!C4LB^th0_OVcv!}a)FEClMaQ2%Nsnc3uM z-X9XB=Kq>30C>6iw-~^SikxA&qW5#`I%D*DeB6UavpuzTEjQ*5gK`(m-qvmO(>oJa z3go>H(lLE~Ty3=Yxar`DzM&_Wj>2V!0WA&s=^!76PnQ=R{9|yY+>vk4Y&S%+G|wqy z&sJG}z$I9X?#I}NrM8l8g>jgU5}l$+pEp#ak+A_J;AE~w+niF|s2CplBdJbH>s}yTFtH%;| zQ(5*0?#lkaV46gDk*7&#&j2IaN)GE6PP{kEzh-=0EaZv9sf}-e2 za+`p6)CySNoNXeshWReJ-OUGdo2-6eB($B#v8QsRVP z4{DaPcSP!sI+_ua7zT&^$i|1y8AUDxe^X7P+otNodExG&l2~TbAeeiQ--hX?~f3;|QW?fdVZn&XT=*K%Bf~LDP(>dusH3ApLhV}E<~V$T=Hg}L1A`vpLf-USi7mJ z^s*(>lC5h-b_WQa!fRuZ zIMdLcf!(#*n#cZu;9W3_qJE<8>;q%tN0Bo3$jC7&8q;jQa)dG_KPbpVSYR-gNH=cM z>A`Oy_OX5PBd9=~mK)16c1)71czi$1kS|5t80?KREQnq8X+E~$Cmb}OiaeKfn5&1B zO1hsUcT4+L_a!n%@x+5+MF;4&4X3Chqhzu#=j%fk#)W|Alh9WxRSLAknn9tW_O@c_ z8!6?}+#QD?e9t4DZdnTp%T|r)0~b?DBYn=6ZM1zCG<|qb3rSdK6j|!@_z3lvTDLq8 zbC6RxZWx>TFAeQq`o@pYi#2+w01R(8U1bSnrqFDWG$>iFgcMnZv#~(0>x(;A@#@xE zel=R@-rzi5GCm~G=;lGy$BG37tfD6Mk!`O?s=}8W;{-`IS!m=EV)_xyFpk}k9knkzdXXKHttg>f#4;HeZVe8za>!IM+9DU&ngSvVZur?`X{fY>I6o+jX>i1YtT z#SKx3x_&3A-GKVZwuvt5*4xvX`bOr2JkVO`j~81m6$iVw7;Id)n+0e6Rwk2wkK_LO z?Hww4&T{z&D66u?6v(1We<qw`bqssqI66)WFf zRNAb-MCU?nJw3&#Z6{JK=u-hrURGj$0^bilwrBE$=(WwhHZ+b7JhjKANs=zKjDi{- zbQ8WlwmJOmOZ9e1>Jw?0_dsOu&mB_Tw&H2BWQH_Z0J)fXm3a57r?VA^8*oSE0vKD* z&$-gLNhkj^2*%SIca}b+^&Bg0p;L88Panfv?oM20Txz*CZQBvY=@9o1$al5LO=`u0 zBY?>PfNltjDrm3Y=McCv?Q6 
z^)XP6@;0IZA+rsum)ZAD)y(KFlK18Jpe#44io=1RGdH_!s2jWX64)tD@eXIn5Bceo z6~2@9$S|I|0Vz-n9&=hkjs8wp{ky|69-TFG5xSJ8AXEbAHspbxJsml3-YC<-Tj+$o zF`%RSDCtghAyikn=;TR9d;lIQeh!Cy>&-tcDd>$1>`h4!9m8|G%Gb&1@c<0$Myb_X z$I~6bJ+eN$@^#HhFz}?j{s9u;0*b&qmHn?7=`UbZP>``pgjsa=#qp#Aj)$F*X+-;# zQ-=Xbb?4BW!iQEYS#NlapOy&pM&>aG^9Qau%csR5QT#e32I#Gr|BT^ZQSw!WN`6{Q zDVas8kO^`yXUQ|abuK`e<%DsQ(f&AmH59rAHpt{el3{={97bK4OSc3MWUcLu;vRx*>1NpU?e?91u zRRBj)LN=fquVV2!XZLZMvou?eG+JdY7={kE~hqq67*yqfn_I|h`LNoj+W*hAXxol%Z!r+V7!@&O!I!g7SAUo$}p z4n{(|Q@WDG`ARBRPPK;P5S9EuGuTur8EqTJy*Wm?Fv1}2c&D0XZwcL2%2M`z`V$7I0(Np5-=n(MULZLC#KOXDQ_obI_jR0C+D^Onz3P74=qz-5 z@1P#H#^!p>e<<}CN<+k|bwtX0X_-+Ij&99A&93TEGU|U=Qx(d|f&TbdbLX*rV&bbr ztCAs2D25fw6Q*O#?*i4ElM33#n>CBtG>oYCZ~5?3FFM@x^@K0g9nC%TB#F=X4VK}a zPYr=qqdN|I(#RR8hpYS37hSV6$9h1_CT3Twpjdz8B=6izd)PMd%Zaz^k(S;LEbH&>w^`d1DnPUwvTw!BsVV$_)a`7 z$fp`#ewHk0%=IiS&8J$-@MBfZVxV}#;P`Fs2t3BO%auAE10rabRxOiG>~j;3OSF`Q zB@VQ<^B#BiMe=*orp3UzaQHG3xK(tui7?Kz&GK>&g) z)w*gP3Oi4-hFL6qiI#-MI{)-}g%lDtj*E!cW;!Pbs%Ut|CQVu}F&dS|!1q$(2$S%a z3_Wb=dY*KVFucT_Qi$};SBg4aM~gaq?H;)eT0Mp}x4RfsKUohFr9ynx%2S#5i=+8{ z&2StI51;?T1IuKR$x4X23tg(o5or2mJU0;8EGUT%zPwEEqt05LY|zT>PT-u*(|rT* zt;w&>Y7#(Q5p#Ep)Xf0!7R>ML9U#`!BhQN&!s_Ku`&ZNri@P+KXHC`D6LznMRQ38w)S%yq`Ki#jCfKscRg4nE3cL9Cl(@b=y8MpVcLM_a~= zVt76Bm{n7IWk9bg*jr$0gjsk!E*%$~>Fk4P^)yQ~(dG(g9nX9_z!mvP!G*ils>D!C ztAVh^>U?Mr#!@TlVfn~vJO`#)&Wm)X2?eA`%H~>FAPueUjl$mJr2=X|egvMrp4 zNHlgnGK0zRPeHhdLCsw>VtTtCvY#7!%=|9Y_6KBqfqB&9i|jdAmTwwI40m|cdum99 zS}91zxkWhUivZO*Exh9KQqh`rso~os@5{o>Bl5HudaEns{^U=jaMo z;%$!&lB(2x5LROEk8C)px|eM}_xYiKt$gfDEptjS&<@(P5!P2M=Qqz42qj(!+HIb2 z!!%UlVb_?=OA0Ous{4xnnOph0yW)V$(5+nxURH2*Q%m27t*3Edty#`nrd9E_tF&|P z$(j}L>lYHO^FRt2=6lTRgZP$!J9PnmeshGNlFK=35Ha)q_YaJ0mwUr2E~Xl*&9rxL z#nZf=|F}gd`9Q#FoQl!f4ij|0cxCvY({JNv!0o9xNSj4-lAd!{_0*9g$Rq7Dqmg)$ zlWzm<-cbTAVWp@JG5?wRhhF_*7=$Ft%2l{F$&VJGeGG#gRg;BOqyvte2TT#gXKztE zA5?gbJ=AkYnU7|>`IZx;x0+%*GIWi7_x22H2N%;yIE!aDXZhfbexsKVenXc$y#{XX zrFhc^z-UvYdwKwUU$MR(;c+7FDb4|3<6*-H?-G>Gq!gbBH2IE5DfLHQs@=D39--B1 
zoqW;H9d5(~6|X05Gg^E9%y}#DBzS;i4qJ~dTAB#SolRM|!wWtv zole*LqrrNfam3@p@+YVjG&+8M2Ms|m`~85~9o_q9N^_Lb*PKFK>u+t`;7Kx#bJeLO zVeH{-jFxhf->80DQ14SRoLMYUuh9tTq}!}(G42P_*$o;RH31};YRig=_u zpUt`O%1(Ho5X5Kqb$Q`KSpO&vU?=p#c5ZOcz}`*Fr)p3X>#UJ7_@;NFdHy?@D8{;M zrH-;O?e)Q%aAv(b?2X=u^1@4d&D`!tnSN}}3j>BkiiEbK4@*D`0XF-%X9NG7_B$1TX7! z_hwA;q8N}Ik&R`>6C=KgQm+XMQiQY#_=RcPSr0$AY*K4eF zi#nnw&VT-j4MqkheR2dD-RRLHvAurMo8S3F_OLiO?`MTAu6BN7(Un6Yb~gr(koL7q zXAwIDr6V(y6w@S6#t)v7GOjc$$Jq|sZ3{Hkl+wr6$b4YE-xX8g=y;!FDn_qIe8a6I zA^fAO&3oLRiANNg4nr9>wZRTy|6Ip@Aty6@ta*!>r|is!*mdt8Fu{4iCngs{&-+AS z=@f6Cq3Z7X<1u5809XLtc5Y1iuM;huB+*dZ zYRo^Kk`Vg1H|+8=v!!z*}aGEC6L; zF=_{+3V@W1lpCY#g+(NZ%podyxaTU%oRu3G`T!)+6bOcK1X@Jt zd6Louaj=onKIR4HY?t)v#Uo=Y7JXc=0(Gulxs|Tx_UQ3u{>c6Eg8uxbq>HH{B|DnI zw5t-g^H`D$WjZh+ZAXuu_$!Y4SP|%DImN|2A48e~U@K9dlfwYZUudd#nx7JuybE>C zRmQWCSBDGo0NYZGtIYc92A2wk_egmHT1}UCA-slkluIkESZ3-at;JgmtW2vj%Sy47 z1NQyt$vQ%In@H&~e8uI`KE#22qaQxwvQp0i$8T9ue@2CuG|nDFj0HbeZ54=Ozln5& zs?~@suIVo9yhXw~5HW7fKimCm8n3D@q)n(A6AW>zVwddSSrT0>wljO@3A=>CwPiyH zi2YK9pWfF(6~Qd7r8Sl|4^l7IT~Y2= zcQYv_c=Kb7ELE9S8UEPs81Non6t_>{uXNL09^C?ny%QA910724;mVaSug2U5esOy6 zw4vehoZ!G1>_>X+)7nD2f}C6BhefCtR!lApBTil>sB|>Ex*P76WMLh4Qa*v>_WJ}zn$P52hcrM|LfzJabYt+0K^ZV3pCM;gDN;|c-`2dZaCxQ# zh3+hYx1P3?tR8Bwq8g++Eg{KO39ZlS_`6-j9ESOku3}b$RU^{7UTx z8Wh8sIa$^YLnKZr*E)l|QlClp;b%yBgtXtLD`Mvy zdwib%QDnapg446nA#a3uK=^dIrqdl#E(On2%yJ!WZF0YXs|_d>|#&S zp|Sm%VFtTrO2wfs$pd`$GbdFnM-ed}O5s3BK=3ZUH=DSF zo@}%k;9Nu*YY~9A2N(p(DEQoch*{L!B&A@?>!xxLEZulvB+eLfLP5#+s4npW65)U?v<2ng~IFB zvZb0~B`eshX2mwU_aLZqL&5Xb@ccNAhuk2j*e;*p;k=35$EN_G>kx5xOr?C*h*a16 zr7|0oi>3+&#uEsee*cBSx7>8f76#z`e8uYiv`=i`D+Wwkbqf_IF=}?bygC`E9s-BX z2q{-5v5E`nHucseFQVf!lgqcT4-TN5&6R>_ZSiTJd&gb2Cd#bX_2&d#8HH)Ey<11o z6a}wbx70x>P2`H&V$rb&a@FT-tJ$Fi)0=6f`OAaPuMc>%W)TDzX3?2Z#PA!GXJL>wHj&HaC~i-@hEHV(X6p2 zY#p%p^iOlGTSV;-Hc_i3I%&}i9(-E-D=#*VsabYcZ?n5-;YT{N<1#Io>=#?e?pF&Z z1^|j)xee&>Ja7CTn$;gvyo2WH2~Y2B$--A4338UyRJwe6Xb5`QFe%ouS(f`*B42iE zczhVme82WJ6u8X@&~5!R)xpO8la=2gYod@_?#rP;mFC5m{$cFP^({fdXr(`~R+3-~ 
zAjKy%EY&wN>@~=7aF_aPAo8XqP`Njua1sPEIE|jZYrwHybWS`(0y^G7mB78J?08bFnK7$gn3)^K&;k~fX(5`OMIf`AgW&mlxfjl zrKf2K;)x&QLx3(W8ao{GZzBdz>s1+!cp19o2;>?4TcrRURu5 zyU&W>0XKisJ%H-1uGgXxXlhg!6M=Xl`f9e*|A7 zmNvkQoD7sY@J-jWjAwkg?@gG?EMm^#fyHHLEIvh6fX>`R? z?%2!8a`1HND)H28!`H1;qY=wCY?dF)+rY7uk>>1A#+ zSVCI>5pmJqbI&)=*1Q(M|2K2}@DYGLx=V+|gu7yuv@wHmkVv_JemlxX&|hL>(G31ENPuUKc@i|>dV1|!(>>hGVf z6nwsw5dcgt$VGrZ;HF>ziU{w&4D=gjlo>->Z&xqntJ#t&Oerg2!+Dr>On-|Sg)R~2BLbsQ7H4?-?;kaThtx8k@7o*qlMXivl=*;Aulj} z0YtKga45H3>#g@cg<+v9VaNi^w{Te5op?kp($AixW#fa{;in1~KghG9Wt1Y9>=XJJ z3U3RJeR8R|BQV~%osfRIoOH-TDjoe0PEu$hfQ;^83-DAmUb#ZGC#;~Vk|XFI<0ryt z_I*zx{Z=oY{Yk3Qu&M~EgatKKyRu1(fDlG%{yP7oM-)jq-#a;_RW4~YkgY1P%lTex9T55y~X8&>|%j|_)>zn|4{305K5)7I>HDmE|DQwe3FVjm;}gt5WL(q^W? zPB=BRb!BgI>ed|>BGB-LEdS>Lx0AGg5~Y6*H&WqW*_%X)o{D+vN|(hJXkL7G_F(6i zh8}e!o`5>tIcSoTHAdx^JWA4ctNNTQ#^DFGhqBGz!V@(+ODHweeQpG%^?Y( zhmx0Na_aghNd*T)gkev;a)u#9k^T>0Itn-rL+CvU``1AEuMZg~vViZ+nb{=e+ug;0 zQ7L4>0}ZJGKSuvK4C;(-FrHrH+zz1o&iPQ}^`OA$w;+Y)oyUOJmUp~aCWS#0kS;AHZoAdL;nO%h$wG=Eu;%-fSWPAfLrTRZ zAT@MU|3b@+$kYc4_lpV-eH3|1KFP_oI0?aFUli@wE!t^0ZzCU%ujoJphxb=bV?-C_n4Mlek75@{bTN_ zE+XpJmR3mk*gS{sF z{%QM7$Z2zgj4Wkgk86y5geHr0xpZ$S$5yHsV_YVL-f$DV{ERW@%Xt1@Vb>kc_V)K% zMfJKy7e$ro(AKOyYP4$9>@s6-lGvk0xM-`bmQtcdQM)v;BZ-EVqQu@wjH)dSB{t#t z_CELi#+5#gzvA^u&i9dr_n92ZR4$B6t88uQr&=m9Nd` zev~Oczm=371d;9uc?Hwvb6gsJs&YmVx_rqzQNeSNyMsqUE!;U1G7?n&RUK5!E-?bM z$W@{OFljKRIwu{&*ULOl2Cr5?Lsk zl009)hO45=`S}wfpL(!-CpQW8%e3)$GveCn28z`-66*zDN6>njyh&yu2e zq3v@`*RXhWY({bbzg&%bwguuH_Rp zW`Wf zdXlo36!e-YgjuSF0wjSfn+4D=DWTMOc_(u0J%GrG;3@ACtPlvKmbBOM(wk>GY!Vh# zu~yic`UCT1l96kvs8E^NH6GXP7K!y%JGR2j*#KE7aTCMYU*ms!Tl=?^zAzuSRQ585Hs#&r~3^!2? 
zSeKG-9+3~<$hWLLz1vt3Sa}wex!vWSB8ikETvXl{@FZ6lv#&nAU~;PJtJbNnvwLg0 z*MG0twdP62_8NSPPFq`liejn1%4TNJ50OoyLI(QEd~~H-i82%KOUgIfC3`&x_4SXv zRDpgsZbwiBBz0$c2ARCQVEH`~va2Y5&L9d>kso)IH4-*;k8JhQ9l!Pp*$VX1Nd%4^ z>kx3!(B2yE+@Q3!zmHw}Z64Q!Sfvx-#|?^=D%JO}UKkRDwVP+Xk1$hUm2ypTbz zm|%CEPCbbauE-kEQ?jx0_ZO_&{p!CA%B;ISlg2j{87`+bTqTlUg9X|y{6X7FxiHjs z)MiCfi*Lyp@6KBCG}AIr0*e*2b8*qnfKhbHRq4n?hAHxi^Qf7%+4*B^GfEXO+K`vN zygVqr*@E^P*C6$5- z2AN{ZCbMK_9OTX;hg5Ip>&Va;fj+lz?byNuzQHy>%5Z6LJ|t)x-RE||`zADR)HH9g zZxTuAYLg~c#GX#F+U~%N=LCc!2KW#WO$%oFXRzTf`rNbH7pm;u0+`XuUZowsv2NCK z05#&t5vj;PdcuKTbr~z{`|B-+=H-HIa9@t@5-K;TG4eJ1smm%De1&#lKa44PlZk<~ z7-K2N7K_GotY}(hybgQBDtID1bV}5+Gx~NoYjT-gV%)Vd(ukziq3b5GuFi=F;vBvt z)bGXfKHIpY6@VHeDi;D;@O1hOimujah(4g z??c5B?%p}tVLZ*-N_C2m61gY*qV=A}uqA|Rg;+aNDI-(m^ggjztyWR50tME%1FBI9 z2EysM0ju9o0miy#a~u+eQtWNUC=z4SKYW!Mf+!T(n7tL^O&ZYSU&b3mPx@*2`n)by zxrV&1AOeQ#`L)@cJ4Dncw6uEgMHD* zR)>`_enB#-JyKXTSu7-}`&(>DZZuH)SR!qzDYmeoyB6rGQhc5J><`}znm)Izjs%o9 zQ3-}5I3;dUfz)B7eqLhHl^2`ta|3p1o2^&bE)vrr4%&lX4trRgmoA8f1F7-e&7t~} zcc>#SGh$%MmN8I6SK>k~G%piOWg?8;TlVS^ZcmWFA4yo_0qX+HZhstoMGR zHkkiKZO%qj%z!4BxY^OZYgNH+r(7$>a}ZE|`^9PQJEiYyY{Gw7RKITM017@H=c(@2 zhhYk1L?D*RB0rl~|Kq6iXQdJlL?R`H^((wBL(l#0190}-6bDK5BDy9-nkDEpFJx*R=1)~C6teArBL=F5Bu>|`O1QJrfNOY-OA+*zL^WAj8&qQ zmW?iGf|n6%p*c=gui%MOu6fznOj)I!-e*joAm~}S_QVP-eB8Nl&8GisKI)CKwB^wF z(41rpG!GexH96#N5}U3g)?wQ$%WGDlNP&nei|eE61*XQ zxo%dwNZCL4zqYQwF<#~Stq``TsGZj7DdIuq3HY#y0<(Ry{f@lVYT}rC(=j+<=Bj@j1K29QysfInvw# z&a@^!7OsH=DUKl42!$!ucBnjYtt?t;zj7wY=jMz;&3SEqYgv%sC*4=1-wVQ1w>bXJ@+4qV`frpi_2M~0F8uSeUYvsMfbNN{lLctJlq zEC3Wf^FBb^^p~a5y?g$JV*9t~k5THJFV2_O%N+PBO{kYv4M%Js;w!7VuyMV&&Sj|+hh3aC;z|pzplS0B?>ON7PB|gV?YmC|SxJuu5Y+ z5-0ry6!Asi7(vS}#dcBjYCTwcME4C!A+nw3?1XZri1pfKES^Rt|73jN@#>X16<{IV z+%ap4mFY#ht5#2yg26jOyqG=QJIIGTR$rjiM;6EpwB0>F@sNFIzH8s%z^E{ zg;1*<4s5nB6HIgZ#6O_8C>rAluN_pWepH;d6}F3Ild`ZR$>0 z;S-RdLaUlW4KLxzpy=-|KKsj-P3zP-d!O@)z@d}A4d0&Lw&;Xz+yKhk-pUJ;o zuE0;q_TeS0l`b8g=auhOYP7TerPK^dDTMyC7kgq>f&80{x%@Ef+=U`x3$M?9BOxUP^i`8I5F|IA;sNrYN 
zhEtS#$f;QJ%$Jlc_~Z}65F=K>Q5bdSEVWg*Lk1t0`Mc=_9Bm7env|^kRo>D1N$kFS zk@_bfgK-GV8`%Spds_v`#J`%p*Q+~n@r;@+C5ae!V^QQad8F92L4o#p4--5d<&8nI zCHZnCZuJNcJVksac$FDkj4k#TdBC2u3sQ5LP4`|O!RSou^UZKMoYI9AlU1|}6 zqfk3JHq;#Bs<^UfcSd7jy~y}e`#1_R0#Y(G!wpgk#TReaoQ`yR#4>oFN$TDq6OFR_ z#wK49g5EKmeEdzh%Vpm&^^$@5aU!JZ4u9D@0>pzneXXKNSJ`l}NxjkqR~n#8F|t`N zSWl+cH*}Cu_auvbaDfxje9e48v45q=c=qa`MF!PPh3DnkZ!nK{t36wI`uU%{-x#q& z1^83GDNu2L{W?)sCl@_pKK7X5rir_<>3r?|@L%)++>a3Cb*)d$uZr(BxpH=$?&)9f zm_O^+&p3|*2e_Wyx_SLRm)Y{`7aaZ|r@3aiLsbHlSp;of8(^wfN&ZEu_C!}Gs`REP z*0~pH{_qNnT;c-YTjqljeVOpA^1DhzQbRyFa=n@sQLEOiBDYUbKi|JAdV)Nz6=-vX zf>erG+SU}nmIxDZYsOiQt$eO~+;4SGZZJ*a(a56vde81xt964fY3*Pw(KLg);VdKCmGy~ZZRq_kiJvgT8HR2h%?IKs7-wZAbCQ?h!rA&G zbf=-q`D803fd>*$FGYCVGEk@X4{FE1PS0ECI1R)_-Q#7Go8Td2>EYo9pFX~x!R>Qj zaC9YFsLmOer`bOI5EAe`$KxHsCQasfQ(njR#M_WJne-J2^p^~shH}}&fZRd$9&)Uy zkH%5UV{4f&x#%V@IlC2!GVk;6R87uj!{6s0MtPH@)TX29Wrhw!L~yd$TP;ZzI(N%$ zqZ5YM(Vje7c3HUR{vG^J#hj4H@~ylO z=>~8j)}GBAm$QuhB~+mRM1**B#BXRiN+W3n)w?8lzEQPa`HRmG&!V3#Y7HoH~K_&~6>zHr+*1A>D?SKM7-BJC{G|EhAKi?OX$p0jCEl zR#Cr*KPR3vi2%8M9T}lDHGnnNpcC#3dOWE2a6*obA`UY4Fh#m%#iQ5_94q>@kDEih zK_{-STji$>h_6rlA5g`U`O{GGtm%Y)KRZTfqZCmTJ&ke-v;6H>NQD^Z3;$*-_4dR4 z2>?q^4q~{PAhhpQ_O}ZPo=j~V$P3H|xU(r6&*hYD2AUTr`wlRF+F({V!<~3Dg_|;E zTZQet(~lD_qNc)z)l?VgLU)2Eo@|}u5b7>uMd!JoFqWuF{p#{-)4~V$XRiS7KFH}6 zA&0m1EsnK_3g&dKynq^imoWpwNufPT4CtyI=xV$E{zidIuOd;XrLF3w?|2{N9g61t z7okvetN)l0eFIMfn3P5QQswZ5dL8Zg2(y=MJ;cWiC`K6%x(^Zr0l34%0*LpWjNU+;)JyU|ca+>r0Q$RZNfXS#=v3{yt@!*2EwJ**uz)xpn*Q$J`yt`tIic#m}$UZiQdR`^@bvnw32c+`kVe+dKXrpDE$^9?c#)B;hRr2V5gOE9+<-1PC zNurizhWYL0f+Xv$JehRGIi%Ay(@ylnPu?^20J`!2CsFpX{u%__^RA*ZOki|Eq>B#epnfb^D}b`xkW2G<5;J?7u+~P zH-*eP!)IeIHLb@}nRqusYUCEbI9h`RgVl57LUwA>ms^ zHXkS6(Qjt8ju2dEip`h8SsJ|fu&GGURYVRX-IV{ZuKIK(p=F?-aJk0+ZIrL@d~Vfd zByeXz>(J6G$K7RFfXzPp13yX>1Mpk6y3x+-scol0qWS3tT-_5aq3Z1}p@e}B48l`} zf__QX^Xu;_n&%A+X_)Aa7iB1)R1=86m`-Zsb4kHFflAEn5IYQu`yP{Sy{J z;ok(0`5cru+N!tPp%c_{_QZwrXN45Fq7Oz4K4gNFmc1|+0GGtQF)&{5YlAFy=y!@r 
zSci)Rfs4qq^G0D?bA!eWYpx;%20RWUn`9>(tQlETAJLp~8k=gkqrdV696rZHxH~|K z-MBRiN06xZ$73)Pc5AcoZ+#)7<+9gkJ0)9` zOkC{a9M2CXyBV z1D>qwCL==0->0kO0eI`ih;`WG^A|HajET3btJa3=}3m=O=_Ey$Vb z_k!VC=4l~pVgDue{~hhMVHV39L-~9;PIJBub3;;TZB+%#?o>F`TMa#wH@+48s(*W(cjrI#`1kbQsADuh&y3j?x=ol%T0{DCF&S}Z z+E;@KqaT8>#-Hy9)a{ulEXS!uuP}i_dh>s@#gshs6Pha&L119baNyQorF`yNjW2!7 zLS$QQ=t#DgwGu>8cyRgRBew6|@UZ-1>*=<#Et+Lr8QP33{m0_4_TnRs-R)pS_k5Lm z%!FFs&UqSXzBgZX>_hDkXa>E?1}^C=HvG2%{xxxR+q)6juX#3>%i%{%IuG!f+E{*& znxi(Rqo*|0L%DYasHe=nK=C={dj#rWWj2 zE?XGm9n&f^BAs7Z3#+BPAbmFGsNLR;<~vO2@fEiz%t)4=zi8#Tpf;DfD!gCex4B&d zdt7psJanD$WvCV9F>=yUGkQ%VdAgxKkJT#*lE3~Xxn(*CsavsW!q>M z9pVz|{o`Hc!xhg|^2&_wlu!uSt+aL}-FnaMjQj6&hLmL_BC`1cp3PZ1HJQFg7VLZ< zIy~Mj%>F{?PIakFuF$!p4Ib#69zKzaEr?~(m> zDs$oz+}VeBNL>rGXQF5@dI)g~!>K0lMtRM;p*h-X>RY2^_=5iHy)AvT_e5}x_aM^x zi1yewU*dfDK-S7=H8N*w zJ%2swZJX%UjXGoasCsYJeO9#5!9d$mni4SN z*UUdvx07x+B9WoSJ*vFZDfJ49BjYEcRq!x6R{DyiQu%FfCy0EpbV|Q0y>kH1l2%Ha z6t3`Itg0B-Rts1BK3Wr=;E$M1lcG4sojFpmc*goD8nn2zHDP+q@DZK4erBMgWVgDI zDM==eM|yB49=BpV+(s=+lHZA0t!^8`#=nI=pc&P1jKv7YtA@>D$}$s7d{_6ZYb6Cn zpQU&MYidsGZz=4@E>nHRfS(zVeAO<{F}4;aqMlnE>OmfbFdtHB{Hfy8t$UmzR;%UM znys*YPsK3R;A_sgFn@dTogXRO^`jwmAlPn7G~$kR9qN>78cgkU$c&d6HmHJ>vpa0r zd3-V$2O(x`U#4S$`WL%mFFQ{?nSa>*Ngs9M{Z;@udAc)N-kPQ{90vuXXCe+aAXT>O z;ZWsobxcQVEh? zI^W56b(l-KrwG@@Y(YD}KTN4-#jD2)LEM_F+>23RCZMDd)DrncqMy+IuS#lV;rh&r zP^Kd5W-$cl2yVb9fTUFtx74k_e?7pkHD83Eumi!~9=d7SErZu8kx%G(`uBSO{fmz* z5vN>>Z$J?yThA)fI=Bw6gd69`2g4r5U!=S8XL%hyNoA(roCQW6>>OJ{D!kDS2ECu`_h zgBI=scwLf$rbGiGjexQvANQW4BW_!0gjAdue{`3-zcdk>0>-Jd$pyop+=29*;{0t< z)9Bdz6R7S(%`(v-9BhRIzOaAwl6SMrS`U_2nv)8_`o8FXk~XeJj?P>i7{3+djOc(0 zK+_*V;#5YAwS>HGNEin_-@nZGEamkxxV;O`NXOUNLY$iW|9so<+IPXmj-tAIZMH1? 
zW642=w`d+Jp|#|?L#5bZl1FvV-XS_OB%i!wA#H31Rb1#EV8a+YJ+B=RUtJaAHTt&| z{%x08!#QGCE58ZnZeG#wq`4=}p(4nexFk9rmVffnx9Xe`5dmRhrz5#~h17#fpF`at zQbR7$6S~Ol&k|ZSrmYUeJp(52hJ_iG5s%fion2iz^943b%F^CmZSKVZuEp6NiVg@H z6OP$IGR=dsHn;hljvL=T^0|R?>x76j!KaDXB(cp9c69$P57;*95Fm590x#G*&5B)0 mq3?r_eC60L)d+u<9f6?H@We}ezik`<{%&a+-bCK8jr7h%|Q;+ED>*xGIK*Mk?z04E0QqjP;D` z_4LH1YH4l1N8N_}(g#+c(A#W@7)kQ8j&Y=-L;^9*%2L*9|GD+m=hmtVI`z8(n5v|O zJxlB>#~QYjM9_i%h)0xEq>~hAaPUw;%dgKL3LRw;&h z+m}M2W#BJ#+k-uJcDkPp6v)=WBe>=Od#?tS-(G5i90jfBzglN(?Ya8#^Y`!W z=HI`6yW&DZ-ZH!khTH;cZ`Rij_`yOBVyud>nuLjr3>Yow9u^Ed#2gF?bO#Rl;emc& zV36^_U@)L7I_M{y3-M1YM1L;iKlk8(w~n8cgd`+DS0zISV`CdfGg~K@kULq>P>beY z)ST30q(2(kS~Kb!*%}x#0k4#NQo z_JHQ#XW?ey{nP*dX!-Aq|AJKeFC-@u=f5HU)$+d~l^u;8glw%rQ#$egcV+$t{&(Zw zfV>~xDE}8H{$lf=w;(?A!}EUl&zkYWW9>B*fPo2sNeF-b0t7$nG>tV{aU%rIc_?kV zDrbDJdaxmIH(^4Cf)x@VYc-OG4@akAP)&Saj&u@}jITk<(XbxXBYFn^*`DSdmIfLo zE-Jv6ML&|q_r|?{ zqeTrN8GMiTKT$GMgGxl?JS$s@ru!cxfDpC(58nS5(f`ZQ|KBp&vr@zNw=~W4z!oa> z4X5_w4%k7%Nq-VQhQP zHxYiw_eyH|l2HU2^ggiXS86XOt?2|*hmZMM=ONpmWbo;U#X=M9+qSpEfqD()Cw;9R zvwv>0hIzfl^C@z}fpw|}DH>jVDDRlL@~!I3p-;`{y&vXr1~nukgO|qA7Q8{)N1&hV zN|P&mJ_-8Sa*aPpnC(nCmrxaopAbs;<8`-6v_VKE?(cWM__G4|cM7O+z5Bo{@Bb(z zUK%KeyLZp{NhrstlR>}N9x~lSfQkZ^op1PHK9 znl_bNdsDA(ybU~f*D${5{~Na*4;}Kn0YqU1GSsVU=D_F}@9U>EI%i8#6_$>VRTS1i zf7C2WWKXrSw*fs%w@gqC<)1~&=78vq+Iw#-{^L_a=Uqd_jrg!C%d>cmIq6@`$Pg^* zj0!ZVs)=$X(C%OUytEP9=A?r`Yij-OpqOP5Z2KvvP2|t2Du?;D$YE5Yv8c-?{F$&E z4jebd$TQyl>t=lXjysRxst#xbGjS+DZtx=-_78IwHU6MATw4}bsTKSKq z*uGTV?7wYtK`NcCQT27vLMugq0x%8tjv>AloFDu&W6~^3-f81jUX_0v`wP@7YgHQR zIG%cyLN(ZVGnxD+*b%NI5i}N?Z&HdZS|3a$*UH}b7turfWon1hdJVtn_o_VzwDbd% zQAfOexy~D^+)Asmf1&D~Qcd8R3;iL#lk``0mI?@^pF#pdbhyn%&tF3djarDN;+qse z)&(G*umK?QTh{`qg)G@Pjea-I zgCs=08+N3HVo#ONEF=kW>C(BF0kj4D*rR)X(Dpxaul3+v7P-iQCRn6}VDTR1YFbT* z-}&Yp+~5U*or#4F&6$!YyrQ_{_-DA^up^HYd&?F-cq)uqUaQ&9{oVxQGt?^{25@Du z{+&m$wjyW(Q9*$rX548NM@5#7^y%m(nZIa2p-liHY5zk)1rn4ILBh2aI~7HL`O}%% z8fb5Vbc=XF;H_Fu`LWr^t7QpD-RPZ@knE*KnzB??E*2UA8?$k(Q{ITBhafgY7Rn4e 
z_pQ`to_Vv&J*xtLrZv*itdtnNvfyJ~Wqjka8q^d?FsY_e|AX!y@4*rQ9C!Nv>{KP@w|}=$o$vga@^I8_n2$>r&HK zGCqwe(m3E~gj>p;u?4mM$8l0Y136N}4@R-)D%6lE-8W^+`RQPGsDCuU1y zpj)GO=a*F60l$TwSH z<~4F*SA?-$*QN6H4O^u1UYwN`2VH4tpU)m{X8(tW8ov|L^-jEimSW!uSxeUBda(#j zDlRBVRIu7sbcs4_l?C2eUTQuGI{R4#uY!Nvc0Ui~hHJ+_NeQ z;yGltZY>m!L>_^WqRaQdh-mmKdd8g%O9Zu2AhUz_J06&HG&qkqlS33S)9Y)`Y!^eG z@CUfFGsJv}i2-__fKUZ5D;2BcjpefhB=$NW`4J?Cqq0%?mkJkzlZJS4iNdW#>?bm3>*1io)y3Q`5ti`r`>h;8l$r_L$moD)Q>ZHtRt zPw&UtiY#htBOrb$l-4A6Kb0Hz)KKa6J^D)IedYx>a3vO?;@@o*cC^S*Vga#)^^z9#K0 zMmIY=>4xHD!ErN z0Gx7bFjr3vGizUy>;KTVCk?Sgz&Os5&D%$rr6mX)myk3aXKlWp%MR9a*DSzF{7viB zja^`TCx)}v@{u`Y#Rwg`dCe?L3`a>#+;j%#p{C-XQ|&tw!SsR&+Bly5Ua^?CqJ}$J z&R6MP`s-7?};<#}q2bmsY7D!Y#fMY0ytf6`{ZGUNLj^qp>}`G)$>dP>L2z zL^i93rdZa}CbUfyd<`_#c%!E-v2#QS|dgb&3yk!E4BRy3;b)%-zwb3t!v;*n3 z|8NN;^MN1%Y0tDKEp@T*d)Zf-Zg`q`Z6$oHpI^G_UErcTN88^ZHH60EwB(RYCeLEU zRsjD(t+t}Wvab}X)q7FIo{DjKWBe#7bw|H5jMHQ@P+v*P7FPxzmZ&J`9@K(o!=Tw> z)gd!eUAt4*7fqoY)<)_sc$F&GRfuiEZdBJNKq4qPs75=W6IzYx1&3}B%J@5(wjvgw zGbV}t0rjR@dNlZpqE)e!;o6l;5xP_K z8_hzaLOoXu{F6wLq60HQ1&UhuhcwJ3RDJUZk?o>yH*wh4%mKsYGP(6*cz4h+Cd_Ja zuo*CTWnbbk8!f86@ds%kRn06(Uk2908|KbP!zIIRn4T9I`x>;mFaG5#F#cW z>{OOZ5VSXm%b|(y@1big?TF5-JJ0K;LafNt4 z)M?cDxlWgcv~~Y|wkI583M5gTP+APtfINM1(0BFtHY9bUU+Y6u&|Y{6?~^6Wf`p~U zZg|F7becXlaf{QJKj>O+>^qRi72~HINP_dntHQzW?oWJ@wH6vWFR0#!Pe<}&gTA;D zLpD=as|WOPG?faU+GYEgnm2W3$H4^Rarlp`(dpF*XF>l?9=2-V%dATqYWP(ZEris zDd7jVrPnYY=LJG?b-y1@8a4McUNeD`;IeGE4~}0Vd#Ph2B9F(ydHp`4U`4p@dtnAk zoWmW`&CBVB>q-@F4D))`{er~bUI|@|Y|V+ih2Gem>|-2nw3|e}mqTgD$J`wlwgSEH z=*AZGygWA>2zEpoJt19?*3vAj!L1(mpikO=7KeI3z;Ije1Rotmzq@eF`gri6)89Mi zpv%V3Stng?drTUAMavI`n4`iH&QMMl<6~GiFF|U!G05GXbTNFi2vZ~7KT^`PUkqdZ zie1EBuO16%yK3Ax-M(tEj}-n%9&Zz32!2rnl}UuJ_^ie6 ziEK*F@HqC0wEW(#_ndmh;H;;$mHzeck$s}1z7(qSDnTzsmtYu<=;*1kb3shZo-ca< zCPlQ?_SVEIo|-&SCQ-6*Hjy)QRD;?m2!NR=9O}`kARGQ`^ojr?YhyuxQy-(?`n*}h z&IwtZ@)qc_hh1@bQ^PxmR3Qb09IQ8=pRv8)cN#|UdUi!2ec%WS1Ht>Q;Md@xos|h66f?iOSqPeBgE|38aN3rJ*uLIc5m`nleOd+4{C$A_RD$YZETg)MBwlcB&(bS# 
z6g`Q|_{FP?_A)&Y)pD^!at>aGpda>8T?jZIyqokOM(Dfe7RWkLo7oA zHKaAP>;$^ zQ3TBR(G+%fW7MyAkH*g%d>J(~+iZSv15EUQtvc@N4r-@=MZ*_a-n@#B?ShMEiCD`_ z0ulRDRRmL1eD6)6S_9`JFe1z&hp<~1LN>I)&&-fwG3Uvo$1lWEitU^L;I2%8gC~eu zMdnb=uKFJJG1;hQFNFw=1ql)alxSUS38oDnQ6#c47{$=6F;8tQuk|&>V#Br3rKcFA znAbN9IAA`VUKYr0A|c+MQR%NOqYqfs6MtcoXO~u#Fu~2;2AcZN>1?cr5yel@OPIE? z`%{SiZ1FTEBHv)J`{~d15s$MjM)ilyApXla>@AmU{C%&9tQ~^eb!UO9ePpl5gN<%T z(;|nb(<`mcRt?Svq)qImPqrhEaS~Ou}m2KcCu+GI|*e!@civO8}x^)*YgZ0m0S8tLWbZ$*nqSt{3 z2;Zo*@%}36d;$8y7f{~6+_HuWLjavs zaiViF`rItS)*zAZ0zKlk2U;#O2}`h82hCiN-Ia2w;N!_BnZZVG0RnZSzT|;XamnmEr=Z2ZMn;hH3C_d4B&XU+Rld*aDC2mG;mZ!`n z;4c|s78!ap(8a*V<>vP8wv`}X{Zp|a^mm(#>$EPFhWOvtpGGZLq`lT=i3SF^G0=vXmygcb;SNw8pb~Mzd{A%eLzDY?vqK^XWV|u?z zh*HFP#A_)clZx6`MS$hG)Jh~nj2`ix5~azJo>8T3%=nwgQqzYzCqoB*v8BN6lJ&Y9 zcx0=bryocUcR>*{T(b*=dfFa5g>88T8V|oaiZO}cOb*=H4(MI(gD}A% zh*TRM^;p2CGEZ{}Q_fgCO6W&e9uEnoW;j!M`8G73$M)ajv;Zg-y+?&Ky8?4k2o;s< z^G;SCvMDa^9S1k6jpJ_h)a1?$W~bPUkf2``{2>dnhD5|MvZ=g8iFgM=XoIlXiWAw0 zghTxS)SlfwqB>!OsT~jyOd~5+ui!7g6Ua(hr`^EV&Ac3zV$Nqn0gmlpQDW66VQG(` zeo&6jVVXqfbpHvn#RkVHLL`KZXUScA^gMc_sp?vf02n8P0cik#t;-r2S|ARXZdXADY;Y zHcBn65xqcRemUY^aw%Lb@uL|Dx(juS=(j(W~N9=2?2fBpy$axNL48;{AFQ9as zR^kZ%ywiH7WXcS7QBpbvjUd*R%=C5t>b{YT4eAom`&!h&J)MGrD!{`}vo~*Di6zn!NeKG1xQ)@q6~R z(!+-R|6?UUF&^N-Nfo*#Q&^_!Tmad9PJd4%fV>>V0y+n6(a%0Zqy5elmC3q$u4Xy>s1B$@#HM z9M~^{V?F+e^rXl^7vg>_*Ct`}q$z_;sH+7#2(hW??U`Em*OjeRnQ08NuOw*{aSt;- z)d3Av!U(0Pa#WT#@~qprTA%%uQmdF@#p5F5ubEIyhzG(oi|-qx(XAPwSd2KsxCl*i z1F(-bGMaZ@l$4fN(FS8=waF$Q^gdBNPRhpYRXQ$Ruj_GoJ6%z`*lffgi|#m%D-2YO z0~*!=HY~%3;b->sGA?7HMk}84^}>E=?3sW?orC{$ovyn0>vj$LD9Re1Gb1kHYo>SrCN_>X68NT(GU%EZaJe^ zV&+!}TYb36p!W~X!B;;Yu+OZDl#?eeRFNx=BAe;BT#N}cOO;;6vWXBy&1ws&wkIcv zp2rY0uLCeL&)C_aX(JT$_S?z|A#JnMhLfZmfnJ6fU2R!`(avGY4I}Y)J)%fIel#OVFj zgV`)Ew2NchfY2rsUwHNUxqtxg{92>pEj!qafQ9u%z>LS#rM%_Il*!^CsZ; zfkT7O?dF|HG|u9p!QpcegamYsU0pImRp<&XfdG?eW5gIgcR`3!`{cR!Mzua?4xT6^L;#;JCNEJmx+{v} zV@86C3|tkdk=#hJCjbtvaW*Prh6k<#Z3j?$pGoZ+lug~`^Gv&nY_(g+AuGA}Dw6CT 
zV>2NeBdcH93iqp&5IK&(25xnBeZN=N%Zc#dOIp>wrjwHR_83H~O+t(ep%dY>WM`Nrmms^dIi@bhV*)Y{4st-kN{cT2E!DBaE-d>r z96#Q^TH1o?wCRbqAwts(*_OoOb5Jjl?7;Y5KdWk42C@MP6+u^>2OndnBjRONr)1eM zuXqfhCG_5;hh5k4q3&B!p=`*r!qF^>$2shTk1 zO0nFeA$)4L#RwKp7D4fd#F#~P^jGn}uds~TRd;7Nj^MgBaD~O^V!l|nLci3>=x3^| zg+7$PO5vhl)B_-K#CHgP4djf`)n^JPYaCoyaEM>GVhONhMPI56#c}5`gLY=>a9k0t zE_W^v-9}_is<(X94%%M~KKk`h;71=Y=3=qyx}Sb=eoiajXKxe;zsFjCY;HM3rwMcX zE=q^djLdfDthv*R0T{v`+C1B-DOZl7)8q_Eb!vTaF_6}Zsbi!U?ZP5K8k_lZ*vB9U zOVP?pr71#2zphEj`}(RT${~3A9JdEWHfyk3E@gHWYllc9XwO`F0w66A9Xcxupdo~- z>%J3__UyF6Z;#oeBu2K&-_9YH+4;cFAC;||T`(L426s*A3B1PZ67d-i`Phb*AKgB| zKPf$_@9qw<_OrdVvJslJ;2ZW&8_cQ2QO;Bv+~oSAKRMpFO`~@DM~@*QFrXV>F@AUU zF=2vfyAn~2fQq*(pgU&8-a4&iLyo4pVx~%o0&p=$RX@HN`XUCe<6L|;d+P@@s${R0 z*do#8pn1eGd-soF8(EAUEqOTj@6YmM625WF{Tc{M!z{+tugdD0dw6KNK32J56&T*M zhUDVKC?MhvjvJNOqv9Nav-`b4$8%+6nVM_>^HXg~G|ws!Z$F)Ver|DfhiVq1+?r>y ziC=|)l+@Epi3)=MQ$gLJy*?0g#Uc1&4Pu#ZCcxC=I3Zo#Vjv_1!goG^#Jgz zAvG@rKG={^9-lBea0U&zpg&efN|&xd&9M@1B^NsX1Vx zq4omkkJnu#4@r^m#ZA_fJ~%$*ig;~;y`NU;pBr#;VE!yPq{m5fq=SqIohT^iOd7~0qEu6{z^aIE*p{L`irOFq9I?>M1XE05 zYE2w5CrFdZews!`-@+D!wkkO*3`xK;w#YL7CgYA=ALIdYv*b|xw3BT23eS&nbRh3W z{w#=RA8f^;z0BwppKMd?X9qU=ST%1Z{s0upIxg0xKy~zw?*xhq`)r?~dCc&T6S1!} zbqV?$WdQvEgWnJR<$!)%Bdy*0ltmMCR+xf@q!h{-EC+p6l*VB%g|l5t?`sZy8a(j< zar!~xK!UZpEBIE-W_`g$h98#6{7V)YZpm>I^cL%B*P1FpGvk~{L-bPVV;o_>cQxXF z-mt_)WdxQ=G9iX=`zOW&+$6HoYWr?n#?TQRleFVSkVz1VttunW8de#tlZ=e(ZzNpI z{R!$9yEG)Ofuz_69Rx93b40zu4IK^yz48`pPsTfo1B7#OCx+I|BV>T?SVz+DiYW&- zQPUjj3gf0<+sp^8WDb^hl{Cl*(;OF-w?-mPTKUun>XaV(JJTCacvJ}Hh~o*5LcfO&!8s+^S&4HHklnjJ7}wy0U^vKKUFtQ1biE9Eb!Lv7G%>|GaQm(!M;-B&MAE!vU(-nAEu zcaz_4ZWyu%vua#5A*Db4=46?r(zL4cul%bYJE%ksl#f!|5xcvCo;uy7a}A4D30@LW z2lu+JDp6h#q^l_ z0E^}KuiUpu9})d3?gb>pv%ijW*mOxq^mt+0S2ZR5Dw35b)*)Kc=&CiOU}oo0Z*~Tj zBS(DE^k^CXI+7Y|e62#OIoZ4`6`uN9#nmKorn(?~X!{KSre>{W^BCSXVNi+GY_sJ2 z_`L6qJs$nXhf7of{qvs7wXt{Q%dS&~4b{c0@Pd09ZWDTFKRB2k#RfO1(Du7t4RdhfK9J 
zj;;21GuNLEk8Y)0BJ>I=*~*$Nd0Qh`vnyq-b)tg7#_UlyN?qc3PAv=+u?9zO;Zn6wR4!x?m_y)5a^+^09QyIzOOxrUq}K6f z((8%>l_*{kdd)CfhWeY9=8O5a|3*nT_h{md3ZjCv&YBE{JhdAr>41&9PMb9^G9+F?C!QX{b8pS6V~bb_>N4fz z^TVfA>wv^%vOE-rP>(~mhP@A}>U1_;2XfZ^4#=QN2~dL5--C`GCkrCArMlW2Ub0`RAWHjvj4P z>RI}@VxN1ZC$86=r(3RsG*jP#3b*DjbY$dThc6282T8nJ%`v(Njz!z&h0;Asz`v?U zc4PVRkjpd9!D46VH>h)F=Bdc?A2|c|1?JNQKqnM!xdP`nd}OMnj^_j{76YCf*uj^T zp-`0W8dpe0XR+D-5hp&x;+<`jf$`^RB&Xa2Kk{S2_iO2piNZ6#0;SyyDcG|r?|INu zOcrbRFV9ysR)9;Qn{LBYpLs8xq*t2iY}C7yx_BDDCfgc6@f^%s|1n|fFn*X}SDIz! z&!=gIYPsydLAI5H_&Nqn#%)O(G~FD&Fr~EjugwoxuwR`FaLKiS0#MYQG)HqII-fop z7vlRdO!r8s%{LyrH5#7oRP2V`l1XTv>NLIHwnlWY{U|l#ucJhp>Q#q4d7^<7-4cK>PP837w~BM0R9l)~h2o}_+OFh% zjW6FQE7ZJFiFh?OslFYFSw-Hb>z*J{BZ6%J4q9o9*lN*O=O$mES{ z1wCc%cs6YYe%r@cx@j0YJf|AlB&4mmc^^HF*2l5b%J-3vi&^ai3*2h{b;yc|@QtJ* zHaW5nC(i3d5e&2(roTR4m)3{#i6h|Dp|V)J*F7g$u4g61ubWz9X4?R3buNuy;K|pX%9O5YOF^5hA>Bn5n~91vwN8!Q9{^z%4BqL z$1N+yh*Jb+SrbwTlQbyun<988n&EJZ!j=cT+op`AsD6G?l!j7}9%X}v5J6v6T6w=K z5+}y1@so))fQGyp*l?rqwHI~U>#Lx)J2StfQ$LoD$KaF1zG<*I1vX0sXI-P62sN$_ zg6VbKPl3MiQBOAwzriuh?Ya@d@P?m5k9cNZ#No9SRhBCMu_C=v6@s4xBr5sv9DUSIpr_6?>5HdZroPF1u>s zHPt6hNh!2R;SSX#2b%98AM%9>Yn@Ns?@sMyKC298DNNcC=5&Nu{FH&vSv4~jBz__0 ziCEyBwcX|Q8NZ1<-91}YB3F7$AodA-@o?G2aYZ5PkDV1cq`{I}o|z5G@p!qlv%Arw zqk3m^Y7_L!v6>7IkwmEc>TaEY&kQP7ZOOl$JOTLYk>dH~gD3WB$HEzNp>u44t(R3q zRMQbln0?%5F=4kevXd@uyJ4YwSyoVep8zBYi68ropgd7Q^%tLXFv1vxAZ+bZ*e?=x z++uZ_g|*cDGM!gIFccoBK?ZblaD=qPD7%`ukGrN^@Uz9UHQn zQO8}~Oxc7iMDp{)Uw)TezranB6(9Q?f6a(ozSniXhrVlS=#k3*c{_c!6C-8?A9lC({sq?Es(~Zrda^lsr)S3G z?9?V$w@)qs$a8Een@;7;aq%(zN{v&ok(HBUy{m!c*SUvt1W4Q#XduwlWhogx$vVtz z#T+r80yFn4SH4AqMTvMOs7p`fV&>!{G-6AViJB4WG~mfXMi)AKjP6@!cTak6;LG4n zrNC*4&F}nEb<{2^yCuYeV^lvaP|>M}eHFLFb4PCT6-+yh!_#Dm5)r9tm=b-yN*qjK zDB1?80nhe5vW9}ji%!2%L4co@Rb?|^(5OJIal(x|bp#G+fS~z6CaiuXM|MOj- zr=7fNk}KEF512>_uHMN0u!>Pm)CbZLLtpPy8DWfh(X!~IQKWIy9lUn0jtsp@i#I3J z@24=y012Lq3kq@e!F4MKBd?Y&K-e)P!7v?UC^?{j%Jw?aG<-4ZkGMCQ9~GPfV>9rI 
zb+{1Phf!1;lHB_csiy6TCU*f;3xoQOO;_QGzjBM+-#+x{%x?8)YaRXllfe1QonSUhYAr%niw=kNiDF-Z z=iem}bGcSiyc8ITPzf%2P_pbOD?vKtAs=BC-pZ@x24#b=TOl@@l^?&zGO$LZ#k9C4 z#Qat0ZA9%pJ6zjO=w`+Qa4@T%kNE)Yfm-CFCKC=SGf6bnlT!eJnS*f#PX=a<03cr&q z<5DkBz%&e@B-hwdHc4wBDH()}YxhW4RI%Ihu~t~MCXsIE`x%*N2~KT&0kneQPcC6LtJ=h8|_!bTN`9!aO>6pu~#NvAfDiPNd?;cJj(EMdW0sjfRzj|s^>ar#eu z6UxNodCGQWV_AuI6`XyJ?AF#xDX!x3PmQU4XpYv6v=iDY~gYSblEU8Atpl-6_n*qp`v8H3 ziy_NbZy%ml^Ldl_)eIr#1o+QdO<9wJ*Cp3znWKz) zap;I}OaL95{E6<#`N(d0OnrN{e7?;!?FiG_^M#Ix&UMD2* zxBbCqZEGCq9;vnxiqaeymapq7u%W`e2u!_7NFS@mXojg)9=gyJlbn+te*<p+35kxz%coEQ}2Ol0moY4OfS?mnckp6 z+v3>vE1TzNsS>0)=n%)Sai0Ubns zYfiDo29J;h%aO3U1mo?9$i_PM&U3(u9?dWc@1Pr_KGH?Kee~4>W%+mUg=+$Z}zZG|y}$*ld6k^_mu`r9WE ze`iu*NBV1lL61?#B*SI!ue|nzQq96i7++}(%5fk!08?u-Qf`A$Ar?1y$x^Y;uMg`I z9XaU+vAOGn3=3V8*2g_ddX@0@J>>%GVw>XpQGxFpN%gKuDES&j<1OFo6}_Bm@rw;!8sA+*oS zZ2j_tEl^eyuT@J|rcj_->L*MuGkQeMU09+N3MfOZiWOcp>!kp{f4Xw*{^@qMy6jh~ zD$5*Jz7w0wqDI*XM_jdx%dv9S$UK)B5|`yWehncjJ+XWI$Xl!gJx~Yz-QAjczEe6) zqn&KM;avouD1AR2J-LBs^z9hB>-C}7W?S>76TID*q(3s zKVn8!5}h1B-96omEA3+UrgPoGE?XJppNNbDT~A74-<)zx)c#A6!HkDVk2>uuYl zWK{h!!00aZdJ9v%cNIDk_h>KQDE?6I=`ELw7a9(ZWa@4VkEm!T!08*R@~5#x=3K!n z?NNu?p%mf6A2#KSRA9=1+MvF57m6z(i1!%vfR6raGT9;};IA<<_ZMT{wXHzYpBYi6 zVI=Grcal~R9xW#iebO@eS%`yvuhA|YHEU6^ONO3+;nbyRzo9{|zY6b4Noz*2`aYSh zicemWRUMUW?YB^(eCOAQ{vC42u>2jF_4~o(UaI9Abfxm^i>>$% zbPfn;zD!Q9KkJcA9f}SGX!*XdVns90KRkL+~o|Uk2YD$w6 zZhvfT*-U$2=T+lfhD|8`rgq*9ps1v8Ra+=w)7no`nxTfcNjYLni zzEUd0$tn0fcv4eUN$~dq9q253dNq-@)nU#VV!|J|$~}Wq1|3~gs=X^iGC^EEOkTW= z_xV6BIU=8p+K$y{=zTHNPUSdUW(EmdwhXDt!%SPBgw+J#Vs@;x#%5wrB(t_+?@+gPc&N;s^d9^$d^t9E9;xQ z3}a(SgaTGoIObCO2R&Oko;tu`30Ikor_H-horbLW@i958lc$uYcMgV5Z=WRNL4#i# z)c0ZyOb7nF{>_-lypz7+_k6V}5ZtY#-Mo9@cQ8ZrtodP@$*t}VXvGC{uy z>$DE&1O|R>7%?OvDQs38nYGAZshW*+4dLBlC{Ad&QiT7Eu9ZrBbVDy$78obAOpWCP z^=!x4jQbSevw`d$V+uxx&QHfW{^=UB87cV#E$M910>dzV3l1o@U@ zo(2B;aD=F$1D9Vkq~TcU?b{utfY(`MdSSJ9wohv4Fz@DblR3wc@ZCQ$&tQH=U}%LV z9i2GXM^U!HR%9H!lu9Gx&kWt$n1#;Fx+h6vbnMgiMb{Z4HiKhiJ$_fu{K@Coq!_=l 
zx6eE1$pH*ik$%xEp%CpG0O}j6K8NNAoB7@AQ*3poDx!Y#LF-a5`tY~YC(J{d z472t@p25zslP)=I@Y(MsL)GLicT^(`(glLksKy%BQqNx;a5aLTrV@XUGarZU_hQ-j zw@3N0x^wm}aYC6orqte@ z*y8SwmFe(_dMkBOdDm^sukwq4JPn_4Js9m@TZ!=$=br8jUbfls97guru+?1PPz#T&$IcB`3oeVmtlN zd+y5{eHI6}uCSReTW<&Yf?Fy`yC4mSG+I@(QHT&wTz2<#=E!jto<3belHd_uo=C@B zx9DLhTIm{blSpDGfR0&9TU>t>x9}i!*Jl<@3f*L~LD9i3U}`a0C3f=b!4hWM5i!kp z42=a6LA>^eYpb-+FxDrN8&7uj^q!;@Hf-}v?=l8oZ3Ucs2jd$qo2FVGywAfA=kE)G zfMf|ntXbAW;NB}V_fqAm4Km2NZ|Q10Wbm{;dGc;qbZq-;U^C#N_U7>ZVZ&kFryU6T zz3DgS4 zV{~(>)B3M3-NN6;%vpPOs=S@f|#sBmJEc)LcpS^@+(S{dy ztV4f1?77231e-lWZogRwN`4~^m%_H^N=)JmO=7%v zZ~)Qf077(=At6|43peeqh-_r}rYDaJR+1iA+fDCQY*p%91w+3Sfi;1EM38$cJx$kY zmaR5P|G^(;=kXdRxZNp*Nse~W?b?TQllCY{EYMwa>((qJNYm3pplTVeFesE5THGyd z2n=k6x4YkvU%6wNNORm@&gW4~#*n&t9`RF$uyp+36Z^@rLMRI zMMjm&jGSfkxf$+v(w!Rh=g}@6gKsrya4Kh$lu20kFq@nHtBr1aviFagv(Mv^% z(v7S$gmVcmzVZG9{`lN17Nr&Jmb5a$@RwUT!}pQ|07=njq!_X}p-uvK*o<{RTq-e2 z9{ZQ3ca1zJx8!JX&QvnAm@_bmIOV71h-ztwS5Xmeu;`QOCM(voRuT9KX0WUBZG0W4 z*J>B*reE?XJX%QOFOs9nWLR{=Z!V2n3A#Lyz-SctI5C<^pqlstPJQ-AqiF!7x~ z!u$tsBRf4#qU%Cv^A5DU@>N8ieZxpAa!r>1aw`&f6Gn>ykd0#1~ zW5(aUgAY|E8l6~-l?#93r}xvWP%XosCfc9Cu|%>^;#m_T?%#UpS)-{9P`vLclIW5Q z&+%2z5lSQe&%Zv6kxB$vSA^SXy%5r81QsH*Fe`Ux&7HP?^1pCnt!1?wR@eqEEbB02 ziwgyk`Gv9ze|Cy7;YnWq-mX4$lezlX@+=3r_fB2LrOqb2G1QAbP8jqUF$L)|$#F$o z%(1=<2gf!}rdcA%p6291Kdr~>Je*uZeRYD?^ft;$7<0rVnTRKN^8Z~@wVij_tD74a zf&WxFZ!VmF0fuVT^i&X-rN@9bQ`0y{OR&5#k|f?9CJRgWCj9~bdefkpTvyW*%YF$1 zFYee5_&f1VYJ&P2eU?xfnOO{?X1}w37_`nC)}VZDR}& z8Tfoill&*>H6zIo(IslQqW%EB`PTb%;g-eEKf4EMGTk4ZI)j#IkfEu!xXBSo&p9huT4M%L*<_Z@g8oqi?umdB;(NmLM0d(ufW_kkdxQe zD^i}i;cvySVjC^EUtvgQF&eB)t+WdNkBm-tF&C!=IYAT9k8ew)Dc=$570nR^FH7{Ur0G))185fe{NVCDy24I;b)bWIVl*QP+lru1_vcn}Z3G3Y0X}J|kvA9r^o3pRI z6K${mG@^T6pstwr30jiH_##OxzG;%+I9?N9$?K4wY-`9a%B(g&5=Vlo~dVCVw271up4Pf)SKJ>Ol5Ma+g zyRnmlm$+7<`yAfDLi7zOwTl(_nXYE64|x%FIWWe6BlENth~~Pjl@`O>K1c&KDFVQDwBn!Gw&6D_??MRif5LhEJKJaR65nHJdU6QJgt-q$UcWJa z9#hl>i&gj5Wc_L@m?u&GeG=v}l;qbea*16GtNgFpcHsxKFu%PtX^3@> 
za;g%?0~td-9JgEOh3Az(7rsnFz1tf#T45axZbkf4T<5tFJ#Znhh&-*o6}n1uh9Yn` z(Sp0eN^AG|;qTuFS1^b?Bh$GeP1FeQG~4VRnMcF3$a=~Q29;r`=Gez?@_U%x*X;0g zZA4(xGpx7SScQ!%Jz9>7u_?=q8%eh_+-3X~%fvcO^^Jg%;m1CIHKHUH2@NLM<)0G$ z0Sh6{^M{t_57o;xu{*wf;v)R?b+~g~8?G=2#*vE?*w`JxzRm6U=-PE$pU>l$UfjV~ z{uLY<8OK0t6kA%Nv_eo0z`8S>3~=PaFb-_pfQXwEo5xAgtHQCx6n=C$hS|J_)>1!* zaJpFIgtSUoUjInjtS479vsP9=wIqWL54Jp5+{yC7fkd*^Zc@x|EiTX0Gx#idb>jCZyGXIV~MyI@>l^mq>D1KP%PoBp36WJNpQ3& z%&kCQB#h7R7&5Q8M&d~vzdVe=)>gc-z6TC!j~yh^Hk#cnx-M-r9brjLiogv7g!rDc z{XBlD@?)IAoFT@4&o%_r5#PY*lUx#u@F)rH?=4(pFd>HBy@xedjw}1G=3@Bz{8_A{ z2Z(;k9E1edQivpZKcy5K`O5qAlv5na%wxE?#PIY5&NXmiy*r5K{jJ#G`3l0)q|GM??G`EsC$5G(hz*3fP5sWq1 zx;M!LyWGqe7C(Lq@iRwgNhT5Q?PuVP&%oV2KyuADt8xdShmvA)mX_#olJ7k1lw`PC z)^XB0Ezz=HQ(YsVWcU*}ePRc;60NbczS6+~tqSz`Ktei34e#aYdGwNib+$%u`uZq( zI=e}LTi~ZSzccDHmSM5uh9!F3;u344H6xqHNxs6*<|}9?>04M@!YFmIUM~OO;28|A z@1=_|iLm6t^;C&OcpevK6DV*{w>X>BQhjwJ9FEDN>elk7`&R~vY9nqNG%O;#vOS_D zmSmNXiGg0e?kwT$qu23DcNi}>1+c!Act!&3Ki%T63NZ}HkR{Z$%p09mhl6W^0@qlX z^j|DJX2Kl5eUw=A@w|w&equU}C1N~5v;D%l4r5G@dN|2~HQJ?eV?nUKq-~lMfm;#y z6WgoCyM(b2>%_WS;rYbYCGy)}(}H`PuGAiMOv_VGEQI(!#IE9hY91n~wI)N=8YLq| zi`|3I_~avRk~FPm1Z0j!k)$>~dI<$ajk7b-ocIZL1fw?%(&As(-PA%!L@zR9BgVB_ zZf?>LVjDd`RA?>EaZ=sStY*NpZTCe$$?$z4x_^e+fGh-9rf|$BNvv}8`Afah$W#nRPn<(# zs2}Uwqxkiod5u#T!=ERv;};|SkjZn7>YJ#CsYCOw4g@N+ z3g<~`(__32=s(!g)G+q~q=@jzwGn@oRqZ9A3*@Xy`dcougo`swWn$a3L{`+&J&$g)FtgAvjw%kc2a$onA+) z7b8AQ5Kh|Bes=<;?;8!+pat&5FZI0%_sZ<-J~77Fpg7O3TXBC z5#;VHQCs5Nh%~h+DL%c&`HAs~M6i@3WY{Y@VCHS&C&~>W$}Ehld{ez;Ef%!ocuOpQ zeqsWD{_Y2qV!Vdtj#f%sjG_1@63tDT6oES;a4R7`Lg|E9jWs@qHvtaKc=637(c_M0OwG=d<)ML5r_UJQt!X$K(F|p=;Fp3g5HyN9VQZnPt*6XGQG3OeAk%j~krhWe8r8 zWVgtA#K9Ux6F6~o9_Qvb@lHZc@>pdOX-Oj8xY-73HrLr70b52k>$_z0A!cvG4=*Rt z<>#cgkPD~hN_gi&3};6tFr8a44)!Z0KCRcND>1e?@gNO*IRkQ(2&3D*ymOhYPQ03pp4%Wv6^Bc)6fS5x6@7LWp0mwHy7g z=!hjihfb`!J6;dEC(kkW8uh#qcKKnG{Cwp{e2^3G{&uKO$?(GhxGM7VzAF+5658a& z6Gryn;9FAh6YOs3F!w;~o>6Tq?}=haafX}diqrBVTuj6$by!4ZW*mvrM^IQwkO;Rj 
z6!Xhye&MHSW$v-uvsH<3rI;rP9!GZU9Ieo_3bQU(+XjYvmWg#rp;pCZ@d-3S&*US4 z$R8v*8YT3kShO}|i+K0QalCozJY^7H;AAe&Hm3E_B|k)xQa(u10?c)C-vVc;R;X1q zwF^l*8yoD7$wT&fmaKakxO)gcG>tBcDJ`6xPB{KltY-C=Plh5|G-XqlW_Qga${t9&gRL|TR;8M z%S}Qi-Zee<3R+(H9K2lvmNY_c!c1cDDyA2a9yx{l!nmnlw8HL=O}r)ZP3($^n715O zSL+M`CBqMU!h~Eg$xxaiHA--`KwZ(kZA0kn>&7|~;V_+klM7ovJnJ__QbtE>{80+& z6|WF;6t-TAl`uTVNn7zH%+qgfj?xHu>J{O!Sb6Pa#!Z0$dYVJ@h3myYvm4EPWCg_(1UuoNa6n*!v@uxd+BTC)PbMRG&&ld7CF6P4$OB zg@IJ^^V5tbcLF(n+gxt=`Um0d>9@i#-_#|ZWBJc}fe=@5vnCtL1%^}Ry;gh@2e<5i ze_#mFT`!mkct#^!CBo(&lH*$lT7+|B=cyH@!G!MH#P6^xLie3&6}}1WdS7Yud;5LQ z-<^g`5l4ts_|oJw-aBy?8wS?nwao))Z*ss}AdwPc${(FrCB@P!f`ue4l^3!VoSjYM z)NB&hk{Qg<3q&GQLOMqaF*Uv=T8}H@>VO9o;lDg|x|~G>XA+2BjiJR;MrXi<&24TR zSl>)PyD3_QNwSUA*eYk3-+g;khZ%^aP2wlZ|B{E{miYj{E=xcRaU*ATR75@^7IeAQ z$(dtjT7|7|u@p;WJF&967}hz5aXMn3o6ln8=mN%*4t#AOh(S*Y9uA(8TIsC)Yx{S? z6ZDgWQ=Y=@PBQ@3t_n7w!4hq9W^!1Ab^Ecqd#jFEey3*JW7phR75F(>wuzx{B@*17 zd<#Da8ewXMgGOrhAh|WqYC%8@MP)`7mGH_-m(Q4qb3R6o3k~fiIi&{>xna_nVGK$n z&1I6;QaWj@$8uA3Py#1BS0>*{^7U_{fILYmBvWUZz3H)e={w^2Nm`#_cc_W46KE9+eBuyf&QQTw*wE2zk zkBzPiB*zlTGChCMVoc8l&H~<_O5>y10%ntC)@2bqtDQo^Wk8VsLIPJyT1M~@8zdd# zNARjRjrR!gAYn}5B&nvuB*pUZA=#{>2~iajF$XnS1$!o~XjgsiL!`0l&VW0N{xw(=s%bQm(y-!Fv z6J}YqFDKSR-^2O&2`rs^AKCFsbX`m*rp*KJZF;^Mnps}`YFbMOC>g%b*)%^9i(S!j zE4p4AF3%^CWaPNcXb@k0em^aVwBV9^y7=1On1V^Q7zi2H=;q+?bQbT8EaAv_8k4kW zmPw$+JA%k2gzQ>LNS+WkE7i6d630zunx43L;ghYXhSM1rrmj*_A^YfdO#JHBl8`|w z-Pl&!^6=lwl|6jXM<{XMfFQI6ODb~^I`+iH~tV11S7-up1E* zNa4OkG{b5@KuE4=cR6Wp@$frH(u;)Xe%(!NwC-+1k^X;+*Uq7oiX*=`MVD**EKH!3 zq?C!}0(6t+Vx+j66Aee^9?Zf2Rw7J`u#``peur9NO2TMmcC>WDw_!gC@j7~VGQNmz zu1?qL9)ZTZqd(|rBX1HuB}PTzT19$-$kCd3=*k3UIeF{#J)0?8pxc&0i6n_ol3a<8 z+a+c`p3LJf=;h!iS7M+c7g+|5C{RmmUgcDeFV!w5L)@0vZdwrMU#214DrHKviV7XO zk!n`p8~JYRd)%d)TKTw@Jw|!9o7F{{gXR?97Sg9|J7aaneWhYrrg z!;ddbpsU4?wvF^LOslgH}Z5fCC=N--kfxuc8{_YT7$GyJl6Wuj4J>C$QZ|LlDScpTSx?O9-V0W221H=+^j zonn(HiRvQNEK63gEXg%4DNf=zFL_DcOL3AfU-Dl3*|DEnY`G_@maI-vq9~FQ#a;zA 
zkRSn~_qN#j&bhNgf-ogg1q4XVB?q%R<(B`>%%1ztx#u4w5RNiTa&)KoFm{B-&oInoo7Nep8pc{WbOE}igo8cmzf54GcUKXQbdskykq@8p=SUlnV=zL=h z22Y+qlz>%fxV?qpl{@+SD@q|3{mfERIFXaiz1i8+9WJHhS`EW7Pv>zAvO#TPC2@mY z47g!D!Zsk5_n8yxf}PFi-LsxN;q45}(Tzk-m`kpj2Vd1?q(@|1=bZMYnEaeSmxcu# zFnq3lzv}7(hj77F8WQNTJM+D;A_fI6_5P_ipn`Tu9Wl$0a!Qb6T9g-9^!8?XVnl0C&qR8 zMVV6>F?xW9Tk6rhdp(AZ?&9|Wf@(ifE9NjF-R1N!OFcq^DCb!}9@oKzDANJM?`CCQ z_o5>Va1j`OX5H)PW$4HIu2_QN!dz~lH<-NN5Ff9D^Z>7=y2JN3)Z%c+hhR!7chkJ$ zuxjMWhqA(RIB9}ez)Y-TNAtgcZP-P5Hqu=V_6=YoDQD$;{uQztr5NJ&O3uL(ef8ML$Z`XeM~v|P zl*7-%do$aeg1gcu;fB;g1B9J{>p4w3u?~pvH#QW8Co2aT^DgHEIZj{K?;#zXb{=@H(D zp*jlK^LN^vk%iRB3y?N@1-y9`R$jWOoOH=rzyZVO=voOwB=H{5P4tPYuVRF=7z%yV zwk3MfHr}SBPb6Xol^uS#em{;xQpqdMWUid8Y{{|DjQC}{!5uEbI!vZ#MSYf??_p#$p_mT9F@^yG@O- z*XJnE**!iMFawu((FDYd8;FMPVboFs6fH@yUZ_v`GtGh8BF6KLptx^`Uv;+8~Q|2!6c;5x*Vxf z7ErB%)CdA?o#UiS-vSO8ewRMhZ6{Cgv7SCWy=eyuZd!z#JjNyG%_z2LKPeK2`U7}u zXEXM6Mlft5+~`G|i25?EBQ@|Y=AF>HqVxvzN4(h8(2Ye|i66{SneM4S|BF`v8z)Nj!Y;1&|;eT^-a^UkR>Y0;1kOj`@(-A=Ud@`jc zSxOYctxZ&!pz;Wx>%op@v~PG0fs@CK9Ge0+dB`_TrCxEy{HrO~HjPw^bYfusexFhH z`4s3JbFo^$0mEnMIe}n3*tAwkrr`FBiUJ1MNP?SU!+beov~Js|HN3H&@@e&b81!cF zc;2W|7@%*$VjJshYgB3Hr7aUC>A7MWs~Sl`S@qT??TB9*hV%RmwKtpjz|bT#m(8J7 z2K=so6c=;i>(`LmhuW4Qv{Z*tn(9VHZZ-TDSgn+Q5)Rbl-7r*bCamfH-wW*_{15Z8m6n_5mpJVRax%h)W_ybIz zK3%&m`PAFni`QOz4d49cH&IYffUkY+Ygn*g0r8jPSsbs^rm{ z38cLxP$t#Q!Af#L2`Lft8Od%jyoHmvFUbMIg(0PO2kV+S@UVS8LnR8A(Hwp?gg-AX*1%5n8kWrr9@e3}_x z7|^UD0j;SvFAI7=ReP5SVlko(rXle^Mu~wZ=vdQa0&058V+<{A0php~v^pax=ZT$E z!w6@rw+u3Db|8vj0^6Zb6dirToTTAJCM729pE&C#5)>AvjazvQ$TO?~)urYkR;ZVs z#cC&Z=Uu?s3DBB*<~+(A$q&|fzjm!SkFzeWL*9P-ZG88;-^HFidvM^u0Zf}V?IQz* zo12^Q@BjYqShsE+Jp4VFIddjv&t^Dx%6&O$d@K;*$up5QWj<2Mr^1t#6{kh~&JWi3RNg7|f?B`z{8lI7Jq-#*0^v)UADoqyWxS#S!)a-0hAQw;xmHrCZ(CkoZYWgg ze1jyKi0D*K8Yc_z864pxxxr(H5g8h!wlITmq^B_qb16aaG=_4Xi* zn>S89a8VWO+^M6x%XWg&Z5k#2t&@@FT3g#NI81OzFf80N%w=EO0Jby@qt8PfLuy@0 zS7PEPH7>r$gR7ajjOCsvA>`p8TmgDL*=9GF1dgZ`(I=Vn!-9b!4yx((v=a~4#0HNi 
z86J+XBZ|C7PdSpQq`@o$92`VbtkBdkQcPqtkxL8~w)2?iYqgQ~kr$zFkXq6NzLO}x zKC8GCN#v5X&_;V2+fkL1iphST@c<{%25vU;)R^HCe6S`F48Ipzz(nun_gg>F9e#%M z^N@d(CdyS*RNx=};U940jW?pIstU`NEqlKvzUa;qNBHYs|2pQ)n+FXFR;*azP$b5! z7cIw2sJp`e;uHelBvMy=RZ|FrXIL7* z(fK}6EpRDMlzS(eC-QAk3h(!&OP3mYgeK69OoO~{+S}W4=+GgwwUNsi2pGVq31?+x zWvHyIL`Fu2p&97BP>6DAhY#b}v12Bi#fujso6*Ha(~p$ru3fwMCMRg}5fHa0uW3?W zXJ;plA3u)z`g$|TP4}lJ_T}Z}p{AzBcyaCHRgfmasa#PX-YuyfYO~sFCe`xuWtUxM z+Ai<#$a*SIwORRUt+wf09XDyxB-3_z)3wb>AJ78)ZP3rLi=O$6c}uAbk;`z(`{DS2m3K}b4+}V8c(m2y_eYo>)K*5JscY*(d0{a%g*7RPVV8-#4h=@I zw~= zebh`!6DN0Ms;#Xx6V|%ByRmB3D*WYN{-puKQSwAvTU+s8|Mg#Z?6JoT5Y;hZhV&H( z&Ck!r6<1t=FMa7tm@{XN0rJXMVD;br?cYp2N0u}Cvw*AW{+r+Y#(-;099+G6wQ0M& zty|I5)MU1K84XR~zPh>^pZUyZaLX;X7+~%I;?b6VkKgs% z!c$U8Dv?rLZYIQu*2jyA*O3E+-=j4@5{ElrcvLsb2P?1@CX}h`=*1K3H{gpmuS8xJ zJsbq0!?f7;ldU+~+D(v2!oAKPiRVV;_kq3|1SL9ye6d^v(6}w;lE$$<^NTsgy7Bcz zQPia5V7A1ar&5UVk^$}NZAGNvFk(#yan0<>xas%rMs>airD;B7k`IxRByvFIqT zF9D0E)gX<7hy()WZZc*P_|Wxpu;dYN*DD3}m6JXa7Wk;}gU{BP@^%X3?%1&dfBUz8 zYdo(4n*yb4*RI9i|NY-%-@bjugDa1%o^A5x3Pktx^cb+Zckf<9wJ0emG3Rx4bs6xe zXTtEXWGjC+32X}VDj#`)1$LFczW>Ev`~{wW{&`64LEBXqC4C=YWHL#i$y5DTfAv>b zxpJl1XV*!f_1%)Z%c4Z69csHkwCewjZ+ydWr?0w-ftpxnf#rYx=YK|BUES}rUG`Ev zHTh8Is=fm0PWpfr;PsMdz1}PiYA5K-zesC3vC@=Q(>g{-H&1P}}b2wZ4OYpkrE9JQfoTc*TBRE6d7qSe#9n|K3 z`Imn&<;W8}dGci3dFP$R8?0kgzfGGq887*hPd;fT;eGY1Up=GTOZ|)8XYbTbyYJs= z|LE;@e&=(ZeeB3Q&UK!B&O04v@3-@Or{nDX&U*YU;DF(geL$=t(FA#51zvoNR5OX* ziPQuyJ$!s96I1nZ-9j&6kiJ189&@(k6&;z;hx6G$y?A{R>6PZ&cg{{5Ag{LvK|biC zLjlD202UxKiEE2Nz%{8G@MXfAl!Rz^GphW+o!8CBb+bw_y_oUa*-$sR&Ivq0tq|p6 zZ~aNUykQ%E#5z%2Qp_KuDLB&Dh7;WZps^KQEGr|Ji3WPikJ1YhO$qb!AhvDZgCmFD zrqP19b=9?4IK7%a(Fj3cf)T+_&+wsYRwd?_7h!%;Djt7rCk`Clf#K9Vaw{STk$=oz z5My_>BW5+icvnG+f-5z}iYLzrdeI~TX!5|KM3_dwLdacx!}Cr%Uu6$%Q8HGvKA!hq z*-_^!Z#Tz53Fu;$y&Q0y7#hGns!D8W^kY$B8WxxOQ0R>yJxN~i1Ps#NrA7Iia2W)t z#poq}B!iw|k9j@!e(G}Gm39ssXA9gvBBI=-6iDO!npMD5-ckWvX%EXQs_z1v@*0b0 zE2@IF-E`AU`0AtmADktYZa)v|qqj z-fmGm#8=k#88c>>w$GnGA5xzXa2B27KmOxC@c8488_)7je)1FCbI(1do|igZLN=0D 
zU!HOGnZA!sqQr>LEqiJ3A>HP)flHmQ0g(Dh`PdW|@zFb~}MlKq_s9fz+8Msvy7X4ad>a_XD`+jJHi@932DWRxzOjmg<*`0f+0?97ks`34sK{Y(IUj>6wV75*%|zux36+<8%p;TuvuhMeSt8 z&o`N_kgr@<0P$(>kW_) zFc2^J_SZN z*VP6tD)+YAZo|=|NAa)!`mYAWyzsohu;xcYpVH z@2aCL+JwBms*m`?_uY3NZn)uwQ%$nlE-$#s7R5urP&5wje0l{ucK|wtq(*R{LEzj*`|KyjBClu| z*_l2A^y8(+z2F*%SX<(SIID}y0{hT@Mnt)f%0#*M(oF&gCr_R6s?Lp-FA;C&}9=dV&CsK#jc70xhHOH+`2vg*duWc@Qv@M{jgeodlQ# za3cXI(E(Jn^KNpcoE^=-u@g za9vSRVb0Yxrley9yq|dD2?N5#3s<>H@*pSB@gWD1>Y>3(P0eKUra&HMo7S>(=T6dG zCL8HM^_`B@fal1OI{fNazc$y2@>;fRsR7I;bLY>5w!lXQ4C~FiyuQLfVJFDpiJ65dxMA5$EMGXC;h_0YV385t%*3O_d{G>% zYd~#XJt+-km^r12QfFP*xP3on&Yy*69(x8iT)mvMg>v$Ed4sERvH|}H?nStPC1Ukk zd+^(rHzL8C&L5u{m_B7DF6U@kN)VeiZN}lcqnI&!20D8KJiZMj#d%mXeG&@u_;6n| z50z7@F+_3m14ldY{0pz4tEC>-u+Cro?A?@bBQj;s4zPS8OO|YfvtBr}0`<+)@YIXz zFtBO`wr|@(y2Kz>UbTz@9WIWp_i&wv1@jrJD=h_?^g7Qg&BiCLo=<5z7xo=(#mgIa zVsmXhPIfh8IFG#OtRln+j@3RZ+3R@2&(Q;C?q7`N7BV-^8%Do?9n`0fX8SPCB{C^{(m9-&)3H!0D0@kavlGs|_pU5pmElmlse znEfH(H~N*eoxcEu_S;FIP{)o0M*@{+8^{Xa2v{lE=ZplkdhI^mK=I*g1|r@DL;l58Uz%Zob5XrBV2 z(rcD2WGnTl1~#HH96EH+c+KT){rJZ}ZU!*2oAMCdM1fw^SG}CKq4NZ2ckS9~r=tS!^fc@s1Z$1Nve-!%kVr}Ck4TeWTAca>BnD_7WIufAmYVO9)x?{L;(M+U7 z+<0!|RxF+|1(#>kptq+F?QI>XnN)~k0vY9_XOmvpcM;Ttc;WNzu=-pp=FR`7Hl5$a zz!R0RNu~sHf`{R5KZ4*0Ln8nKNLiqYR*?k5rV9M-PENt0 zV{O>7<5^N0f|ycMg3mm37e00WCyUz4ThAB7^E15fx&d=_*BwH8R&{K4UPYvIJFSmDslZIG%m_W$GRW zuw>y(3=Bo^>ZY9-BGmv%si>NigTaAb6y~I(AS;L7WjB^r=Hv3J8`07=h+n_54Ueqd zgu#wtxU1(NIx64iJ9#eiOje>t=x)TQxyk-I2ghz(W*) zBS(%HU?cFVq;tjJ)v*E{qAJXtJNFFWC?KSB1yXd3xV)lcNS;mil>m!8zEW+F2YBSY zrsL%iR=re*kswt0iRb)}|M-vCzJ0sNTby5UjP<%+Wt=OC<{-YZ$`^N8`HA{5vJQHt ztNz-jw#Wl1FQ^UP&Q-?6ykGWL(ptbXGcyaH``qWuBtu;T>N{N%FTVJq0lKZNErtrA zz$T(g$d0m~`fvB{J#ngqR7&)lZBa1fWmkX6ldR)yibPXWvpLtEEGe(O&&RimJWrrk z*MjOLFlSQn zkY`@b8)HjjS6Z-pI|q^E_M(v7)$mX+!B-nKVW|e-PsCRr{y2QZYF^SF7S680eAiTS zZa4qZuZd)Ww;rzh`B08>QPOQ}-^^yid--EM1LC=eTgy@CU>}CwdJSK?=i~VDO*4>7 zU@45G8&9gWl6mOhP4~_3KZ1sqVI-yJkQNZZU|130q#(=(1J}c0FShR3N9>shUy29G 
z1i`9tG+w3!yDa~yE`U$1x)Lj|TZ#>P8t{{!t;TLDM;vbq;(vYP|KYBWtwedQA3yuW z&vEq~tB~OFx@GR$^Nz50}G!@KgHcnWxskMH)spO#M$D6QU}^aAGQ&Il_PJ`Hkplt7H8G zxaFo5cz`l@DXg@~iG5d|;l@|)yn;WFQ}K7-{SE359zep}#iUrDg4nYF06+jqL_t)@ zaO_aK$au*g+RC`uC9K9GIJgQX`sp)I!b|nt1o3tF>Q$35jcrcl^_0(xr>2m4qoF|z z4RbQ0mp`I*|0^H%p*2t#tIB?8t#&dRWdUcR+$e*b<*NYUv}w}}mr|bBU;N@1hAJUo zDu5!;q3!F|tuq67c>-;&t~`YD>Io2QQqt2;KW*p(0w@B?0xf#YF0QD~Q+^c{6~?Ok9b`mF3Yru2!%{Q5P>soWb z@Mz0TyziF=7#6TFID*_Ef=YeWWvhu?0gmz{+0{!<>Y6upA&(EI+ZRvAl~t3lq`ZWR z3U2B`@Kqe&&S?@^Q#gwHfd~$_hjF-d2#r(@=%Uvz%=3c;qy6-5_E2t1<(YsH=7R-_ zZG3b7-7I*>`bDXv(9?>FURoV6h-^W{w{$L()l zdJF&c)QdQDyv<0f(Mb$C#c}(}+exz!CyuWXnZct^zr>ley;!yKT0DH$wbUg}!cV{d zulV`nuj2L1Ye}Q%!vFohNAT#cvT^tA*W=#Xt|gFX33}~BN`VLx=@f`f*U`1AIc|}u~TDMxBD0t&M0G)JZd7V4@kX8 zp?wyW=ir~dd>j7Y@4k;f$s`PrM;!L}$jqtv-rB zyQvB@S-ziioZD)qqnGU)Vx3$0h9Z*>SK?&aL=z2pXc zwr$(CoryWIZQHhO+n(5%FTZ!+Tle0t*ZIGDomE}6Yu7%zq81(0Iis}9F9aY2a~P{Bi#UX8^%-9?Rjpl-QN!*VA~OEmK5Xy9b0w- z=CnkK;{H;U24n*&0S)&h8(;a&T`krt8cpC&$voE=FmT}_^@ySc`T6}k!Zg&>}+gvn|e0)1h%V`-698=FiX_yy=Yo6xG<2`SZ^x=t0{iYKZQqE&to5q zWp80mCa6T)PSbS7Dk^*T%qjZ@WfJh8SPy3uh{a&;b{WtY(h~;7O+=BZYvX`% z*KLeT0%4Ez=^Bd=351Y7K*WlWg5hu0QOz?*i% ztFk9MeFLp&|8~=HS7qfK{8ZfPN=1kT-Ez4`HZj$aoe+WjxZJqict2(8tD8?Rw*wbw z2V1Owkxk-(BOl6sZe%D;J=xp;up4dF=Lb)XCDM*aS)QnuzQQEKV%1X$4ywf0qtVv> zxE+LhdR@48Qke%Dw^8P1&C137{<~{GB zVI-u=2Xb5}?C)25zg-7ml5H}8YMO$JA&LBn_X@^+C&YI&d|MOy-YRQ*ysR>L{s6WZ zdOsU{$FMF5zRtu)Kzv5T{Nf?Bt-zHU6gI)aZFLQS`I~j~(@r>*6@zt(CD$sbL{dc|=&M(;8tCAPoWw6Mmadyd6kL-~{XkpFmQXx<&=6n_eVFs{C9Lionw|cjAcDq%cw0 zk~_Z~bIS@%F@;w;S(7bIDOFyBDM)EL0$Nl6h|a1mnFW8y3Z(lii6)@bVEJh49c@vQ~Fh`}YXkn1<>;hQL| zxfw5E^N{5A2o3%oA3kq0yTiR8_dbJ)O|?`*_dUokbyiT;4dwfLSxDjeujjCg*9Uc! 
z+goBZuL)cKS{Vaaoi0NNZd<3T=|tA%m2jq$wN)C zbDnWPeS<%bD*zaPfHJcZiO`y&Y8E&~!3;3gglq>H#NOA5LFSUAtL2 z0j5Rv06*v-TB$g0T%`7(vIx;KXllhv!XS=cxb{QygAnmqBKXuG+^DE`;$iH~QGFFf zZJ_;#zcrv6;B*U$%7847=w&lCBh=aK$Q4MF4R&CQ?)q(xt$__Ku~Lsu^MWT&K*DZ^ zV65xyf1|L5fs9d4i!xJ;Kf%Knmf8auZmbML%rNqv15-PA@G7b-s~iG#^H5}N+&$L| zzdjDO8m|Xh5(@TreXnW|i+Nk+4Uj7da|CaDTiRWJ;88zmLV>gtrB!n!yxdOK0|ORbknD!mR{iWWJt}?_flC`G&}3-fnx3xog8P| zSOBE21W8o0VnD7MXD+GHePWf!=rV5<*mc$Q;yFJoFtPIjb6y33sSM?PGKO*xhS%kA zLQ@-Eu(!B6JGs+35YZA6($eXAI)lN2Nene7ohsL+t5<0l^0Y$h)qB6#9iAtz$}<~H zVqPojDdLrSs}pJfi4rOUDL(>$u|4ll7c}t=f%Rmq7OQ+6CBWQl5UY7W^FZCmv|D1b zOy~fM6OBL03m}VSMFR~LXh%;lqv7?h3H2VMGOSZ$_IchVH_<|rQO#U5OF7BCj)H;# z(9gD$3~QOMVZ)SG&8JqNC0BBw@Co(pESC)c$;aZ(L@r&T3)~pZ*4CDhz}7G6iaFuWp)sJcxCHlM>&ut1<5?KL54f19Sle6Xosb_{i zn0VRfx*$w~WaNr(52hbWJX0gb;)o-#qP)5nhN^^5qRWu!2JZXoZ$@SX(vmV)%;2%> z2ce8+-X2uq%t?MP4RAIph|N}n;loUVRCj0*Q7v#F=P2n(c((H#r5@#F?1nI}AT^~!)TmawMRjc@bs;AJLdqH`U_u+7iVc`1UH zW&}t-sJ?zmTR?B59-NX1aDd3$U!BC^eZI~b4tg%g=ypt?CP-lG{BW2?e#wSsC%}y29~m2yfmk~A)`V7 z)~C~d9l)6Jy0EaXDnn=6KB%JT8XUQ-?ak3@Tg`Q??dzsQ(aLi9Mn?o%e#*ZT0--sF z@MdMo>@{34UFj3bal5v*v}8s3B^rb%fM4*Q$TN0;vbxeUEx5`IqBeZ2)`teC@hB#6 z3bRKY!Ij^ixualVl*MQ5r032aq(7|K)-{6dVd+q4Tl>QrpJU_k_~q`0S>KX>cd*oP z{`14XAz|=hr2SJ7VO$LxJU_Vl#c1su%)fB?X4^s`BcPF+;h`+raoSpq= z?u1Cm}z1YK4@O1|d)^uqah; zXK37+s?F&ystMF>LpWuC16R9*I$ba9gQUBM?IO-v;|pU47nW0#FHpA++>sot;rIg9 z8Ru5G?pCUnxcrL)re~2lt!6vGTA6oE2C5~IB7@O&vq5(KguL4|jC zHiN`MGiE%UDX&uxs{po~#Bip#R$UhlNHQ*uNn3GLe;)lW1)PV_cX zIzsvMGe`g(Td#7T=#GwO$Z;|X zu~Z<(Zh;V~b7-qa{4vxDp>)76kg^NJe&i-$O>=>HlNiApSfRi6;Ea8sSx*&Il%|f9F zIO~vz&$c<@^ZZB1*`ZQB`P3x&r0f5ppq&fF$7HI+XXZq#kGOClnIS3K#Tx zGfm;(;h%MYY6WGbmg;$IsEGXY;Q3QLNV~sgUNSFi)Y6sDQU&~kX14h+KTRU(&XGDp zFS`79XaMCqM4g8ntJArq zpHcbVD`+dSMDIyw%x!Pln&Ob59%A` zY=*(48(q=eqpcPMFZ$`sVh^Z_{R6fJG7LFI)oQ$0>DiK5`4_*W_A#(92Tuz zj}E!=l()+XtmPR3X25Q{F$85N4$HV=P@dK#a2FzKZ)w|%Dgnxa#$inQn#8V&uPd9h z1j_kC<132vDt>pcJ$bq;{Ae(nW&tPWRtna=nZ2n>Qb@O|h={UFi9WeUyF<3A8xb|$ z<^t>+9VibUH+obmNR 
z3|h;<@5fI+l@Uk@GIB;VL2^T|>Jy6W?$WIb91lXMRgvQe5fE4INlW(O(VP}`5k?Pv zic9cQ4n?ohjk?ImNxvjUiQp?$%*Ot^E7d2kmQaFwN*;k5O?BNW%c_9gOJzl4#ZHs{paR$FBM#6=4Ybfd^#nfDoGDw^ZJHp zFodVX_((lAtM2W->xBo*AV`hTEt#RC7bhLoYEEMF6{R+*h%hmSgZs(k2v0_YXL@q4K75yK+Rl!bRD+s$5(6Ad)y#=p&fPyCY7XH5Ols3ybD z%9TaVFZHfVVPEPy2ZrvR6P=1hymen}lklclm^;H~FM|9Off zJq-X*02q}L5$6ii*zgyNf)s&FNOxK-OI2yq4*+TCDg){Zb|g02 zTb}*%r6h($fK*qrpQ-Hcg0B&$-lvrpaQ@hCwqPMv`yS-LaKIGB%5;bwo93`ur$C|5 zO%yk8lse<4~6R6x-5X4PG{=rmiMUY*%_RZOE((MYW1)@0`4eb zIhv*AdD&i>o0BV?6Z7I*o+Jx<*JG5cFb+Hh6&1Fldm@BS0{idnePJJei+k)JRnLrm ze}~_d^>2Y=2sw#6DfP7D#w*Z3&B4=Ub3T(_`t!Me`*(c~WB1KBbt4DF<<^)KrjhtW;G8a&5LFdA5Lu4>yhF&W#O#dD)Ms zJ~DgXa9Av57c4SBOIPO)KrWY?+nT(ZY3*tcWpRS!()=~GfaU>_2UizfaR;H~c~q8} zOMi-Zh@p+u)JmY%lyr1t@b+axxrpYp_4uIJw;Y0i2LWXh>YMUedz%LE9iToRZ!N00 z2HLZlC`_wWn#LAf6Pur6_c~;rNyFmkbR=XUGsO+eEE*`q;+~fp*0yj*L&?1t6*QVV zjy+$m#aLGnZ29ByUL?fujmv6@-4Esft-XSu?||x%v8@xRGUu~5LJW=muy&kPO*Nz+ z+|B^MBgZ_beAXRA@;OZgUxG7taO2l^hYW$DN@o!*BorwOPE8(xza4?`I87F`3mo#X zEu0OG!BSCbb3R^fJDC_xV!s@U*=~N7?=h;XwfAS>Md4J@E~}|v6FT7Z;Z4doFl}PL zFumQ+2g0L+okNNj$%;w@5#=_hF;YOZ$Nti_FGxi;pr2?UGa`HIq(PrGcKHPztDHbT zhSL+~vx4swk0Zxc@LY7ztqdKGORLQ5Ws!JpI6FHj&Sr@YW2O_+#BFDQ2p^rGn^fH;BSGnuz+@( zOrhW}DhVW(BjC9{^StpsNWD95exNDf0umq!=l^(x@nzTqq4u+kVN$(D#}ZuGi^EDN zD`t9Z>=)PTGocmf*%B%H_+|Iq2o5+XK_P5rpb2CVw8vy3jVM%roo-YHPpOG-Y{sSH zr)vm;x?YAwsaHjh7P7>;@`AtIQ|H4BH=ORl&xqwn=LUX(EQfi5v_R?1{j+xU zdayTJp6Eb1tl8r?3%zVb?HW@3wRKJ+6NIL6`=LT$U-wUn24ueT%;IrQ9^$c)Thq%s zp_Py?TKTnETUugQ^*8+C0EGdE3g$r9#YykRU!P5y5aRr@P1R>4DrC*z&WEbX=U<1Z2T`PPyk zwB_t)Yw?mxIHI+Q#KCPB2l-JQgr*4NdzQ%COaK$fGAND72O6tTf%bc3TX7^9uu##b z`F1bw{8n61(cvUV8pdAJciu$-xeC(U&!ev@>6~$8drj3A-ZLCc_S1)-DKA?Tl9ueNQUA;h|D)-S*Y_O(&TO1f8GtJa zF;v@KZ#qX`e0f(JeT^~bWk<8JB#Xp^q`00O_*-KUR~;GAy)(o)Cmx|bbzgeZpief+ z5@Z0DrvwY87;=t}6I~;Oib%*i-|th7egEylO$>mrSCEhg{xUvjv#rSBm;PX!`z(o+!eKSMYI 
zMoX2|ICFs<`^u0zfUI5VzcWEDMi=tZ%g3F&cM+1IO+H*n0b}-ZT-aCk@zH)QhYrw49Rgm;}2ywzU8&MicaMkW8hK14}=^6yvri~{AV_5eU2x=8KHw?@gpHEru1eW^By0=php^0;JOoEAmc1}MK;7^y$FTn8fz1dG^$2FcrO zMQ^667`q9=tHPgwky*eX!s1^*?G=ChEEhM4uPs+c;S!cS`y-8OqOmy@;_I`8(9ijmaR~j?2?ixlA}KK0z$+<`l_jS!-J9W2t6`a1}IJ zyLC0eJS$9+({gH*{v36SYuU~i=8&spg7wgvFHHmY!I<}8WWjwiW`&$v-rLr^{N|3N z3ASuLWp~AKz5Z z7%*HI4)iugbjDtDM}apxGH8_zf8|lAV5`i&hcjQ2hk4EohT3WeHvXK6(VfyR&?MVB zo7(FyK%?8bo1WX-vsE|r$u&6?^fweiAIh5?-hz;@?k@8?R;^MoBW@l=)SDTV1|U{=|J{j7HMrsiZpUf za)L9($N6}yQcD=Z^g0-IAymv->VQJg$jzfKiYNy`+KY?y+5%2aSzqLcZfI|T=&6kz@RnBx!+cy)v6gWEs0L2>>#&Swrm=#=Gi+IF-4%b)+nYTY=&Js z--b*0oB^e|zR{o6ut}=Zj0PIpi55bC=LZQO8lDHGu&B*pkZ=sy{8rJyxAz?Ml^j1H zYjXW8dG9AFKiw~c~m+@LnJnL1}b#qY{1P~}F*_aQ+;&O)<=%t>B>#NdMYr+!G zGz_tb?GEm$KiaUSsn-7MR{s;aItnl{xIe~})95WobS=h&{;NB7glj~n;XWUfz#?fV zGvIAEh^mMe9LiKuQURPfry2^1d5V<6lOonFR#NipBwb&#{O`P1yDbE7+LhgZ@rKQ*!*9|_+E9SD1vf*HNeH7{i9R)!0{Qr>sJ}xpjHs&IM z$R>hiqJ2Hu?h{^=By*c7HJg!NaoaWebe`(aKR7_%+GKCnXTy@ zXAM8EB2MGubA!0~{74oSB#n`PevMZI3Dm&_SFilJ;;dYv?4I=X9pqS>3q4)6A4Hk0 z5i`p`$nE%@oW3?{l_m9TU8{^kzXmZGSkcXQ&ILC@e1?l}V#4+AKYd0X3dap^cp5=3 zwy=l;Um0tw_W@Q}p&a*LiT6K$`q%Iue0mQHgDJ|PSk+!U7_8RiDHgJ9yCs-GDF+)` zSIh0sM+AiZb*;9L%SaZQtQ}S1=t4Bp(W~pj7kl?Oc$%!NZEL$HAR>o#U67&qd;k(M*yP z_$^_1Zh~_HM`F7i&NR5A8rCzuv&84m5B98})3Ci(=mu5)SqcEURMZoYj~Mfd?J!lo zJ8~oxrR}l>$8mZ&YHk?Hg$eKss-~f#b6WAUw8iI;vYFbfYdAv%v#6%&Xe}e7v(EIC zsA~dJL;`oD^luNEGo?>YZF89U<(3C{3EQ2_RLFH_j;=;47JFI zq;MFEESXu^6T)NOX~sv8<}~F(cc9(V==%x502tcW zM9uMFS(n&^8oNFbx454 zRNU`9_v{95XbQ`@-k~3NDlJ=@cL(;B3MKIPGIkt%wJ}|!DmKg zD)XJSih}a{^yV|kg5{6=jvw71-mLLsi_QhAH!Pm;)1FF5+wjb)8=#jbC0xy)Z7jWg zQ03$7@A+xoF(m$PFIJN3=DnTR?Qs6D$Lq22nIcj#;*RnZq-d!xq5!Q}Br6o_1Hmqf z8qn|`yAhg=tv&&QjkH}HZaF0zU*5vcC6+2@H2khpGukxX5L6hzy9Q)A&u;Oo&A{8r zw@ZZa+jv8H%_v`m#G|7EMj@PtXuI{W54@9(#IlM69DEUi?MvAggD&PC&DmET1N}HDT~bLg5HcVK9?<6d#ncpvIS}tpY4s~u zloT_dhTKZL5g3qo0LwB14{Q@2Kw~sEJ3WNgsuOLkPD@;eYv7WJe#P$t+PnP<-ND5b z9jwL78j_fhaF_^m&|^fraO;4Yz3q~}0IH`Y*I<%B)wzG4zI64)@~D(@-TcITJNxex 
z0-QrlMa^L53ady3Rk?;f+KG7qK$fc-Kr0tujFUc3KzloR?Nv>1R5YC5Z|I_U)0Wc| zvhDq%jM%**0+*q~g697F%FV0ASs^Z2a8CAy4at6OY$RlUWm^<_RIIC`C5>?9o~3eD z^@#BtH-N~BGRzV|>gdSqgUR`x)aQGD6YG`)b}<4Uzzs83Ev!7#!X{ z)Y#afJJ);vQ9<93oNzI4nHaopSm8^`C=ijEY)v|LbA8!gHK5ayxG!lF6KrdDX>wZo z-+91+mJu91?bTrISB60#kYQS4-NlA?{Yf=mS_d&q`6DB=nMJ+}jqnb#farP=2P9XB zpbS3BM=W7?K}kJSusnL2bkVy&#;$a=FhRdQ`UD4--O7}d&LEWWV1S`ixG3zB`Uh6& z04?-+Rbv=bmAuabrfk1YG${`6+2@t{QKO2~{Q2uuvY$@7N6G$9i@TfRMHT!dR(WY_ zZSJ0q;ufYtkUtB16k)X98K@T%n38c;6FMwB%P*%~Bq;>NQ;jMkev6_953i(p!a%)kG1o&U)U{|DRt6emm=84pGqawRPPp}BBS#bQ|A zE|+(gdtKJMi9im(Z(uJW2=K>^>OM|0v3+)PFM3)(?Rm^EU-fIh$>Qfrh=Q}vgxtD*H}(#d~cAfFR@+O2^GuMtNTDBPaRCyze0 z5*9R5a*iO*fdyXP&c+-euzv8^)Ufa4A1zq5kFtpM)>5m{XSgT0gX14c{yTXxUPvlP zAO|ihTvv9%ok^;F(W+PA0TQVE`eF&QyP#jt$^9JA(lrfOzFRm*r-ncgO7~zQ*%mhq zQ&CC5$lP)*YFH)|b_v0>wZ|x|i4&_yeFS9>3K~1CqOA_`%m8VQ#yZnblkhc6wvz(W z1e_UioW$f1E;NpNvlUaL92=3rfakUZRD0xaxg_03wPA84Ugk>>V{38Wem+=3_9#Q` z*;0q8RwN&Kd@uLO1R=>~9D+*BXp<_m)?9DFUKlna`AvXQOWJB&cT(EPsjf->zcjh0 zu}9p}Cwk{?&69Zl31+;uyA&fRS3#YngwEaxmNSF_hpa<2@h82IK~bIPPZS~4EcBPi zdZ=t>Oa7m%Z88{h67C=se-0@X>0l#E7Ok4=XNu1HBfLcto?mhPD=B=B>TrX( zRdw+jwfs84NYm8cyg8YXEIAg=0)HMsSEBTGK`-_;B;U?xVenW;faUEX5Vtk0t$WKm zIOmpfM(a;>j^1GJvQlRBaLcR(Er)*zJJ)Gp%x11%`Z#w)V%J+_UQmJ|ol_y)d2x6( z;LR$T|8!aXCn8_-1<{A`Yj)0X853z4kgzmb1qC{j)P`;s0mjjou%M2dwdXbqVoOey zxubSAqtP8O^TO^mq$#kuykXG<2uD2#+8Ja7P|@uD@eLDS=bufq zaL8BmVerf4?&u9n6D6?PERX3t_J>#R4l-ID%6hPLeT7FlE6e>0_$eP70;M<8R!Dhj+GTe{FC+}MNANi2w+8l5?-5|IC(J?(t))_$t3-t%t?T;=ZN@j4?9cbH z|JL*LVW+3vYeM+7pxYfdCP*ZvYHS(Iv%B6`I=@;GU&bZp4Sdg1jT;5o1@b5Q2v zkI`>_X)IdyR|g^8BI@Alz-XKG0J$4SpK?|FYA6$}OWA zdK)FrAiyUO3d^#*&8^=BK=dXC7Dbz8SY_;w(2F91`b8OOV33|9IMkac6KwGag!ozomIYIc$m-gR1xZH8|$!5?7mcmherZR zyxegNsNQCT&BuBF5T@$n_{x9ha1M0FpDXICvy%E_z|$S-E@=~Qz|jx>0QsU|otjwU z0-FpCH77A9pWhO z`6*+@#k*c27)Ujg?>+zMS^Lce-hgtiZj|iDvtfThv4mL0T?%{xppdH^)AP}|1~Q;g ztDt8^ALAWUrMSBp&wGb20M6c4P}muzL_Rl?!RJJ9DqGEPYgo6Ro*Bg@$5hqY_O}`z z248vVll*kv0N%YauY@wr8*ks`D`_!+rVHx?n+=1j)jHRbZB@?c|3cvZB5`H_7+{#_ 
zlzCV58(bf^Jo?p_A^%cq^{w=7CV!WGEo8{bNtP@od54fTJrxL`R1_V9B}1s9Odi$B zn$K?e4EyLbTd-wtebl?I^6@+q2dOXNQKZQV87fju&{Qg_jE%Wi>K&@ekus(j1$#8o zWu>jin?~Lz(0N77wu{FlE4w7TG?#d-tw8 zrtm*In*Uu;i~5tFRaPlY*k%74&+qNwUx_{3hjO?QbwHJ2qE3EsZ z|BZe3Gz_Zf92ETxHz6&1NLB7!J#HL6lf7`Sq1g~oh?b>u%G(-0Ya~tYQRUZ65V8M3 zzB-`gtGkDz`hP3@i62XATbIT9@oFPZQ0_{i(Uvjb0SOWo493(bNJO@vtDZi)u^*^! z5d;ksbSmMDkJga#>KnJz3}x$gz%4Ul46F6I!@pYdMy##*4L8g8!Ccer1!kwg46@F7 zP7WD5lia=~7D=lITzxRU$MP|Wb-5@8-{`;^rAnAB=nG_pj$p*k+g@Aci)MRWt>HGT zMB8Np-{AZOJId3ShkwN|qR!`JfdD|SUOytVcc`EMOYixWB+no^{uEX2nFY%LTpN2i z<--55CA<*;O^i;i$R#7NwZt_v9~>qH2imXHr(g>DzS|DWV^D>RBN)MNtPwH-|6-JY z%FUcu9xIcK@OMT^cEA?4Rrjwv8<6H|L)Vy8DV=o>*YE%8EBefM+sZBFun}Y5523ZsNG( z4OE1%-m>(CJLDk73q6s)CC16_6{TJ;%)-e|81F?p1q%@smLMV{s(HWZceKodGOX12 z;f-e{CC~8qwDY~D*}E#@#}HVX2Df>AeuYkt(Tw#yrY9<@vi>mQY4QExc9CkO5zf?` zH6`7CrJ1lkn%ywE@!;*e(fA7gU<75nSruzMf_sJB4R%LmaS=!}%B0aW;2A7NaDJ_}tc#8~p zUXT-5uwpV#9i_-vaOHREwDQ@}s%Sm?%CF|i)9sRq$}oZ+_Zsbr!lukT?riS+t1}f- z{P>GdH{E77{%4N#VE9i1k+#+s=n0LwFC=cy0FP%swA^SFCrDUKx>6ZCipHc766?3` z_ZEt^p)O!iKpr4YgEW9VCN}oudVF-)Dc|LPWkZmniawYsp$dx6(SYW4&Ol*~+}!?Y z)617x54=jE?0-r|mcTBvT9)Zg8_!0S_rla$0;>!uL7|?qM+j^!?hX-yUp<(^`(dHa zp!8))%a50yjx{zhWAw<~({V~TC~Otq4Zchhqe7ESPn5bIX))!FyDDVe!%p z_Blqq5cnvE@;yF&P?(l5y^0jm^z1DC8DTLODWY~6|7Pblj@;i+tyze29BW#%wWT4q> zVTzwzH^!3>p~SR=FbX7qa&tor{*eXe#c2LdIVJT>0f;tXS~SZDQz_^FFM3=46TKrq zfsm1TBN2u{KQ2it{>92PlBncsNDg}i>u+e9u-aMp0@eiju`s?Ow#~szEeGCHda5nR zmtzbEZ=tMM^xQbjDsitXg|&I9S!bfk6h!=k=H$AXuM9B&@%`d?Lis^|L*f-g!Sr|? 
zW1=iGaYhMlCrL?1`r#r?(D^x$is3u6U(N1`<~%mUbFR;iBx2k&l;VGW*!f7f zaMfeY9+Ru^{dQd*Q`W$|Oapobc7zbn`NYV3MKvXVnU!6VfmX?-*<|)4{jhZN@THjk ze!nC1lW|z`e;jVVc3mSk3K*!pO}JgVQhN3g;gyYF&!4#Ot}3%FW zoPF%p7^K^J7{6WX<>;}n+PNom-hwmZ@gzQ)jKFEv(Yd!*>o&V_71~@)xfy$Oit)F2 zsSJPk#UffbSY%gvXCx0Ld~*J8G;^=;9ZfF1dD%KvO!BZad?gsj3(9QM?9`dtnMHDS z!;2%UM%lz0@U!}>o6=`nbBa(2Z%9x8B(e**@$+Pc%LN>=ZgBz5mKYtH5S@@BT~SykiH>E&z|U<{R?zrc94IRY z(JO!@kvL8LVvzJcyEqe^=i{MTyF!Agno<|laYEmlH!TrPIlt-BA zFP0NG7~IQ`usAtglNNQ2ARw~8;lcYcu*OoV)Mm#Yas4pKnR7FedEy;h3aV0Eu-iKg z2{TN*CfQZ6|C?U;Qx8TD&1kbJkRx_yE#KsJ*k0ZITBxl~7$G3^(^mtLWJ9c&cq%3& z`_J?kdL7fs)%$+G*3vDJRyl77V~Hz)`qRs1JW!zu$7xiDC-x;M@Xv+3vvS`aWn5EE z&-PDMD9Kcg2?)x;M>;#lU6;DEMHnda00J1u=h49dinN`7N?)szJFgmi09-mz4tQAKFm?45!nY4A zJ9KsC=B4eJ$OWqG)4P>juuNgOg22|yuX6jZw7C6hYBIk3MmgfhWmMb**a-Nmqytf` zl977GCFO+vkRQb5OINaT$83Q{C`G~Pnm>C`tW}SM<=nWdZK`|XP<2|k{jJGQ3#hYK zNf0hdtzWOr^alUHd)!>Nr-^MFPF@f=g=Ag*uV8il;lu%hLdZ&}V-m9hVruaE|3>46 z=UM=m1A&0FL}%r?CTbXnmL_DG5*(@iPR)LiO7B*X5sO^Cx6SzbUyhMw(e#?=V(U#B zw9z>-elI1&DQfHWhf+?yzdLnyZ-|8EBI}oSp~zr9C^#$OtG2}7TP~|< z`~x?+FXlcW%t8NX_m*LfzeM+jU&q5T5;7&w019=mc6dcy!JlW48NA5wh<#>+OcXf# zhF;y0I10adfNPo|P1=9_NI(Z_C(mPM@-M{TRTt- zSZ<4cO{PmLn0$>|)c%tz{JP$-eSMskR*=ifhtiHX zn!X3)M0Xnb=5>a?y~)&O+9M)F=0K)KVM)XvjYznHGF0;g-GIthk@bqu1$B}jHZtBrs^GqjCZ$Izrk?C@p^o>2b^`V(TTxbnnRY^({`+SslO_s2K6>#PT^-A6UuVmQ<{M+40 zAC_>-cKxiZ?rhn4Nzb|SK#xuBe;$g2a;sW{mO{*vFi{Q%Fd*h55)s`Ep?Cu4#Kj?s zAWbNfk`fPqm?&x-_}6FGqr@JX@*J*vrR=;EKS%%jx7D${%xmjC^6aAi^hHow`Pb>M zrr+tw%s;=U?3<=&%;$cE85e==M;DPQl`#X+mM&G~Gzsu1BAdY1TC>6MsE?)nr0n#I zrhl0_jXu=~-xYsE9o>iHc1K=bu3M|=FXl!Ob%e^GJV!!s3(W$XGpU9gkefiA+0=$n z23a8wptq=Ka{_v=B8M5*X^}%RfrE+ZDrnNGo>!VJt3Y2!KF%Nb8^dAvnHQ=g2h%P8kUI{cC?7ml%)?k z8=?{)TK-7J4$l;Ye0WuZtA|4S6ck2)f-=z-%3V1Tr>p0@sO_yrDUu<`U>13~?^@_W zBGK_cSP0$YgxK43n#Ki07Pk*AKFRhjAzW|PcIX*k!t_28Z9Lg;uUhBw#5DkGbWyp| zSlHp(;ZelmHr2$)VZbxO{yRh7Hc1i@0R6A?94w$6VW`hM2h9Vkc$Z!(r;WVd%^b&r ziqnLx{Oy3hlCQGF_wTlDy0-R5G6CzOx-S4cncvve;Zyw@5W4st!N=h^%4mWbU}M$2 
zKT{_gqQg!aYCsiA94b+gjEOGU27y_%&l*0d#c@h|QRC(Z1Vm&nDI%ycg%mzxg`a=2 z=NCU&vq3sgLRUq*zR7OC26IZkw&u_q4fnhe0aEDZVzVUn^JKV`#?TUS8D8g*SWQ6T zV<0Uf;d_P*(vhCz$5s&OqRQamA&OzP&|@(zcnhw6p^jc+JYXphN4mxSR$I@>6TaO8s)N2%8E3{`5{6*2{sn zQO5MY?)dnPH(BXTnZ-qD29i|+(C+HNal35@uezx{dQ>$me%5cYiV{Spnqk?K5ceE( z5{43&)EL<3x99_9p?xHPNgCm3qJ&_wef7-#U}h2hu@eK;FzMm@#{}&rV(?BocJjmM z9~ZEmZ@tSABdqOkyet8iD2}Q$)Zyrsd@8i?Tb_X+Ch$8hJ+_sV#+E;Ay)ZD+ElHDF zq2C)Ws{S;>oFR2~<$f^5`$G#X;kS<6=#m1k6htxe=qGo&xLGtJzNk?Bm%Lo(aJ0{t zI6Z_Xdde8Su3=XF^M};%nx&x^x(<1H(khD*};kb807Gb{o z`G(xhNuy|B60`7W)|FHaf(Spx}yUPHK+Oi?T8{~Nj@1{jb{#hBz6 zQb^e1r|qbWDX4sWOV(eAK0IRkgAX8#ME{=eh_tmPHxcNQh{9ns-e4Psp^Q3Ym!>Pn zI&sR0j_f*_HGFi1!m=VCM7BzX6XPl30th}saRlM`JHsp#<@%?A>qF^6pt`WW)4wXxe|>2z&pKW>^>yu_;H?8NKQ$ ziEd=eS4$NA?b-Q`3XFvKe@XbmRM84r9 zsjMs)L~VUeR`~h`gWcXbNI{)u9@>&)#-X%C8d^(?ghi80k)2M}OA(~u)YZl5Dp}&V zgxzD~_Ng#_n+|+Ii;zL6D*;$#)PWG{sa|!q4jv5Z5a1uy%-|GK7CNpr?B6WoOno&}BR? zzEN&bYBVWoVuk7+`0atl_fo*m{2{~t>-5nhT5vPTkS9?k1t)R&W-CgUzF(PR*lVGq zwBmQa#J!!t%O7?E5!` zvHj5L67c`Ecb#8NHQ$;lAPS*_^p12?ii8e|uXJglhYms@v`_*<=tY_csB{oTkluSI z3eu#7P^5?`ArN{5!5iN9_j*6v`xo4Gv(CzgS#xIgdGrk~uR^Jv&qGVB zL2Q_X2GHk69OZqAx=AjKYs@}znx=Th{#RBy(0$G)Xeq~MoRZ}{qU8L>Cid85urF8t z%nWXnX+^{X8KI?rp=2b(=Dq2bw=DydRqHBV2#rC>TEm~G$ z5{NI~3zboh-dIzXn59R^Sz9c^mbCIsuC*z|WYIoiUnAXzR+vzQ;M=*tIRBy2H{tNQx8Gwt2PMTnG=ykSGLRgx z*(tPOFdDY&!kgWn3_TToBv_;MZu(BDlsm=7YwCidemA-5ZOjCe0anItB`%%PHLWNc zqs-5C)21*pz}#PIalDrEhL&it_*tWo-qW1=RBpD~@jMlG?!hhJlJ7Q8>2`2A1ByKx zdk#J|KnJ^vul>+T}mo1Vki2V~XLc4Z6`13w5I$bpoR>)I05^vqkl^WvEIUMzy+sJ%ga#&yLHk3p* zaKrAJ2H<1 zrdLw)zpeA?4sL_Ketl_IpPG^ke!|n5RfB&p-3bQYbIA_!t_wAf3w%)U$I-SwD8vpx z>)oJc8kWGK$ygjPls=yM`rx^Qa&AEXQUe{Sy24x$6IC%GBl`C;Y4~zOxcC?zE^jxgqft%~*NMu~xRXprf zTd6mi?q5T}8&WmovAXOQ@!m+d@l-mC<3>W@3D}9otn8Dw`E01K`-^xR@+2UG-U65_K5fTUK6Z;>#B4mE#-J&=_ z5}iDh*qqBYcK#mx`!eZq1%7>C;EqJE8=?N!Ic!2AuJh>RocX8IuduMBf4O+kh6Hjm zHi59zzdU}J#s~+w|gZo zwsNR?Pt9}E$mWH8$$3(H5fn@g32;J%?1Y&g?Xc{aBzWWo%4>lqlkF3}vMzckGRR$s10dq#% 
zk4%gy<#QsGn>n*qO7qx$x*kdV=TnmpDo@o=$+{e^kx4pF?8ns%5bIUsaa)VGtjW4* zL5ws{bdI(WTUdPD`v`FJ+50?3q1hBW z-s9_pO1Ir!#zXS@KK;K^2}$j|SRzJRucd)<$)Ho&Lmb2E^JUX(`yT zdwzT@eS=G}_N*H}1y7G4v%4L4Uu`+jC`w5e-)D53Cl1U}$U5B_P>z0=DfyJU7aGaS z{~Vgg$DFL$a`%)XZu?VdPE$+DJv)7_hV{5u>;Nh)`3U#WyXU(%Hhax-koK2sEJgdn zKU@LV26M;T_eHgpo!!EJPB8mY0uZzd-cr}$`#QH~Is0UP#19q}`iTn&1FTl3F{}y2 zJf+jE@+(~d4BqiiKU%3y?3We!cEil8rt}ZJc9UK0x8<$TVYp2h|aB8i>0rry1R&72|)VFp7N@Hm=exu6u zd3l1r=Q_y)= zD5fxsA0&%KIJt}GLBMQ>rBywhynR#|DA=_$mLQ2uRT%mR@!PhV+6PiM8 zR-bC|JVVoZ)$6&s#rQ@BaMhes`>tSvSdwpsVGQ9+RPsi{a}M)7u|hk>5AH2_PCB0O zk5_VaZBwA1{_-gc%$m#nb;iJ^WWTK+(%Q5M>GEoBI?pax zDmOU#_k-7#K0cJxU5XvrjDeY{moZ$e(jU$W!r-M=9e3}$#2>`+4?Z5)w&=7Htp%8_`J?GuyI8&2J>+K zM(8uU4xgzko;HeFlfBk=~*JTw$y*yxmm31#X1Z28O>ZzwOS9S|iRe2nL-Vl@fJVzbASf(7l*CtU|83wLt z$*<3@(zi^JqDK9mKeB*3`VL(=Yg^LkADMIYydLm+GG9lB{n+o&0~|C_QJIc zdg>lO#RmFi*C`W41egmQ;dkW7xd{jvCDNBvDm4pVhTuhQfLrM2I(G&_5-12`^PqVnHw z^6IN&xE9Z1%g8TT8OKZKwy}x6sJ-URE`EYh)lFI+-v@}{E5nE)-(AtM4^)IWa>J(! z-{u~9UriTiYikD^J_WtAZ20jqxT239q^NK$$$3U{`9++mv^#0%IV`|sG|OxjQYMzL z5nBNZcuZ%%DB2YC+9dslu7d?>@ks&EauKT{=OGgla_^D+_(u0 z-nYd>%e=fmPld1UT&Ab3-Yeq9;ccQMRyoC1K}g6lqvf zd_qk2o`xY8<@{c{{dP81vC1auL27TgvP^OZU~Qe^N?t^lTxaC>t4x%p_0)4!2)*hh z;9}J2ZCKco+2I~-3G;$h7Gt1q3h#RiJFsyhRs@$DZQ*xQ;ifT+Mvq?YU)h*T2L`k8R-2lrcMz8|nEax97#5)a};z zQ&>~4#6L7JCll{^hQtSR2d@K~I}MXxYV@`O1Z3t@oi`0B&duTLd1kQ+cEg+RNxisj zpsgUWo+xIsEq{vrz8NqlBwp=PDUauzFSIFQLzbB3ivGn&KU0(-tSE)!wEe|LwDkK5 z&KB6sg236GJEysEH8$EN(Raj$7VSQMv*Dzgwm^(%Ff%XEfDr9aPKm~f8s=30Qo1gh z45ov7n>*>y&F@VSEXicnwMc)M^IK6zCXo$QL@3G| zYRmWikQu+FfaG0i$7!3VagZywOk$lmxBL9Vd#Mrjq2CcUQP8(F8c3vie6vKF{;*7-!ZB2ywZ3fKq^)oXe*l z_kCVRb}%A6CGD<(9+U*4U}gcK<~A>_l2*9%TMa^l;D`Z`q{k+k*Xq>G`y84jY%vIm zd6wn`T1Kpv;6dFN>#7$C>wOn8+^QYE57;j{TjJ0CHgP|ydhm)n;!D6@F`KX>+Gfnq zNK9Tos$PZTXk?XnVV!7h1g!kNNA=UkP?dU#qn6J{yd>N-2(t_A_5L444+()A<&8^x z_P!j2y)3ie3K_x_FgP#E2nX@J>O-wWa_Z*$AKWVq{L~JZ>v{E~6q4k{!!B%KbRjdd z_<2HUg(f6B=&d#3`~u>w3Zs0W_T8q&${lG{>)U|+3fn8xt8W%crH6hFsZjfT#PnAp 
z&kt;Cm3GZ=>-NV{f~}`Dse1XY`&+oyAY@bc&|*@bi|uYlpk7I!NZLZO=r6@ziO{i) znA7O0fXMl3WT#b2sI7YgtGsZtxi@L5_gze$r_;&(XWJdw2}22|SL*ANpD_J`ogF!P z_BQePX#MqX>bB5^YTHUC6ceGAdzYy`*u}5dupZ?o^dSFDN6t^VYRK~A=a^6Nqpls0 z!2HDFDc>FN`YWq?`Nk=a0pP~c54=9I$-GuM)oRe&>J2uyDR0^51XdN?td#7hm-{P` zBjJtAUTSx@_cArmBvSn(23jIfRv3JCKJ&KXPkzZVQ!Qcu_YDxV)+BjHo>m?*V>-u~ zSRKfuCxX#WgwNHzvOKl-+;OpKo?2dVA0I2J1qOVfA>a5DS);C#0AC_uCAKrX|HfZ` z=ltRNY7Rl;acbB9$*BJGj}jxH?}y*BJxV0RraeFa*n}lE)`XrAr6bPOBF5|QH@#QuKADKX zb}LGf@YQtxZ(Xj~Yrbsq1fNx>5MttODU#6qz1G?CzZ_ajweMpIvr~#ANQ)ZudWV1eOp%NtV?kgK0`?Vn$t5tP?!-SrSo!YiL#D<7cd1GwDX{ z1UyM}R(LUu2aOv&$99}7#B?Y-GQ85dU!`fr0JI|IDO!;5JRp+P{`Q>?iJKjk;9a5s zYqDcn;#kx5gQw$-W_tAfm5R3>6IfmiOeG-UaAMVcKJ7v}qW1S-++=u;|Ix3IN{Q}9 zy(^vZKXRjA5*0Xx<~RBhgxZ)iTKGq(DOgW_h2E8>-&-vN%rAI)HdzZ?D{}>6t%Cih z`hzjgNQ}%*qhY2WUYHLFn2q@h_5YCi9ojhhh zxD<{NkZh15R_VrbcpMRM+TwqG{0~mqSlf0dv13se2QflJyc&>h#V=`GN(E`D{MWVS zQ_6Q?1pE*}xWNgkGZ@yPkge>1(16vB7?)7#M)MN-^ozVpzU(823kX{1v`;$j7$wg_ zT&j1%mJHvqMi{Od_JF=0EB)|s31Nmps!-X(h;tk4sM>AA%}ep)hziU@y&820KxC&s z0uq`?bRtv&eUi}fSPd{+*`*4x>Jj5NRjSdw<|tuNneB{;=D%548lg(7?C_<$v*@G0 zgwp-TuPDj#h=fP??hbC60tFLRVK*ciP5(AL*0%@a8mVZzY1`Vw8Ilq8OFg;ik%UU_ z{T$ddsX#I<>l-=*0qL-pi5Rscdb~2w<*ml~@!Q61%Z8Z8S%gMIDqQQpCPRTSHcKFO z0WvUcOVEcJ177DW-nr80OGJP4v%iHRAy!GnTdy6w<0KdpFy$3moM23DB)Y{7dqBOZg=kAj|}N!XvskPg?@w9FFJF;bPmC6*SKYW`7n>)cL57022|#>b6v79_GQC8haEYH<3M$Qw`j zKjGZ5tg>r)Wjfco8Q3ln1|a14p}n)))gIjq9hq1I4chq(xkVRcLqCF+Ez^4rg;lzu^c!YplF|J8J=^bHp)a$965=9 z0v?7fzO=WQ2~tSoz{Ft9#kclAgnd4!k}2^Ez!Gx)cj*XJt~4~1#8?3&pQos znhx+k51zCbEOKwvGt5G?#goH{GNA9j_n7Hw1Q#rmZka);3~^pOjObz}-r#yUL}yZv z5VZH&?_ivk+b0zW#c`ei4xZdLfP>S04!&IULdB8Qnbl@{_AU>M7iRC#u9-kc;laNS zK>iH4a(tXIff&KR0hIhScZGXO_xgbGl?2A*>`%gJLQ6t8w4Vi$(-63jX6r-6m&*tK z_pT0AXe0i-M(e-AmtZ;FvcA*!zHROzrXJ3YgK80BmWD4MQ(-w diff --git a/snowflake_dask/item.yaml b/snowflake_dask/item.yaml deleted file mode 100644 index c12d3aba0..000000000 --- a/snowflake_dask/item.yaml +++ /dev/null @@ -1,25 +0,0 @@ -apiVersion: v1 -categories: -- data-preparation 
-description: Snowflake Dask - Ingest snowflake data in parallel with Dask cluster -doc: '' -example: snowflake-dask-mlrun.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: xingsheng - framework: dask -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.4.1 -name: snowflake_dask -platformVersion: 3.5.0 -spec: - filename: snowflake_dask.py - handler: load_results - image: mlrun/mlrun - kind: job - requirements: [] -url: '' -version: 1.1.0 diff --git a/snowflake_dask/requirements.txt b/snowflake_dask/requirements.txt deleted file mode 100644 index 0bca2c92f..000000000 --- a/snowflake_dask/requirements.txt +++ /dev/null @@ -1,2 +0,0 @@ -bokeh -snowflake-connector-python[pandas] diff --git a/snowflake_dask/snowflake-dask-mlrun.ipynb b/snowflake_dask/snowflake-dask-mlrun.ipynb deleted file mode 100644 index 03936f2a5..000000000 --- a/snowflake_dask/snowflake-dask-mlrun.ipynb +++ /dev/null @@ -1,437 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# This notebook is to create a function to ingest data from snowflake with a Dask cluster" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The dask frameworks enables users to parallelize their python code and run it as a distributed process on Iguazio cluster and dramatically accelerate their performance.
\n", - "In this notebook we'll create an mlrun function running as a dask client to ingest data from snowflake.
\n", - "It also demonstrates how to run parallelize query against snowflake using Dask Delayed option to query a large data set from snowflake.
\n", - "The function will be published on the function marketplace.
\n", - "For more information on dask over kubernetes: https://kubernetes.dask.org/en/latest/" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Set up the enviroment" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import mlrun\n", - "import os\n", - "import warnings\n", - "import yaml\n", - "\n", - "project_name = \"snowflake-dask\"\n", - "dask_cluster_name=\"snowflake-dask-cluster\"\n", - "artifact_path = mlrun.set_environment(project=project_name,\n", - " artifact_path = os.path.join(os.path.abspath('/v3io/projects/'), project_name))\n", - "\n", - "warnings.filterwarnings(\"ignore\")\n", - "\n", - "print(f'artifact_path = {artifact_path}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Load snowflake configuration from config file. \n", - "This is for demo purpose, in the real production code, you would need to put the snowflake connection info into secrets use the secrets in the running pod to connect to snowflake" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load connection info\n", - "with open(\".config.yaml\") as f:\n", - " connection_info = yaml.safe_load(f)\n", - "\n", - "# verify the config\n", - "print(connection_info['account'])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Create a python function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This function querys data from snowflake using snowflake python connector for parallel processing of the query results.
\n", - "With snoeflake python connector, when you execute a query, the cursor will return the result batches.
\n", - "Using Dask Delayed it will return and process results set in parallel.
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### write the function to a py file" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%%writefile snowflake_dask.py\n", - "\"\"\"Snowflake Dask - Ingest Snowflake data with Dask\"\"\"\n", - "import warnings\n", - "import mlrun\n", - "from mlrun.execution import MLClientCtx\n", - "import snowflake.connector as snow\n", - "from dask.distributed import Client\n", - "from dask.dataframe import from_delayed\n", - "from dask import delayed\n", - "from dask import dataframe as dd\n", - "from cryptography.hazmat.backends import default_backend\n", - "from cryptography.hazmat.primitives import serialization\n", - "\n", - "warnings.filterwarnings(\"ignore\")\n", - "\n", - "@delayed\n", - "def load(batch):\n", - "\n", - " \"\"\"A delayed load one batch.\"\"\"\n", - "\n", - " try:\n", - " print(\"BATCHING\")\n", - " df_ = batch.to_pandas()\n", - " return df_\n", - " except Exception as e:\n", - " print(f\"Failed on {batch} for {e}\")\n", - " raise\n", - "\n", - "def load_results(context: MLClientCtx,\n", - " dask_client: str,\n", - " connection_info: str,\n", - " query: str,\n", - " parquet_out_dir = None,\n", - " publish_name = None\n", - " ) -> None:\n", - "\n", - " \"\"\"Snowflake Dask - Ingest Snowflake data with Dask\n", - "\n", - " :param context: the function context\n", - " :param dask_client: dask cluster function name\n", - " :param connection_info: Snowflake database connection info (this will be in a secret later)\n", - " :param query: query to for Snowflake\n", - " :param parquet_out_dir: directory path for the output parquet files\n", - " (default None, not write out)\n", - " :param publish_name: name of the dask dataframe to publish to the dask cluster\n", - " (default None, not publish)\n", - "\n", - " \"\"\"\n", - " context = mlrun.get_or_create_ctx('snawflake-dask-cluster')\n", - " sf_password = 
context.get_secret('sfPassword')\n", - " pk_path = context.get_secret('pkPath')\n", - " pk_password = context.get_secret('pkPassword')\n", - "\n", - " if pk_path and pk_password:\n", - " with open(pk_path, \"rb\") as key:\n", - " p_key= serialization.load_pem_private_key(\n", - " key.read(),\n", - " password=str(pk_password).encode(),\n", - " backend=default_backend()\n", - " )\n", - " pkb = p_key.private_bytes(\n", - " encoding=serialization.Encoding.DER,\n", - " format=serialization.PrivateFormat.PKCS8\n", - " ,encryption_algorithm=serialization.NoEncryption()\n", - " )\n", - " connection_info.pop('password', 'No password found')\n", - " connection_info['private_key'] = pkb\n", - " elif sf_password:\n", - " connection_info['password'] = sf_password\n", - " else:\n", - " raise Exception(\"\\nPlease set up the secret for Snowflake in your project!\\n\")\n", - "\n", - " # setup dask client from the MLRun dask cluster function\n", - " if dask_client:\n", - " client = mlrun.import_function(dask_client).client\n", - " context.logger.info(f'Existing dask client === >>> {client}\\n')\n", - " else:\n", - " client = Client()\n", - " context.logger.info(f'\\nNewly created dask client === >>> {client}\\n')\n", - "\n", - " conn = snow.connect(**connection_info)\n", - " cur = conn.cursor()\n", - " cur.execute(query)\n", - " batches = cur.get_result_batches()\n", - " context.logger.info(f'batches len === {len(batches)}\\n')\n", - "\n", - " dfs = []\n", - " for batch in batches:\n", - " if batch.rowcount > 0:\n", - " df = load(batch)\n", - " dfs.append(df)\n", - " ddf = from_delayed(dfs)\n", - "\n", - " # materialize the query results set for some sample compute\n", - "\n", - " ddf_describe = ddf.describe().compute()\n", - "\n", - " context.logger.info(f'query === >>> {query}\\n')\n", - " context.logger.info(f'ddf === >>> {ddf}\\n')\n", - " context.log_result('number of rows', len(ddf.index))\n", - " context.log_dataset(\"ddf_describe\", df=ddf_describe)\n", - "\n", - " if 
publish_name:\n", - " context.log_result('data_set_name', publish_name)\n", - " if not client.list_datasets():\n", - " ddf.persist(name = publish_name)\n", - " client.publish_dataset(publish_name=ddf)\n", - "\n", - " if parquet_out_dir:\n", - " dd.to_parquet(df=ddf, path=parquet_out_dir)\n", - " context.log_result('parquet directory', parquet_out_dir)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Convert the code to MLRun function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use code_to_function to convert the code to MLRun
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "fn = mlrun.code_to_function(name=\"snowflake-dask\", \n", - " kind='job', \n", - " filename='snowflake_dask.py',\n", - " image='mlrun/mlrun',\n", - " requirements='requirements.txt',\n", - " handler=\"load_results\", \n", - " description=\"Snowflake Dask - Ingest snowflake data in parallel with Dask cluster\",\n", - " categories=[\"data-prep\"],\n", - " labels={\"author\": \"xingsheng\"}\n", - " )\n", - "fn.apply(mlrun.platforms.auto_mount())\n", - "fn.deploy()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### export function to local `function.yaml` file for testing\n", - "in the real usage, we will import a function from hub" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fn.export('function.yaml')\n", - "# print(fn.to_yaml())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### import a function from local `function.yaml' for testing (Need to change it to import from hub before PR)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fn = mlrun.import_function(\"./function.yaml\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# fn = mlrun.import_function(\"hub://snowflake_dask\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fn.apply(mlrun.platforms.auto_mount()) # this is a very important line" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### create a dask cluster and specify the configuration for the dask process (e.g. 
replicas, memory etc)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# function URI is db:///\n", - "dask_uri = f'db://{project_name}/{dask_cluster_name}'\n", - "dask_uri" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "dsf = mlrun.new_function(name=dask_cluster_name, \n", - " kind='dask', \n", - " image='mlrun/mlrun',\n", - " requirements=[\"bokeh\", \"snowflake-connector-python[pandas]\"]\n", - " )\n", - "dsf.apply(mlrun.mount_v3io())\n", - "dsf.spec.remote = True\n", - "dsf.spec.min_replicas = 1\n", - "dsf.spec.max_replicas = 10\n", - "dsf.spec.service_type = \"NodePort\"\n", - "dsf.with_requests(mem='4G', cpu='2')\n", - "# dsf.spec.node_port=30088\n", - "# dsf.spec.scheduler_timeout = \"5 days\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "dsf.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "client = dsf.client" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Run the function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When running the function you would see a remote dashboard link as part of the result. 
click on this link takes you to the dask monitoring dashboard" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "p = 'my-local-test'\n", - "parquet_path = f\"/v3io/bigdata/pq_from_sf_dask/{p}\"\n", - "\n", - "fn.run(handler = 'load_results',\n", - " params={\"dask_client\": dask_uri, \n", - " \"connection_info\": connection_info, \n", - " \"query\": \"SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER\",\n", - " \"parquet_out_dir\": parquet_path,\n", - " \"publish_name\": \"customer\",\n", - " }\n", - " )" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "client.close()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Track the progress in the UI" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Users can view the progress and detailed information in the mlrun UI by clicking on the uid above.
\n", - "Also, to track the dask progress in the dask UI click on the \"dashboard link\" above the \"client\" section" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python [conda env:root] *", - "language": "python", - "name": "conda-root-py" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/snowflake_dask/snowflake_dask.py b/snowflake_dask/snowflake_dask.py deleted file mode 100644 index 8846e821d..000000000 --- a/snowflake_dask/snowflake_dask.py +++ /dev/null @@ -1,125 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -"""Snowflake Dask - Ingest Snaowflake data with Dask""" - -import warnings -import mlrun -from mlrun.execution import MLClientCtx -import snowflake.connector as snow -from dask.distributed import Client -from dask.dataframe import from_delayed -from dask import delayed -from dask import dataframe as dd -from cryptography.hazmat.backends import default_backend -from cryptography.hazmat.primitives import serialization - -warnings.filterwarnings("ignore") - -@delayed -def load(batch): - - """A delayed load one batch.""" - - try: - print("BATCHING") - df_ = batch.to_pandas() - return df_ - except Exception as e: - print(f"Failed on {batch} for {e}") - raise - -def load_results(context: MLClientCtx, - dask_client: str, - connection_info: str, - query: str, - parquet_out_dir = None, - publish_name = None - ) -> None: - - """Snowflake Dask - Ingest Snowflake data with Dask - - :param context: the function context - :param dask_client: dask cluster function name - :param connection_info: Snowflake database connection info (this will be in a secret later) - :param query: query to for Snowflake - :param parquet_out_dir: directory path for the output parquet files - (default None, not write out) - :param publish_name: name of the dask dataframe to publish to the dask cluster - (default None, not publish) - - """ - context = mlrun.get_or_create_ctx('snawflake-dask-cluster') - sf_password = context.get_secret('sfPassword') - pk_path = context.get_secret('pkPath') - pk_password = context.get_secret('pkPassword') - - if pk_path and pk_password: - with open(pk_path, "rb") as key: - p_key= serialization.load_pem_private_key( - key.read(), - password=str(pk_password).encode(), - backend=default_backend() - ) - pkb = p_key.private_bytes( - encoding=serialization.Encoding.DER, - format=serialization.PrivateFormat.PKCS8 - ,encryption_algorithm=serialization.NoEncryption() - ) - connection_info.pop('password', 'No password found') - connection_info['private_key'] = pkb - elif 
sf_password: - connection_info['password'] = sf_password - else: - raise Exception("\nPlease set up the secret for Snowflake in your project!\n") - - # setup dask client from the MLRun dask cluster function - if dask_client: - client = mlrun.import_function(dask_client).client - context.logger.info(f'Existing dask client === >>> {client}\n') - else: - client = Client() - context.logger.info(f'\nNewly created dask client === >>> {client}\n') - - conn = snow.connect(**connection_info) - cur = conn.cursor() - cur.execute(query) - batches = cur.get_result_batches() - context.logger.info(f'batches len === {len(batches)}\n') - - dfs = [] - for batch in batches: - if batch.rowcount > 0: - df = load(batch) - dfs.append(df) - ddf = from_delayed(dfs) - - # materialize the query results set for some sample compute - - ddf_describe = ddf.describe().compute() - - context.logger.info(f'query === >>> {query}\n') - context.logger.info(f'ddf === >>> {ddf}\n') - context.log_result('number of rows', len(ddf.index)) - context.log_dataset("ddf_describe", df=ddf_describe) - - if publish_name: - context.log_result('data_set_name', publish_name) - if not client.list_datasets(): - ddf.persist(name = publish_name) - client.publish_dataset(publish_name=ddf) - - if parquet_out_dir: - dd.to_parquet(df=ddf, path=parquet_out_dir) - context.log_result('parquet directory', parquet_out_dir) diff --git a/snowflake_dask/test_snowflake_dask.py b/snowflake_dask/test_snowflake_dask.py deleted file mode 100644 index fc2d4c93a..000000000 --- a/snowflake_dask/test_snowflake_dask.py +++ /dev/null @@ -1,24 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -"""Snowflake Dask unit test""" -from mlrun import import_function - -def test_snowflake_dask(): - """An unit test""" - fn_to_test = import_function("function.yaml") - - # a fake assert to pass the unit test - if fn_to_test.to_yaml().__contains__('job'): - assert True diff --git a/sql_to_file/function.yaml b/sql_to_file/function.yaml deleted file mode 100644 index 10b332a58..000000000 --- a/sql_to_file/function.yaml +++ /dev/null @@ -1,47 +0,0 @@ -kind: job -metadata: - name: sql-to-file - tag: '' - hash: 61f616fe697994e05cf018f2ee94c4ea25ed8863 - project: '' - labels: - author: adih - categories: - - data-preparation -spec: - command: '' - args: [] - image: mlrun/mlrun - env: [] - default_handler: sql_to_file - entry_points: - sql_to_file: - name: sql_to_file - doc: SQL Ingest - Ingest data using SQL query - parameters: - - name: context - type: MLClientCtx - doc: the function context - default: '' - - name: sql_query - type: str - doc: the sql query used to retrieve the data - default: '' - - name: database_url - type: str - doc: database connection URL - default: '' - - name: file_ext - type: str - doc: ("parquet") format for result file - default: parquet - outputs: - - default: '' - lineno: 9 - description: SQL To File - Ingest data using SQL query - build: - functionSourceCode: 
IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHBhbmRhcyBhcyBwZAppbXBvcnQgcHloaXZlCmZyb20gc3FsYWxjaGVteS5lbmdpbmUgaW1wb3J0IGNyZWF0ZV9lbmdpbmUKZnJvbSBtbHJ1bi5leGVjdXRpb24gaW1wb3J0IE1MQ2xpZW50Q3R4CgoKZGVmIHNxbF90b19maWxlKAogICAgY29udGV4dDogTUxDbGllbnRDdHgsCiAgICBzcWxfcXVlcnk6IHN0ciwKICAgIGRhdGFiYXNlX3VybDogc3RyLAogICAgZmlsZV9leHQ6IHN0ciA9ICJwYXJxdWV0IiwKKSAtPiBOb25lOgogICAgIiIiU1FMIEluZ2VzdCAtIEluZ2VzdCBkYXRhIHVzaW5nIFNRTCBxdWVyeQoKICAgIDpwYXJhbSBjb250ZXh0OiAgICAgICAgICAgdGhlIGZ1bmN0aW9uIGNvbnRleHQKICAgIDpwYXJhbSBzcWxfcXVlcnk6ICAgICAgICAgdGhlIHNxbCBxdWVyeSB1c2VkIHRvIHJldHJpZXZlIHRoZSBkYXRhCiAgICA6cGFyYW0gZGF0YWJhc2VfdXJsOiAgICAgIGRhdGFiYXNlIGNvbm5lY3Rpb24gVVJMCiAgICA6cGFyYW0gZmlsZV9leHQ6ICAgICAgICAgICgicGFycXVldCIpIGZvcm1hdCBmb3IgcmVzdWx0IGZpbGUKICAgICIiIgoKICAgIGVuZ2luZSA9IGNyZWF0ZV9lbmdpbmUoZGF0YWJhc2VfdXJsKQogICAgZGYgPSBwZC5yZWFkX3NxbChzcWxfcXVlcnksIGVuZ2luZSkKCiAgICBjb250ZXh0LmxvZ19kYXRhc2V0KAogICAgICAgICJxdWVyeSByZXN1bHQiLAogICAgICAgIGRmPWRmLAogICAgICAgIGZvcm1hdD1maWxlX2V4dCwKICAgICAgICBhcnRpZmFjdF9wYXRoPWNvbnRleHQuYXJ0aWZhY3Rfc3VicGF0aCgiZGF0YSIpLAogICAgKQo= - commands: [] - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/sql_to_file/sql_to_file.py - affinity: null -verbose: false diff --git a/sql_to_file/item.yaml b/sql_to_file/item.yaml deleted file mode 100644 index 2f6ae4c53..000000000 --- a/sql_to_file/item.yaml +++ /dev/null @@ -1,24 +0,0 @@ -apiVersion: v1 -categories: -- data-preparation -description: SQL To File - Ingest data using SQL query -doc: '' -example: sql_to_file.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: adih -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: sql-to-file -platformVersion: 3.5.0 -spec: - filename: sql_to_file.py - handler: sql_to_file - image: mlrun/mlrun - kind: job - requirements: [] -url: '' -version: 1.1.0 diff --git a/sql_to_file/requirements.txt 
b/sql_to_file/requirements.txt deleted file mode 100644 index 822eabb88..000000000 --- a/sql_to_file/requirements.txt +++ /dev/null @@ -1,2 +0,0 @@ -pyhive -pymysql \ No newline at end of file diff --git a/sql_to_file/sql_to_file.ipynb b/sql_to_file/sql_to_file.ipynb deleted file mode 100644 index d4a084adb..000000000 --- a/sql_to_file/sql_to_file.ipynb +++ /dev/null @@ -1,1567 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# SQL Ingest - Ingest data using SQL query " - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "ename": "ModuleNotFoundError", - "evalue": "No module named 'nuclio'", - "output_type": "error", - "traceback": [ - "\u001B[1;31m---------------------------------------------------------------------------\u001B[0m", - "\u001B[1;31mModuleNotFoundError\u001B[0m Traceback (most recent call last)", - "\u001B[1;32m\u001B[0m in \u001B[0;36m\u001B[1;34m\u001B[0m\n\u001B[0;32m 1\u001B[0m \u001B[1;31m# nuclio: ignore\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[1;32m----> 2\u001B[1;33m \u001B[1;32mimport\u001B[0m \u001B[0mnuclio\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0m\u001B[0;32m 3\u001B[0m \u001B[1;33m\u001B[0m\u001B[0m\n", - "\u001B[1;31mModuleNotFoundError\u001B[0m: No module named 'nuclio'" - ] - } - ], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'job'\n", - "%nuclio: setting spec.image to 'mlrun/mlrun'\n" - ] - } - ], - "source": [ - "%nuclio config kind = \"job\"\n", - "%nuclio config spec.build.baseImage = \"mlrun/mlrun\"" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "%%nuclio cmd -c\n", - "pip install --no-cache-dir 
git+https://github.com/v3io/PyHive.git@v0.6.999 \n", - "pip install sqlalchemy==1.3.11\n", - "pip install PyMySQL==0.9.3" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import pyhive\n", - "from sqlalchemy.engine import create_engine\n", - "from mlrun.execution import MLClientCtx\n", - "\n", - "\n", - "def sql_to_file(\n", - " context: MLClientCtx,\n", - " sql_query: str,\n", - " database_url: str,\n", - " file_ext: str = \"parquet\",\n", - ") -> None:\n", - " \"\"\"SQL Ingest - Ingest data using SQL query\n", - "\n", - " :param context: the function context\n", - " :param sql_query: the sql query used to retrieve the data\n", - " :param database_url: database connection URL\n", - " :param file_ext: (\"parquet\") format for result file\n", - "\n", - "\"\"\"\n", - "\n", - " engine = create_engine(database_url)\n", - " df = pd.read_sql(sql_query, engine)\n", - "\n", - " context.log_dataset('query result',\n", - " df=df,\n", - " format=file_ext,\n", - " artifact_path=context.artifact_subpath('data'))\n" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### mlconfig" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "ename": "KeyError", - "evalue": "'HOME'", - "output_type": "error", - "traceback": [ - "\u001B[1;31m---------------------------------------------------------------------------\u001B[0m", - "\u001B[1;31mKeyError\u001B[0m Traceback (most recent call last)", - "\u001B[1;32m\u001B[0m in \u001B[0;36m\u001B[1;34m\u001B[0m\n\u001B[0;32m 2\u001B[0m \u001B[1;32mimport\u001B[0m \u001B[0mos\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0;32m 3\u001B[0m \u001B[0mmlconf\u001B[0m\u001B[1;33m.\u001B[0m\u001B[0mdbpath\u001B[0m \u001B[1;33m=\u001B[0m 
\u001B[0mmlconf\u001B[0m\u001B[1;33m.\u001B[0m\u001B[0mdbpath\u001B[0m \u001B[1;32mor\u001B[0m \u001B[1;34m'http://mlrun-api:8080'\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[1;32m----> 4\u001B[1;33m \u001B[0mmlconf\u001B[0m\u001B[1;33m.\u001B[0m\u001B[0martifact_path\u001B[0m \u001B[1;33m=\u001B[0m \u001B[0mmlconf\u001B[0m\u001B[1;33m.\u001B[0m\u001B[0martifact_path\u001B[0m \u001B[1;32mor\u001B[0m \u001B[1;34mf'{os.environ[\"HOME\"]}/artifacts'\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0m\u001B[0;32m 5\u001B[0m \u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0;32m 6\u001B[0m \u001B[1;33m\u001B[0m\u001B[0m\n", - "\u001B[1;32mC:\\Program Files\\Python37\\lib\\os.py\u001B[0m in \u001B[0;36m__getitem__\u001B[1;34m(self, key)\u001B[0m\n\u001B[0;32m 679\u001B[0m \u001B[1;32mexcept\u001B[0m \u001B[0mKeyError\u001B[0m\u001B[1;33m:\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0;32m 680\u001B[0m \u001B[1;31m# raise KeyError with the original key value\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[1;32m--> 681\u001B[1;33m \u001B[1;32mraise\u001B[0m \u001B[0mKeyError\u001B[0m\u001B[1;33m(\u001B[0m\u001B[0mkey\u001B[0m\u001B[1;33m)\u001B[0m \u001B[1;32mfrom\u001B[0m \u001B[1;32mNone\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0m\u001B[0;32m 682\u001B[0m \u001B[1;32mreturn\u001B[0m \u001B[0mself\u001B[0m\u001B[1;33m.\u001B[0m\u001B[0mdecodevalue\u001B[0m\u001B[1;33m(\u001B[0m\u001B[0mvalue\u001B[0m\u001B[1;33m)\u001B[0m\u001B[1;33m\u001B[0m\u001B[1;33m\u001B[0m\u001B[0m\n\u001B[0;32m 683\u001B[0m \u001B[1;33m\u001B[0m\u001B[0m\n", - "\u001B[1;31mKeyError\u001B[0m: 'HOME'" - ] - } - ], - "source": [ - "from mlrun import mlconf\n", - "import os\n", - "mlconf.dbpath = mlconf.dbpath or 'http://mlrun-api:8080'\n", - "mlconf.artifact_path = mlconf.artifact_path or f'{os.environ[\"HOME\"]}/artifacts'\n" - ] - }, - { - "cell_type": "markdown", - 
"metadata": {}, - "source": [ - "### Save function" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "def mount_secret(\n", - " secret_name, volume_mount_path, volume_name='secret', items=None\n", - "):\n", - " def _mount_secret(task):\n", - " from kubernetes import client as k8s_client\n", - " vol = k8s_client.V1SecretVolumeSource(secret_name=secret_name, items=items)\n", - " return task.add_volume(\n", - " k8s_client.V1Volume(name=volume_name, secret=vol)\n", - " ).add_volume_mount(\n", - " k8s_client.V1VolumeMount(mount_path=volume_mount_path, name=volume_name)\n", - " )\n", - " return _mount_secret" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import code_to_function, NewTask\n", - "import os\n", - "\n", - "fn = code_to_function(name=\"sql_to_file\",\n", - " handler=\"sql_to_file\",\n", - " description=\"SQL To File - Ingest data using SQL query\",\n", - " categories=[\"data-prep\"],\n", - " labels={\"author\": \"adih\"})\n", - "\n", - "if \"V3IO_ACCESS_KEY\" in list(os.environ):\n", - " fn.apply(mount_secret(secret_name='presto-tls',\n", - " volume_mount_path= '/var/run/iguazio/secrets/'))\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Build the image" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-29 12:42:44,100 starting remote build, image: .mlrun/func-default-sql-ingest-latest\n", - "\u001B[36mINFO\u001B[0m[0000] Resolved base name mlrun/mlrun:0.4.10 to mlrun/mlrun:0.4.10 \n", - "\u001B[36mINFO\u001B[0m[0000] Resolved base name mlrun/mlrun:0.4.10 to mlrun/mlrun:0.4.10 \n", - "\u001B[36mINFO\u001B[0m[0000] Retrieving image manifest mlrun/mlrun:0.4.10 \n", - "\u001B[36mINFO\u001B[0m[0000] Retrieving image manifest mlrun/mlrun:0.4.10 \n", - 
"\u001B[36mINFO\u001B[0m[0000] Built cross stage deps: map[] \n", - "\u001B[36mINFO\u001B[0m[0000] Retrieving image manifest mlrun/mlrun:0.4.10 \n", - "\u001B[36mINFO\u001B[0m[0000] Retrieving image manifest mlrun/mlrun:0.4.10 \n", - "\u001B[36mINFO\u001B[0m[0001] Unpacking rootfs as cmd RUN pip install --no-cache-dir git+https://github.com/v3io/PyHive.git@v0.6.999 requires it. \n", - "\u001B[36mINFO\u001B[0m[0027] Taking snapshot of full filesystem... \n", - "\u001B[36mINFO\u001B[0m[0039] Resolving paths \n", - "\u001B[36mINFO\u001B[0m[0046] RUN pip install --no-cache-dir git+https://github.com/v3io/PyHive.git@v0.6.999 \n", - "\u001B[36mINFO\u001B[0m[0046] cmd: /bin/sh \n", - "\u001B[36mINFO\u001B[0m[0046] args: [-c pip install --no-cache-dir git+https://github.com/v3io/PyHive.git@v0.6.999] \n", - "Collecting git+https://github.com/v3io/PyHive.git@v0.6.999\n", - " Cloning https://github.com/v3io/PyHive.git (to revision v0.6.999) to /tmp/pip-req-build-ycqhuolw\n", - " Running command git clone -q https://github.com/v3io/PyHive.git /tmp/pip-req-build-ycqhuolw\n", - "Requirement already satisfied: future in /usr/local/lib/python3.7/site-packages (from PyHive==0.6.1.dev0) (0.18.2)\n", - "Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/site-packages (from PyHive==0.6.1.dev0) (2.8.1)\n", - "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/site-packages (from python-dateutil->PyHive==0.6.1.dev0) (1.15.0)\n", - "Building wheels for collected packages: PyHive\n", - " Building wheel for PyHive (setup.py): started\n", - " Building wheel for PyHive (setup.py): finished with status 'done'\n", - " Created wheel for PyHive: filename=PyHive-0.6.1.dev0-py3-none-any.whl size=46402 sha256=63dca405cbae83da4cfcabfd61fd00f1683bc008c8bfa2272eac7054ec283166\n", - " Stored in directory: /tmp/pip-ephem-wheel-cache-mwb52l_u/wheels/05/11/cd/4ac4df0fcee76e5ceb614c39c56fca1eead41c0ac32ff6285d\n", - "Successfully built PyHive\n", - "Installing 
collected packages: PyHive\n", - "Successfully installed PyHive-0.6.1.dev0\n", - "\u001B[36mINFO\u001B[0m[0048] Taking snapshot of full filesystem... \n", - "\u001B[36mINFO\u001B[0m[0048] Resolving paths \n", - "\u001B[36mINFO\u001B[0m[0053] RUN pip install sqlalchemy==1.3.11 \n", - "\u001B[36mINFO\u001B[0m[0053] cmd: /bin/sh \n", - "\u001B[36mINFO\u001B[0m[0053] args: [-c pip install sqlalchemy==1.3.11] \n", - "Collecting sqlalchemy==1.3.11\n", - " Downloading SQLAlchemy-1.3.11.tar.gz (6.0 MB)\n", - "Building wheels for collected packages: sqlalchemy\n", - " Building wheel for sqlalchemy (setup.py): started\n", - " Building wheel for sqlalchemy (setup.py): finished with status 'done'\n", - " Created wheel for sqlalchemy: filename=SQLAlchemy-1.3.11-cp37-cp37m-linux_x86_64.whl size=1216921 sha256=9dd22e89acfbb68df0c1d189d36907a16c9393e4174598eb4bf377ce57132f3c\n", - " Stored in directory: /root/.cache/pip/wheels/0a/60/60/f26cbd183a3bb0031ace108156036dd925ec0138ee1c496a16\n", - "Successfully built sqlalchemy\n", - "Installing collected packages: sqlalchemy\n", - " Attempting uninstall: sqlalchemy\n", - " Found existing installation: SQLAlchemy 1.3.17\n", - " Uninstalling SQLAlchemy-1.3.17:\n", - " Successfully uninstalled SQLAlchemy-1.3.17\n", - "Successfully installed sqlalchemy-1.3.11\n", - "\u001B[36mINFO\u001B[0m[0057] Taking snapshot of full filesystem... \n", - "\u001B[36mINFO\u001B[0m[0057] Resolving paths \n", - "\u001B[36mINFO\u001B[0m[0063] RUN pip install PyMySQL==0.9.3 \n", - "\u001B[36mINFO\u001B[0m[0063] cmd: /bin/sh \n", - "\u001B[36mINFO\u001B[0m[0063] args: [-c pip install PyMySQL==0.9.3] \n", - "Collecting PyMySQL==0.9.3\n", - " Downloading PyMySQL-0.9.3-py2.py3-none-any.whl (47 kB)\n", - "Installing collected packages: PyMySQL\n", - "Successfully installed PyMySQL-0.9.3\n", - "\u001B[36mINFO\u001B[0m[0064] Taking snapshot of full filesystem... 
\n", - "\u001B[36mINFO\u001B[0m[0064] Resolving paths \n" - ] - }, - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-30 01:58:41,604 function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": "" - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.export('function.yaml')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Reading from a public MySQL DB" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "mysql_url = 'mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam'\n", - "mysql_query = 'select rfam_acc,rfam_id,auto_wiki,description,author,seed_source FROM family'" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import NewTask, run_local\n", - "\n", - "sql_task = NewTask(name='sql',\n", - " handler=sql_to_file,\n", - " params={'sql_query': mysql_query,\n", - " 'database_url': mysql_url})\n" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-29 12:43:59,253 starting run sql uid=b0914edaa58e45ee97c132200c6b60be -> http://mlrun-api:8080\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "

\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run b0914edaa58e45ee97c132200c6b60be --project default , !mlrun logs b0914edaa58e45ee97c132200c6b60be --project default\n", - "[mlrun] 2020-06-29 12:44:02,344 run executed, status=completed\n" - ] - } - ], - "source": [ - "sql_func = run_local(sql_task)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Run it on a cluster" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-29 12:44:02,350 starting run sql uid=46ff7ef67e314be49353982cdd8d073a -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-29 12:44:02,622 Job is running in the background, pod: sql-mplpz\n", - "[mlrun] 2020-06-29 12:44:09,070 run executed, status=completed\n", - "final state: succeeded\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Jun 29 12:44:06 | completed | sql | v3io_user=admin, kind=job, owner=admin, host=sql-mplpz | | sql_query=select rfam_acc,rfam_id,auto_wiki,description,author,seed_source FROM family; database_url=mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam | | query result
\n", - "
\n", - "
\n", - "
\n", - " Title\n", - " ×\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 46ff7ef67e314be49353982cdd8d073a --project default , !mlrun logs 46ff7ef67e314be49353982cdd8d073a --project default\n", - "[mlrun] 2020-06-29 12:44:11,893 run executed, status=completed\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.run(sql_task)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### SQL query from Iguazio Key Value via Presto" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You need to create a table and set the sql_table path accordingly.
\n", - "you can find an example of creating such table in https://github.com/v3io/tutorials/blob/master/data-ingestion-and-preparation/basic-data-ingestion-and-preparation.ipynb" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import os\n", - "sql_table = os.path.join('v3io.users.\"'+str(os.getenv('V3IO_USERNAME'))+'/examples/stocks_tab\"')\n", - "sql_query_string = 'select * from '+sql_table+\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Done.\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - 
" \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
securitydesc | securitytype | time | isin | minprice | date | endprice | numberoftrades | mnemonic | currency | securityid | maxprice | tradedvolume | startprice
UBS I.ETF-DL G.SEL.DIV.AD | ETF | 08:27 | IE00BMP3HG27 | 8.418 | 2018-03-26 00:00:00.000 | 8.418 | 1 | UBUM | EUR | 2505450 | 8.418 | 403 | 8.418
GILEAD SCIENCES DL-,001 | Common stock | 08:00 | US3755581036 | 59.7 | 2018-03-26 00:00:00.000 | 59.84 | 3 | GIS | EUR | 2506495 | 59.84 | 745 | 59.7
3M CO. DL-,01 | Common stock | 08:00 | US88579Y1010 | 176.51 | 2018-03-26 00:00:00.000 | 176.51 | 1 | MMM | EUR | 2506577 | 176.51 | 39 | 176.51
DIEBOLD NIXDORF INH.O.N. | Common stock | 08:06 | DE000A0CAYB2 | 66.3 | 2018-03-26 00:00:00.000 | 66.3 | 1 | WIN | EUR | 2504286 | 66.3 | 60 | 66.3
XTR.II EUR.INF.LINK.BD 1C | ETF | 08:13 | LU0290358224 | 218.97 | 2018-03-26 00:00:00.000 | 218.97 | 1 | DBXK | EUR | 2505840 | 218.97 | 110 | 218.97
UBS-ETF-MSCI EMU S.C.EOAD | ETF | 08:33 | LU0671493277 | 100.2 | 2018-03-26 00:00:00.000 | 100.2 | 1 | UEFD | EUR | 2506045 | 100.2 | 180 | 100.2
ASMALLWORLD AG SF 1 | Common stock | 08:23 | CH0404880129 | 12.7 | 2018-03-26 00:00:00.000 | 12.7 | 1 | 1Q7 | EUR | 3089122 | 12.7 | 400 | 12.7
IS.DJ GLOB.TITAN.50 U.ETF | ETF | 08:42 | DE0006289382 | 31.25 | 2018-03-26 00:00:00.000 | 31.25 | 1 | EXI2 | EUR | 2505029 | 31.25 | 50 | 31.25
ISHS IV-AGEING POPUL.ETF | ETF | 08:17 | IE00BYZK4669 | 4.926 | 2018-03-26 00:00:00.000 | 4.926 | 1 | 2B77 | EUR | 2505552 | 4.926 | 25 | 4.926
PORSCHE AUTOM.HLDG VZO | Common stock | 08:00 | DE000PAH0038 | 64.68 | 2018-03-26 00:00:00.000 | 64.76 | 8 | PAH3 | EUR | 2504816 | 64.76 | 698 | 64.7
" - ], - "text/plain": [ - "[('UBS I.ETF-DL G.SEL.DIV.AD', 'ETF', '08:27', 'IE00BMP3HG27', 8.418, '2018-03-26 00:00:00.000', 8.418, 1, 'UBUM', 'EUR', 2505450, 8.418, 403, 8.418),\n", - " ('GILEAD SCIENCES DL-,001', 'Common stock', '08:00', 'US3755581036', 59.7, '2018-03-26 00:00:00.000', 59.84, 3, 'GIS', 'EUR', 2506495, 59.84, 745, 59.7),\n", - " ('3M CO. DL-,01', 'Common stock', '08:00', 'US88579Y1010', 176.51, '2018-03-26 00:00:00.000', 176.51, 1, 'MMM', 'EUR', 2506577, 176.51, 39, 176.51),\n", - " ('DIEBOLD NIXDORF INH.O.N.', 'Common stock', '08:06', 'DE000A0CAYB2', 66.3, '2018-03-26 00:00:00.000', 66.3, 1, 'WIN', 'EUR', 2504286, 66.3, 60, 66.3),\n", - " ('XTR.II EUR.INF.LINK.BD 1C', 'ETF', '08:13', 'LU0290358224', 218.97, '2018-03-26 00:00:00.000', 218.97, 1, 'DBXK', 'EUR', 2505840, 218.97, 110, 218.97),\n", - " ('UBS-ETF-MSCI EMU S.C.EOAD', 'ETF', '08:33', 'LU0671493277', 100.2, '2018-03-26 00:00:00.000', 100.2, 1, 'UEFD', 'EUR', 2506045, 100.2, 180, 100.2),\n", - " ('ASMALLWORLD AG SF 1', 'Common stock', '08:23', 'CH0404880129', 12.7, '2018-03-26 00:00:00.000', 12.7, 1, '1Q7', 'EUR', 3089122, 12.7, 400, 12.7),\n", - " ('IS.DJ GLOB.TITAN.50 U.ETF', 'ETF', '08:42', 'DE0006289382', 31.25, '2018-03-26 00:00:00.000', 31.25, 1, 'EXI2', 'EUR', 2505029, 31.25, 50, 31.25),\n", - " ('ISHS IV-AGEING POPUL.ETF', 'ETF', '08:17', 'IE00BYZK4669', 4.926, '2018-03-26 00:00:00.000', 4.926, 1, '2B77', 'EUR', 2505552, 4.926, 25, 4.926),\n", - " ('PORSCHE AUTOM.HLDG VZO', 'Common stock', '08:00', 'DE000PAH0038', 64.68, '2018-03-26 00:00:00.000', 64.76, 8, 'PAH3', 'EUR', 2504816, 64.76, 698, 64.7)]" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "%sql select * from $sql_table limit 10" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [], - "source": [ - "sql_task = NewTask(name='sql', \n", - " handler=sql_to_file,\n", - " params={'sql_query': sql_query_string,\n", - " 
'database_url': os.getenv('DATABASE_URL')}\n", - " )\n" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-29 12:44:14,406 starting run sql uid=d32a57bb990d4142bb1f63862e8906bf -> http://mlrun-api:8080\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Jun 29 12:44:14 | completed | sql | v3io_user=admin, kind=handler, owner=admin, host=jupyter-b9c7995f9-4fblj | | sql_query=select * from v3io.users.\"admin/examples/stocks_tab\"; database_url=presto://admin:8278ee8e-0f31-4aea-a105-2eab202bec93@presto-api-presto.default-tenant.app.cs-mlrun-test.iguazio-c0.com:443/v3io?protocol=https&requests_kwargs=%7B%22verify%22%3A+%22%2Fvar%2Frun%2Figuazio%2Fsecrets%2Ftls.crt%22%2C+%22cert%22%3A+%5B%22%2Fvar%2Frun%2Figuazio%2Fsecrets%2Ftls.crt%22%2C+%22%2Fvar%2Frun%2Figuazio%2Fsecrets%2Ftls.key%22%5D%7D | | query result
\n", - "
\n", - "
\n", - "
\n", - " Title\n", - " ×\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run d32a57bb990d4142bb1f63862e8906bf --project default , !mlrun logs d32a57bb990d4142bb1f63862e8906bf --project default\n", - "[mlrun] 2020-06-29 12:44:18,102 run executed, status=completed\n" - ] - } - ], - "source": [ - "sql_func = run_local(sql_task)" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-29 12:44:18,112 starting run sql uid=db9507007f6d452e9ca020e4f483e33b -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-29 12:44:18,387 Job is running in the background, pod: sql-g7p4f\n", - "[mlrun] 2020-06-29 12:44:25,033 run executed, status=completed\n", - "final state: succeeded\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
\n", - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default | | 0 | Jun 29 12:44:21 | completed | sql | v3io_user=admin, kind=job, owner=admin, host=sql-g7p4f | | sql_query=select * from v3io.users.\"admin/examples/stocks_tab\"; database_url=presto://admin:8278ee8e-0f31-4aea-a105-2eab202bec93@presto-api-presto.default-tenant.app.cs-mlrun-test.iguazio-c0.com:443/v3io?protocol=https&requests_kwargs=%7B%22verify%22%3A+%22%2Fvar%2Frun%2Figuazio%2Fsecrets%2Ftls.crt%22%2C+%22cert%22%3A+%5B%22%2Fvar%2Frun%2Figuazio%2Fsecrets%2Ftls.crt%22%2C+%22%2Fvar%2Frun%2Figuazio%2Fsecrets%2Ftls.key%22%5D%7D | | query result
\n", - "
\n", - "
\n", - "
\n", - " Title\n", - " ×\n", - "
\n", - " \n", - "
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run db9507007f6d452e9ca020e4f483e33b --project default , !mlrun logs db9507007f6d452e9ca020e4f483e33b --project default\n", - "[mlrun] 2020-06-29 12:44:27,645 run executed, status=completed\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.run(sql_task)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} \ No newline at end of file diff --git a/sql_to_file/sql_to_file.py b/sql_to_file/sql_to_file.py deleted file mode 100644 index 6d5e152ba..000000000 --- a/sql_to_file/sql_to_file.py +++ /dev/null @@ -1,45 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -# Generated by nuclio.export.NuclioExporter - -import pandas as pd -import pyhive -from sqlalchemy.engine import create_engine -from mlrun.execution import MLClientCtx - - -def sql_to_file( - context: MLClientCtx, - sql_query: str, - database_url: str, - file_ext: str = "parquet", -) -> None: - """SQL Ingest - Ingest data using SQL query - - :param context: the function context - :param sql_query: the sql query used to retrieve the data - :param database_url: database connection URL - :param file_ext: ("parquet") format for result file - """ - - engine = create_engine(database_url) - df = pd.read_sql(sql_query, engine) - - context.log_dataset( - "query result", - df=df, - format=file_ext, - artifact_path=context.artifact_subpath("data"), - ) diff --git a/sql_to_file/test_sql_to_file.py b/sql_to_file/test_sql_to_file.py deleted file mode 100644 index d636b86ca..000000000 --- a/sql_to_file/test_sql_to_file.py +++ /dev/null @@ -1,31 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -from mlrun import code_to_function - -mysql_url = 'mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam' -mysql_query = 'select rfam_acc,rfam_id,auto_wiki,description,author,seed_source FROM family' - - -def test_run_sql_to_file(): - fn = code_to_function(name='test_sql_to_file', - filename="sql_to_file.py", - handler="sql_to_file", - kind="job", - ) - run = fn.run(params={'sql_query': mysql_query, - 'database_url': mysql_url}, - local=True) - - assert(run.artifact("query result")) \ No newline at end of file diff --git a/stream_to_parquet/function.yaml b/stream_to_parquet/function.yaml deleted file mode 100644 index 13a76a2cb..000000000 --- a/stream_to_parquet/function.yaml +++ /dev/null @@ -1,45 +0,0 @@ -kind: remote -metadata: - name: stream-to-parquet - tag: '' - hash: 78316bfbe731714715c19f0bc6deabf8652f15c4 - project: '' - labels: - author: orz - categories: - - machine-learning - - data-preparation -spec: - command: '' - args: [] - image: mlrun/ml-models - description: Saves a stream to Parquet and can lunch drift detection task on it - min_replicas: 1 - max_replicas: 1 - env: [] - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - name: stream-to-parquet - labels: {} - annotations: - nuclio.io/generated_by: function generated from /User/test/functions/stream_to_parquet/stream_to_parquet.py - spec: - runtime: python:3.9 - handler: stream_to_parquet:handler - env: [] - volumes: [] - build: - commands: [] - noBaseImagesPull: true - functionSourceCode: 
IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IG9zCmltcG9ydCBwYW5kYXMgYXMgcGQKaW1wb3J0IG51bXB5IGFzIG5wCmltcG9ydCBqc29uCmltcG9ydCBkYXRldGltZQppbXBvcnQgbWxydW4KCgpkZWYgcmVjb3JkX3RvX2ZlYXR1cmVzKHJlY29yZCk6CiAgICBmZWF0dXJlcyA9IHJlY29yZFsicmVxdWVzdCJdWyJpbnN0YW5jZXMiXVswXQogICAgdGltZXN0YW1wID0gcmVjb3JkWyJ3aGVuIl0KICAgIHByZWRpY3Rpb24gPSByZWNvcmRbInJlc3AiXQoKICAgIHJlY29yZCA9IHsidGltZXN0YW1wIjogdGltZXN0YW1wLCAqKmZlYXR1cmVzLCAicHJlZGljdGlvbnMiOiBwcmVkaWN0aW9ufQoKICAgIHJldHVybiByZWNvcmQKCgpkZWYgaW5pdF9jb250ZXh0KGNvbnRleHQpOgogICAgc2V0YXR0cihjb250ZXh0LCAiYmF0Y2giLCBbXSkKICAgIHNldGF0dHIoY29udGV4dCwgIndpbmRvdyIsIGludChvcy5nZXRlbnYoIndpbmRvdyIsIDEwKSkpCiAgICBzZXRhdHRyKGNvbnRleHQsICJzYXZlX3RvIiwgb3MuZ2V0ZW52KCJzYXZlX3RvIiwgIi9iaWdkYXRhL2luZmVyZW5jZV9wcS8iKSkKICAgIG9zLm1ha2VkaXJzKGNvbnRleHQuc2F2ZV90bywgZXhpc3Rfb2s9VHJ1ZSkKCiAgICBtbHJ1bi5tbGNvbmYuZGJwYXRoID0gbWxydW4ubWxjb25mLmRicGF0aCBvciAiaHR0cDovL21scnVuLWFwaTo4MDgwIgogICAgYXJ0aWZhY3RfcGF0aCA9IG9zLmdldGVudigiYXJ0aWZhY3RfcGF0aCIsIE5vbmUpCiAgICBpZiBhcnRpZmFjdF9wYXRoOgogICAgICAgIG1scnVuLm1sY29uZi5hcnRpZmFjdF9wYXRoID0gYXJ0aWZhY3RfcGF0aAogICAgaWYgImh1Yl91cmwiIGluIG9zLmVudmlyb246CiAgICAgICAgbWxydW4ubWxjb25mLmh1Yl91cmwgPSBvcy5lbnZpcm9uWyJodWJfdXJsIl0KICAgIHZpcnR1YWxfZHJpZnRfZm4gPSBtbHJ1bi5pbXBvcnRfZnVuY3Rpb24oImh1YjovL3ZpcnR1YWxfZHJpZnQiKQogICAgdmlydHVhbF9kcmlmdF9mbi5hcHBseShtbHJ1bi5hdXRvX21vdW50KCkpCiAgICBzZXRhdHRyKGNvbnRleHQsICJ2aXJ0dWFsX2RyaWZ0X2ZuIiwgdmlydHVhbF9kcmlmdF9mbikKCiAgICBwcmVkaWN0aW9uc19jb2wgPSBvcy5nZXRlbnYoInByZWRpY3Rpb25zIiwgTm9uZSkKICAgIGxhYmVsX2NvbCA9IG9zLmdldGVudigibGFiZWxfY29sIiwgTm9uZSkKICAgIHNldGF0dHIoY29udGV4dCwgImJhc2VfZGF0YXNldCIsIG9zLmdldGVudigiYmFzZV9kYXRhc2V0IiwgIiIpKQogICAgc2V0YXR0cihjb250ZXh0LCAiaW5kZXhlcyIsIGpzb24ubG9hZHMob3MuZW52aXJvbi5nZXQoImluZGV4ZXMiLCAiW10iKSkpCiAgICBzZXRhdHRyKGNvbnRleHQsICJwcmVkaWN0aW9uc19jb2wiLCBwcmVkaWN0aW9uc19jb2wpCiAgICBzZXRhdHRyKGNvbnRleHQsICJsYWJlbF9jb2wiLCBsYWJlbF9jb2wpCiAgICBzZXRhdHRyKAogICAgICAgIGNvbnRleHQsICJyZXN1bHRzX3RzZGJfY29udGFpbmVyIiwgb3MuZ2V0ZW52
KCJyZXN1bHRzX3RzZGJfY29udGFpbmVyIiwgTm9uZSkKICAgICkKICAgIHNldGF0dHIoY29udGV4dCwgInJlc3VsdHNfdHNkYl90YWJsZSIsIG9zLmdldGVudigicmVzdWx0c190c2RiX3RhYmxlIiwgTm9uZSkpCgoKZGVmIGhhbmRsZXIoY29udGV4dCwgZXZlbnQpOgoKICAgIGNvbnRleHQubG9nZ2VyLmluZm8oZiJBZGRpbmcge2V2ZW50LmJvZHl9IikKICAgIGNvbnRleHQuYmF0Y2guYXBwZW5kKHJlY29yZF90b19mZWF0dXJlcyhqc29uLmxvYWRzKGV2ZW50LmJvZHkpKSkKCiAgICBpZiBsZW4oY29udGV4dC5iYXRjaCkgPiBjb250ZXh0LndpbmRvdzoKICAgICAgICBjb250ZXh0LmxvZ2dlci5pbmZvKGNvbnRleHQuYmF0Y2hbOjFdKQogICAgICAgIGNvbnRleHQubG9nZ2VyLmluZm8oY29udGV4dC5pbmRleGVzKQogICAgICAgIGRmID0gcGQuRGF0YUZyYW1lKGNvbnRleHQuYmF0Y2gpCiAgICAgICAgY29udGV4dC5sb2dnZXIuaW5mbyhmImRmIGV4YW1wbGU6IHtkZi5oZWFkKDEpfSIpCiAgICAgICAgaWYgY29udGV4dC5pbmRleGVzOgogICAgICAgICAgICBkZiA9IGRmLnNldF9pbmRleChjb250ZXh0LmluZGV4ZXMpCiAgICAgICAgZGZfcGF0aCA9IG9zLnBhdGguam9pbigKICAgICAgICAgICAgY29udGV4dC5zYXZlX3RvLAogICAgICAgICAgICBmIntkYXRldGltZS5kYXRldGltZS5ub3coKS5zdHJmdGltZSgnJVktJW0tJWRUJUg6JU06JVMnKX0ucHEiLAogICAgICAgICkKICAgICAgICBkZi50b19wYXJxdWV0KGRmX3BhdGgsaW5kZXg9RmFsc2UpCgogICAgICAgIHRhc2sgPSBtbHJ1bi5OZXdUYXNrKAogICAgICAgICAgICBuYW1lPSJkcmlmdF9tYWduaXR1ZGUiLAogICAgICAgICAgICBoYW5kbGVyPSJkcmlmdF9tYWduaXR1ZGUiLAogICAgICAgICAgICBwYXJhbXM9ewogICAgICAgICAgICAgICAgImxhYmVsX2NvbCI6IGNvbnRleHQubGFiZWxfY29sLAogICAgICAgICAgICAgICAgInByZWRpY3Rpb25fY29sIjogY29udGV4dC5wcmVkaWN0aW9uc19jb2wsCiAgICAgICAgICAgICAgICAicmVzdWx0c190c2RiX2NvbnRhaW5lciI6IGNvbnRleHQucmVzdWx0c190c2RiX2NvbnRhaW5lciwKICAgICAgICAgICAgICAgICJyZXN1bHRzX3RzZGJfdGFibGUiOiBjb250ZXh0LnJlc3VsdHNfdHNkYl90YWJsZSwKICAgICAgICAgICAgfSwKICAgICAgICAgICAgaW5wdXRzPXsidCI6IGNvbnRleHQuYmFzZV9kYXRhc2V0LCAidSI6IGRmX3BhdGh9LAogICAgICAgICAgICBhcnRpZmFjdF9wYXRoPW1scnVuLm1sY29uZi5hcnRpZmFjdF9wYXRoLAogICAgICAgICkKCiAgICAgICAgY29udGV4dC52aXJ0dWFsX2RyaWZ0X2ZuLnJ1bih0YXNrLCB3YXRjaD1GYWxzZSkKCiAgICAgICAgY29udGV4dC5iYXRjaCA9IFtdCg== - source: '' - build: - commands: [] - code_origin: 
https://github.com/daniels290813/functions.git#3605c9b8dcadab89a5a45f7d16dcd2fcfeca8697:/User/test/functions/stream_to_parquet/stream_to_parquet.py - origin_filename: /User/test/functions/stream_to_parquet/stream_to_parquet.py - default_handler: handler - disable_auto_mount: false - affinity: null -verbose: false diff --git a/stream_to_parquet/item.yaml b/stream_to_parquet/item.yaml deleted file mode 100644 index cbd59376e..000000000 --- a/stream_to_parquet/item.yaml +++ /dev/null @@ -1,28 +0,0 @@ -apiVersion: v1 -categories: -- machine-learning -- data-preparation -description: Saves a stream to Parquet and can launch drift detection task on it -doc: '' -example: stream_to_parquet.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: orz -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: stream-to-parquet -platformVersion: 3.5.0 -spec: - customFields: - max_replicas: 1 - min_replicas: 1 - filename: stream_to_parquet.py - handler: handler - image: mlrun/ml-models - kind: nuclio - requirements: [] -url: '' -version: 1.1.0 diff --git a/stream_to_parquet/stream_to_parquet.ipynb b/stream_to_parquet/stream_to_parquet.ipynb deleted file mode 100644 index e47c6be92..000000000 --- a/stream_to_parquet/stream_to_parquet.ipynb +++ /dev/null @@ -1,698 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Stream to Parquet" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Part of the [network operations](https://github.com/mlrun/demos/tree/0.7.x/network-operations) demo pipeline, this function listens to a labeled stream and writes it as parquet files.
\n", - "This function also deploys the function [virtual_drift](https://github.com/mlrun/functions/tree/master/virtual_drift) from the hub, which computes drift magnitude metrics between base dataset t and dataset u,
\n", - "in our case (as well as in the demo), the base dataset (the one the model was trained on) and the dataset the model predicted on.
\n", - "virtual_drift writes the output to TSDB." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Steps**\n", - "\n", - "1. [Data exploration](#Data-exploration)\n", - "2. [Creating the labeled stream](#Creating-the-labeled-stream)\n", - "3. [Importing the function](#Importing-the-function)\n", - "4. [Running the function remotely](#Running-the-function-remotely)\n", - "5. [Testing the function](#Testing-the-function)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Data exploration**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To evaluate a drift detector's performance using detection metrics, we need to know beforehand where real drift occurs.
\n", - "This is only possible with synthetic datasets.
The scikit-multiflow framework allows generating several kinds of synthetic data to simulate the occurrence of drifts.
\n", - "[Harvard Dataverse](https://dataverse.harvard.edu) provides further explanation of the [dataset used](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/5OWRGB) along with different kinds of drifted datasets.
\n", - "mixed_0101_abrupto has 4 concepts and 3 drifts at time steps 10000, 20000, and 30000.
\n", - "Our dataset will be split into train and test parts: the train part (first 5000 examples) is used to train the model (which is easily generated using [sklearn_classifier](https://github.com/mlrun/functions/blob/master/sklearn_classifier/sklearn_classifier.ipynb)).
\n", - "The test part (which is already predicted by the model) will be pushed to the input stream in order to detect drifts." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
    X1   X2        X3        X4  class
0  0.0  1.0  0.460101  0.592744    1.0
1  1.0  1.0  0.588788  0.574984    0.0
2  0.0  0.0  0.401641  0.679325    1.0
3  1.0  1.0  0.306076  0.182108    0.0
4  0.0  0.0  0.962847  0.579245    1.0
\n", - "
" - ], - "text/plain": [ - " X1 X2 X3 X4 class\n", - "0 0.0 1.0 0.460101 0.592744 1.0\n", - "1 1.0 1.0 0.588788 0.574984 0.0\n", - "2 0.0 0.0 0.401641 0.679325 1.0\n", - "3 1.0 1.0 0.306076 0.182108 0.0\n", - "4 0.0 0.0 0.962847 0.579245 1.0" - ] - }, - "execution_count": 1, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import pandas as pd\n", - "data_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/concept_drift/mixed_0101_abrupto.csv'\n", - "base_dataset = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/concept_drift/predicted_abrupto_train.csv'\n", - "# The predicted test data is pushed to the stream\n", - "predicted_test_data_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/concept_drift/predicted_abrupto_test.csv'\n", - "# You can find the model used here\n", - "models_path = 'https://s3.wasabisys.com/iguazio/models/function-marketplace-models/concept_drift/concept_drift_random_forest.pkl'\n", - "original_data = pd.read_csv(data_path)\n", - "original_data.head()" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
        X1   X2        X3        X4  class  predicted_col
34995  0.0  0.0  0.010106  0.647269    0.0            1.0
34996  1.0  1.0  0.293651  0.737291    1.0            0.0
34997  0.0  0.0  0.848546  0.552337    0.0            1.0
34998  1.0  1.0  0.614754  0.859896    1.0            0.0
34999  1.0  0.0  0.265306  0.843716    0.0            1.0
\n", - "
" - ], - "text/plain": [ - " X1 X2 X3 X4 class predicted_col\n", - "34995 0.0 0.0 0.010106 0.647269 0.0 1.0\n", - "34996 1.0 1.0 0.293651 0.737291 1.0 0.0\n", - "34997 0.0 0.0 0.848546 0.552337 0.0 1.0\n", - "34998 1.0 1.0 0.614754 0.859896 1.0 0.0\n", - "34999 1.0 0.0 0.265306 0.843716 0.0 1.0" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "predicted_test = pd.read_csv(predicted_test_data_path)\n", - "predicted_test.tail()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Creating the labeled stream**" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "import os \n", - "\n", - "container = os.path.join('/',os.environ['V3IO_HOME'].split('/')[0])\n", - "user = os.environ[\"V3IO_USERNAME\"]\n", - "rel_path = os.getcwd()[6:] + '/artifacts'\n", - "\n", - "base_input_stream = os.path.join(user,rel_path) + \"/inputs_stream\"\n", - "base_output_stream = os.path.join(user,rel_path) + \"/output_stream\"\n", - "input_stream = os.path.join(container,base_input_stream)\n", - "tsdb_path = os.path.join(user,rel_path) + \"/output_tsdb\"\n", - "\n", - "stream_consumer_group = 's2p'" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "import v3io.dataplane\n", - "\n", - "client = v3io.dataplane.Client()\n", - "response = client.stream.create(container = container,\n", - " stream_path=base_input_stream,\n", - " shard_count=1,\n", - " raise_for_status = v3io.dataplane.RaiseForStatus.never)\n", - "response.raise_for_status([409, 204])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Importing the function**" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-26 14:37:45,224 [info] created and saved project 
function-marketplace\n" - ] - } - ], - "source": [ - "import mlrun\n", - "\n", - "# Importing the function\n", - "mlrun.set_environment(project='function-marketplace')\n", - "\n", - "fn = mlrun.import_function(\"hub://stream_to_parquet:development\")\n", - "fn.apply(mlrun.auto_mount())\n", - "\n", - "fn.add_v3io_stream_trigger(stream_path=input_stream, name='stream', group=stream_consumer_group)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Running the function remotely**" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-26 14:37:45,513 [info] Starting remote function deploy\n", - "2021-10-26 14:37:45 (info) Deploying function\n", - "2021-10-26 14:37:45 (info) Building\n", - "2021-10-26 14:37:45 (info) Staging files and preparing base images\n", - "2021-10-26 14:37:45 (info) Building processor image\n", - "2021-10-26 14:37:47 (info) Build complete\n", - "2021-10-26 14:37:55 (info) Function deploy complete\n", - "> 2021-10-26 14:37:55,689 [info] successfully deployed function: {'internal_invocation_urls': ['nuclio-function-marketplace-stream-to-parquet.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['default-tenant.app.dev39.lab.iguazeng.com:31445']}\n" - ] - }, - { - "data": { - "text/plain": [ - "'http://default-tenant.app.dev39.lab.iguazeng.com:31445'" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import json\n", - "fn.set_envs({'window': 200,\n", - " 'save_to': os.path.join(os.path.join('/User',rel_path), 'inference_pq'),\n", - " 'prediction_col': 'predicted_col',\n", - " 'label_col': 'class',\n", - " 'base_dataset': base_dataset,\n", - " 'results_tsdb_container': container[1:],\n", - " 'results_tsdb_table': tsdb_path,\n", - " 'mount_path': os.path.join(container,user),\n", - " 'mount_remote': container,\n", - " 
'artifact_path': os.path.join('/User',rel_path)})\n", - "\n", - "fn.deploy()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Testing the function**" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'data': '{\"request\": {\"instances\": [{\"X1\": 0.0, \"X2\": 0.0, \"X3\": 0.0634475073, \"X4\": 0.4136568818, \"class\": 1.0, \"predicted_col\": 1.0}]}, \"resp\": [1], \"when\": \"2021-10-26 14:37:55.864974\", \"model\": \"sklearn.ensemble.RandomForestClassifier\"}'}" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import json\n", - "import datetime\n", - "\n", - "# Reshaping the data to V3IOStream format.\n", - "def restructure_stream_event(context, event):\n", - " instances = [dict()]\n", - " for key in predicted_test.keys():\n", - " if key not in ['when', 'model', 'worker', 'hostname', 'predicted_col']:\n", - " instances[0].update({key: event.pop(key)})\n", - " instances[0].update({key: event.get(key)}) \n", - " event['request'] = {'instances': instances}\n", - " event['resp'] = [int(event.pop('predicted_col'))]\n", - " event['when'] = datetime.datetime.strftime(datetime.datetime.now(), format=\"%Y-%m-%d %H:%M:%S.%f\")\n", - " event['model'] = 'sklearn.ensemble.RandomForestClassifier'\n", - " return event\n", - " \n", - " \n", - "records = json.loads(predicted_test.to_json(orient='records'))\n", - "records = [{'data': json.dumps(restructure_stream_event(context, record))} for record in records]\n", - "\n", - "# showing first record\n", - "records[0]" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "# Pushing some data to the input stream\n", - "step = 500\n", - "for i in range(0,20000,step):\n", - " response = client.stream.put_records(container=container,\n", - " stream_path=base_input_stream, \n", - " 
records=records[i:i+step])" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
time                              class_shift_helinger  class_shift_kld  class_shift_tvd  prior_helinger  prior_kld  prior_tvd  stream
2021-10-26 14:38:08.027000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:38:08.699000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:38:09.599000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:38:10.759000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:38:11.561000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
...                               ...       ...       ...       ...  ...   ...  ...
2021-10-26 14:39:42.037000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:39:42.191000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:39:42.586000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:39:42.816000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
2021-10-26 14:39:49.180000+00:00  0.001759  0.000025  0.002488  1.0  10.0  1.0  some_stream
\n", - "

99 rows × 7 columns

\n", - "
" - ], - "text/plain": [ - " class_shift_helinger class_shift_kld \\\n", - "time \n", - "2021-10-26 14:38:08.027000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:38:08.699000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:38:09.599000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:38:10.759000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:38:11.561000+00:00 0.001759 0.000025 \n", - "... ... ... \n", - "2021-10-26 14:39:42.037000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:39:42.191000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:39:42.586000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:39:42.816000+00:00 0.001759 0.000025 \n", - "2021-10-26 14:39:49.180000+00:00 0.001759 0.000025 \n", - "\n", - " class_shift_tvd prior_helinger prior_kld \\\n", - "time \n", - "2021-10-26 14:38:08.027000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:38:08.699000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:38:09.599000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:38:10.759000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:38:11.561000+00:00 0.002488 1.0 10.0 \n", - "... ... ... ... \n", - "2021-10-26 14:39:42.037000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:39:42.191000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:39:42.586000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:39:42.816000+00:00 0.002488 1.0 10.0 \n", - "2021-10-26 14:39:49.180000+00:00 0.002488 1.0 10.0 \n", - "\n", - " prior_tvd stream \n", - "time \n", - "2021-10-26 14:38:08.027000+00:00 1.0 some_stream \n", - "2021-10-26 14:38:08.699000+00:00 1.0 some_stream \n", - "2021-10-26 14:38:09.599000+00:00 1.0 some_stream \n", - "2021-10-26 14:38:10.759000+00:00 1.0 some_stream \n", - "2021-10-26 14:38:11.561000+00:00 1.0 some_stream \n", - "... ... ... 
\n", - "2021-10-26 14:39:42.037000+00:00 1.0 some_stream \n", - "2021-10-26 14:39:42.191000+00:00 1.0 some_stream \n", - "2021-10-26 14:39:42.586000+00:00 1.0 some_stream \n", - "2021-10-26 14:39:42.816000+00:00 1.0 some_stream \n", - "2021-10-26 14:39:49.180000+00:00 1.0 some_stream \n", - "\n", - "[99 rows x 7 columns]" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Reading from TSDB\n", - "import v3io_frames as v3f\n", - "\n", - "v3f_client = v3f.Client(os.environ[\"V3IO_FRAMESD\"],container=container[1:])\n", - "v3f_client.read(backend='tsdb',table=tsdb_path)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "[Back to the top](#Stream-to-Parquet)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/stream_to_parquet/stream_to_parquet.py b/stream_to_parquet/stream_to_parquet.py deleted file mode 100644 index 175c12822..000000000 --- a/stream_to_parquet/stream_to_parquet.py +++ /dev/null @@ -1,96 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -# Generated by nuclio.export.NuclioExporter - -import os -import pandas as pd -import numpy as np -import json -import datetime -import mlrun - - -def record_to_features(record): - features = record["request"]["instances"][0] - timestamp = record["when"] - prediction = record["resp"] - - record = {"timestamp": timestamp, **features, "predictions": prediction} - - return record - - -def init_context(context): - setattr(context, "batch", []) - setattr(context, "window", int(os.getenv("window", 10))) - setattr(context, "save_to", os.getenv("save_to", "/bigdata/inference_pq/")) - os.makedirs(context.save_to, exist_ok=True) - - mlrun.mlconf.dbpath = mlrun.mlconf.dbpath or "http://mlrun-api:8080" - artifact_path = os.getenv("artifact_path", None) - if artifact_path: - mlrun.mlconf.artifact_path = artifact_path - if "hub_url" in os.environ: - mlrun.mlconf.hub_url = os.environ["hub_url"] - virtual_drift_fn = mlrun.import_function("hub://virtual_drift") - virtual_drift_fn.apply(mlrun.auto_mount()) - setattr(context, "virtual_drift_fn", virtual_drift_fn) - - predictions_col = os.getenv("predictions", None) - label_col = os.getenv("label_col", None) - setattr(context, "base_dataset", os.getenv("base_dataset", "")) - setattr(context, "indexes", json.loads(os.environ.get("indexes", "[]"))) - setattr(context, "predictions_col", predictions_col) - setattr(context, "label_col", label_col) - setattr( - context, "results_tsdb_container", os.getenv("results_tsdb_container", None) - ) - setattr(context, "results_tsdb_table", os.getenv("results_tsdb_table", None)) - - -def handler(context, event): - - context.logger.info(f"Adding {event.body}") - context.batch.append(record_to_features(json.loads(event.body))) - - if len(context.batch) > context.window: - context.logger.info(context.batch[:1]) - context.logger.info(context.indexes) - df = pd.DataFrame(context.batch) - context.logger.info(f"df example: {df.head(1)}") - if context.indexes: - df = df.set_index(context.indexes) - 
df_path = os.path.join( - context.save_to, - f"{datetime.datetime.now().strftime('%Y-%m-%dT%H:%M:%S')}.pq", - ) - df.to_parquet(df_path,index=False) - - task = mlrun.NewTask( - name="drift_magnitude", - handler="drift_magnitude", - params={ - "label_col": context.label_col, - "prediction_col": context.predictions_col, - "results_tsdb_container": context.results_tsdb_container, - "results_tsdb_table": context.results_tsdb_table, - }, - inputs={"t": context.base_dataset, "u": df_path}, - artifact_path=mlrun.mlconf.artifact_path, - ) - - context.virtual_drift_fn.run(task, watch=False) - - context.batch = [] diff --git a/tf1_serving/function.yaml b/tf1_serving/function.yaml deleted file mode 100644 index e9f9ef904..000000000 --- a/tf1_serving/function.yaml +++ /dev/null @@ -1,48 +0,0 @@ -kind: remote -metadata: - name: tf1-serving - tag: '' - hash: 20cdeb2119a67fc51e55474ac84d386c7b658db3 - project: '' - labels: - author: yaronh - categories: - - model-serving - - machine-learning -spec: - command: '' - args: [] - image: mlrun/mlrun - description: tf1 image classification server - min_replicas: 1 - max_replicas: 4 - env: - - name: MODEL_CLASS - value: TFModel - - name: ENABLE_EXPLAINER - value: false - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - name: tf1-serving - labels: {} - annotations: - nuclio.io/generated_by: function generated from /home/kali/functions/tf1_serving/tf1_serving.py - spec: - runtime: python:3.9 - handler: tf1_serving:handler - env: [] - volumes: [] - build: - commands: [] - noBaseImagesPull: true - functionSourceCode: 
IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHdhcm5pbmdzCgp3YXJuaW5ncy5zaW1wbGVmaWx0ZXIoYWN0aW9uPSJpZ25vcmUiLCBjYXRlZ29yeT1GdXR1cmVXYXJuaW5nKQoKaW1wb3J0IGpzb24KaW1wb3J0IG51bXB5IGFzIG5wCmltcG9ydCByZXF1ZXN0cwpmcm9tIHRlbnNvcmZsb3cgaW1wb3J0IGtlcmFzCmZyb20ga2VyYXMubW9kZWxzIGltcG9ydCBsb2FkX21vZGVsCmZyb20ga2VyYXMucHJlcHJvY2Vzc2luZyBpbXBvcnQgaW1hZ2UKZnJvbSBrZXJhcy5wcmVwcm9jZXNzaW5nLmltYWdlIGltcG9ydCBsb2FkX2ltZwpmcm9tIG9zIGltcG9ydCBlbnZpcm9uLCBwYXRoCmZyb20gUElMIGltcG9ydCBJbWFnZQpmcm9tIGlvIGltcG9ydCBCeXRlc0lPCmZyb20gdXJsbGliLnJlcXVlc3QgaW1wb3J0IHVybG9wZW4KaW1wb3J0IG1scnVuCgoKY2xhc3MgVEZNb2RlbChtbHJ1bi5ydW50aW1lcy5NTE1vZGVsU2VydmVyKToKICAgIGRlZiBfX2luaXRfXyhzZWxmLCBuYW1lOiBzdHIsIG1vZGVsX2Rpcjogc3RyKToKICAgICAgICBzdXBlcigpLl9faW5pdF9fKG5hbWUsIG1vZGVsX2RpcikKCiAgICAgICAgc2VsZi5JTUFHRV9XSURUSCA9IGludChlbnZpcm9uLmdldCgiSU1BR0VfV0lEVEgiLCAiMTI4IikpCiAgICAgICAgc2VsZi5JTUFHRV9IRUlHSFQgPSBpbnQoZW52aXJvbi5nZXQoIklNQUdFX0hFSUdIVCIsICIxMjgiKSkKICAgICAgICBzZWxmLmNsYXNzZXMgPSBOb25lCiAgICAgICAgdHJ5OgogICAgICAgICAgICB3aXRoIG9wZW4oZW52aXJvblsiY2xhc3Nlc19tYXAiXSwgInIiKSBhcyBmOgogICAgICAgICAgICAgICAgc2VsZi5jbGFzc2VzID0ganNvbi5sb2FkKGYpCiAgICAgICAgZXhjZXB0OgogICAgICAgICAgICBwYXNzCgogICAgZGVmIGxvYWQoc2VsZik6CiAgICAgICAgbW9kZWxfZmlsZSwgZXh0cmFfZGF0YSA9IHNlbGYuZ2V0X21vZGVsKCIuaDUiKQogICAgICAgIHNlbGYubW9kZWwgPSBsb2FkX21vZGVsKG9wZW4obW9kZWxfZmlsZSwgInJiIikpCgogICAgZGVmIHByZXByb2Nlc3Moc2VsZiwgYm9keSk6CiAgICAgICAgdHJ5OgogICAgICAgICAgICBvdXRwdXQgPSB7Imluc3RhbmNlcyI6IFtdfQogICAgICAgICAgICBpbnN0YW5jZXMgPSBib2R5LmdldCgiaW5zdGFuY2VzIiwgW10pCiAgICAgICAgICAgIGZvciBieXRlX2ltYWdlIGluIGluc3RhbmNlczoKICAgICAgICAgICAgICAgIGltZyA9IEltYWdlLm9wZW4oYnl0ZV9pbWFnZSkKICAgICAgICAgICAgICAgIGltZyA9IGltZy5yZXNpemUoKHNlbGYuSU1BR0VfV0lEVEgsIHNlbGYuSU1BR0VfSEVJR0hUKSkKCiAgICAgICAgICAgICAgICB4ID0gaW1hZ2UuaW1nX3RvX2FycmF5KGltZykKICAgICAgICAgICAgICAgIHggPSBucC5leHBhbmRfZGltcyh4LCBheGlzPTApCiAgICAgICAgICAgICAgICBvdXRwdXRbImluc3RhbmNlcyJdLmFwcGVuZCh4KQoKICAgICAgICAgICAgb3V0cHV0WyJpbnN0YW5jZXMiXSA9IFtucC52c3Rh
Y2sob3V0cHV0WyJpbnN0YW5jZXMiXSldCiAgICAgICAgICAgIHJldHVybiBvdXRwdXQKICAgICAgICBleGNlcHQ6CiAgICAgICAgICAgIHJhaXNlIEV4Y2VwdGlvbihmInJlY2VpdmVkOiB7Ym9keX0iKQoKICAgIGRlZiBwcmVkaWN0KHNlbGYsIGRhdGEpOgogICAgICAgIGltYWdlcyA9IGRhdGEuZ2V0KCJpbnN0YW5jZXMiLCBbXSkKCiAgICAgICAgcHJlZGljdGVkX3Byb2JhYmlsaXR5ID0gc2VsZi5tb2RlbC5wcmVkaWN0KGltYWdlcykKCiAgICAgICAgcmV0dXJuIHByZWRpY3RlZF9wcm9iYWJpbGl0eQoKICAgIGRlZiBwb3N0cHJvY2VzcyhzZWxmLCBwcmVkaWN0ZWRfcHJvYmFiaWxpdHkpOgogICAgICAgIGlmIHNlbGYuY2xhc3NlczoKICAgICAgICAgICAgcHJlZGljdGVkX2NsYXNzZXMgPSBucC5hcm91bmQocHJlZGljdGVkX3Byb2JhYmlsaXR5LCAxKS50b2xpc3QoKVswXQogICAgICAgICAgICBwcmVkaWN0ZWRfcHJvYmFiaWxpdGllcyA9IHByZWRpY3RlZF9wcm9iYWJpbGl0eS50b2xpc3QoKVswXQogICAgICAgICAgICByZXR1cm4gewogICAgICAgICAgICAgICAgInByZWRpY3Rpb24iOiBbCiAgICAgICAgICAgICAgICAgICAgc2VsZi5jbGFzc2VzW3N0cihpbnQoY2xzKSldIGZvciBjbHMgaW4gcHJlZGljdGVkX2NsYXNzZXMKICAgICAgICAgICAgICAgIF0sCiAgICAgICAgICAgICAgICBmJ3tzZWxmLmNsYXNzZXNbIjEiXX0tcHJvYmFiaWxpdHknOiBwcmVkaWN0ZWRfcHJvYmFiaWxpdGllcywKICAgICAgICAgICAgfQogICAgICAgIGVsc2U6CiAgICAgICAgICAgIHJldHVybiBwcmVkaWN0ZWRfcHJvYmFiaWxpdHkudG9saXN0KClbMF0KCmZyb20gbWxydW4ucnVudGltZXMgaW1wb3J0IG51Y2xpb19pbml0X2hvb2sKZGVmIGluaXRfY29udGV4dChjb250ZXh0KToKICAgIG51Y2xpb19pbml0X2hvb2soY29udGV4dCwgZ2xvYmFscygpLCAnc2VydmluZycpCgpkZWYgaGFuZGxlcihjb250ZXh0LCBldmVudCk6CiAgICByZXR1cm4gY29udGV4dC5tbHJ1bl9oYW5kbGVyKGNvbnRleHQsIGV2ZW50KQo= - source: '' - function_kind: serving - build: - commands: [] - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/tf1_serving/tf1_serving.py - default_handler: handler - affinity: null -verbose: false diff --git a/tf1_serving/item.yaml b/tf1_serving/item.yaml deleted file mode 100644 index 6a5648ab0..000000000 --- a/tf1_serving/item.yaml +++ /dev/null @@ -1,28 +0,0 @@ -apiVersion: v1 -categories: -- model-serving -- machine-learning -description: tf1 image classification server -doc: '' -example: tf1_serving.ipynb -generationDate: 2022-08-28:17-25 
-hidden: false -icon: '' -labels: - author: yaronh -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: tf1-serving -platformVersion: 3.5.0 -spec: - env: - ENABLE_EXPLAINER: false - MODEL_CLASS: TFModel - filename: tf1_serving.py - handler: handler - image: mlrun/mlrun - kind: nuclio:serving - requirements: [] -url: '' -version: 1.1.0 diff --git a/tf1_serving/requirements.txt b/tf1_serving/requirements.txt deleted file mode 100644 index 8d3d19557..000000000 --- a/tf1_serving/requirements.txt +++ /dev/null @@ -1,2 +0,0 @@ -pillow -tensorflow \ No newline at end of file diff --git a/tf1_serving/tf1_serving.ipynb b/tf1_serving/tf1_serving.ipynb deleted file mode 100644 index 1d42ee606..000000000 --- a/tf1_serving/tf1_serving.ipynb +++ /dev/null @@ -1,567 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Image Classification Model - Serving Function\n", - "\n", - "This notebook demonstrates how to deploy a Tensorflow model using MLRun & Nuclio.\n", - "\n", - "**In this notebook you will:**\n", - "* Write a Tensorflow-Model class to load and predict on the incoming data\n", - "* Deploy the model as a serverless function\n", - "* Invoke the serving endpoint with data as:\n", - " * URLs to images hosted on S3\n", - " * Direct image send\n", - " \n", - "**Steps:** \n", - "* [Define Nuclio function](#Define-Nuclio-function) \n", - " * [Install dependencies and set config](#Install-dependencies-and-set-config) \n", - " * [Model serving class](#Model-Serving-Class) \n", - "* [Deploy the serving function to the cluster](#Deploy-the-serving-function-to-the-cluster) \n", - "* [Define test parameters](#Define-test-parameters)\n", - "* [Test the deployed function on the cluster](#Test-the-deployed-function-on-the-cluster)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Define Nuclio Function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To use the magic commands for 
deploying this jupyter notebook as a nuclio function we must first import nuclio \n", - "Since we do not want to import nuclio in the actual function, the comment annotation `nuclio: ignore` is used. This marks the cell for nuclio, telling it to ignore the cell's values when building the function." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Install dependencies and set config\n", - "> Note: Since tensorflow 1.14 is being pulled from the baseimage it is not directly installed as a build command.\n", - "If it is not installed on your system please uninstall and install using the line: `pip install tensorflow==1.14 keras`" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'nuclio:serving'\n", - "%nuclio: setting 'MODEL_CLASS' environment variable\n", - "%nuclio: setting spec.build.baseImage to 'mlrun/mlrun'\n" - ] - } - ], - "source": [ - "%nuclio config kind=\"nuclio:serving\"\n", - "%nuclio env MODEL_CLASS=TFModel\n", - "\n", - "# tensorflow version 1 requires a different version of python than \n", - "# the default (3.7), so we override the default tag here:\n", - "\n", - "%nuclio config spec.build.baseImage = \"mlrun/mlrun\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since we are using packages which are not surely installed on our baseimage, or want to verify that a specific version of the package will be installed we use the `%nuclio cmd` annotation. \n", - ">`%nuclio cmd` works both locally and during deployment by default, but can be set with `-c` flag to only run the commands while deploying or `-l` to set the variable for the local environment only." 
- ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "%%nuclio cmd -c\n", - "pip install tensorflow==1.14 keras==2.3.1 'h5py<3.0.0'\n", - "pip install requests pillow" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Function Code" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "import warnings\n", - "warnings.simplefilter(action=\"ignore\", category=FutureWarning)" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "import json\n", - "import numpy as np\n", - "import requests\n", - "from tensorflow import keras\n", - "from keras.models import load_model\n", - "from keras.preprocessing import image\n", - "from keras.preprocessing.image import load_img\n", - "from os import environ, path\n", - "from PIL import Image\n", - "from io import BytesIO\n", - "from urllib.request import urlopen\n", - "import mlrun" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Model Serving Class" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We define the `TFModel` class which we will use to define data handling and prediction of our model. \n", - "\n", - "The class should consist of:\n", - "* `__init__(name, model_dir)` - Setup the internal parameters\n", - "* `load(self)` - How to load the model and broadcast it's ready for prediction\n", - "* `preprocess(self, body)` - How to handle the incoming event, forming the request to an `{'instances': []}` dictionary as requested by the protocol\n", - "* `predict(self, data)` - Receives and `{'instances': []}` and returns the model's prediction as a list\n", - "* `postprocess(self, data)` - Does any additional processing needed on the predictions." 
- ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "class TFModel(mlrun.runtimes.MLModelServer):\n", - " def __init__(self, name: str, model_dir: str):\n", - " super().__init__(name, model_dir)\n", - "\n", - " self.IMAGE_WIDTH = int(environ.get('IMAGE_WIDTH', '128'))\n", - " self.IMAGE_HEIGHT = int(environ.get('IMAGE_HEIGHT', '128'))\n", - " self.classes = None\n", - " try:\n", - " with open(environ['classes_map'], 'r') as f:\n", - " self.classes = json.load(f)\n", - " except:\n", - " pass\n", - " \n", - " def load(self):\n", - " model_file, extra_data = self.get_model('.h5')\n", - " self.model = load_model(open(model_file, 'rb'))\n", - " \n", - " def preprocess(self, body):\n", - " try:\n", - " output = {'instances': []}\n", - " instances = body.get('instances', [])\n", - " for byte_image in instances:\n", - " img = Image.open(byte_image)\n", - " img = img.resize((self.IMAGE_WIDTH, self.IMAGE_HEIGHT))\n", - "\n", - " # Load image\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " output['instances'].append(x)\n", - " \n", - " # Format instances list\n", - " output['instances'] = [np.vstack(output['instances'])]\n", - " return output\n", - " except:\n", - " raise Exception(f'received: {body}')\n", - " \n", - "\n", - " def predict(self, data):\n", - " images = data.get('instances', [])\n", - "\n", - " # Predict\n", - " predicted_probability = self.model.predict(images)\n", - "\n", - " # return prediction\n", - " return predicted_probability\n", - " \n", - " def postprocess(self, predicted_probability):\n", - " if self.classes:\n", - " predicted_classes = np.around(predicted_probability, 1).tolist()[0]\n", - " predicted_probabilities = predicted_probability.tolist()[0]\n", - " return {\n", - " 'prediction': [self.classes[str(int(cls))] for cls in predicted_classes], \n", - " f'{self.classes[\"1\"]}-probability': predicted_probabilities\n", - " }\n", - " else:\n", - " return 
predicted_probability.tolist()[0]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To let our nuclio builder know that our function code ends at this point we will use the comment annotation `nuclio: end-code`. \n", - "\n", - "Any new cell from now on will be treated as if a `nuclio: ignore` comment was set, and will not be added to the funcion." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test the function locally" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Make sure your local TF / Keras version is the same as pulled in the nuclio image for accurate testing\n", - "\n", - "Set the served models and their file paths using: `SERVING_MODEL_ = `\n", - "\n", - "> Note: this notebook assumes the model and categories are under /User/mlrun/examples/" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "from PIL import Image\n", - "from io import BytesIO\n", - "import matplotlib.pyplot as plt\n", - "import os, requests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Define test parameters" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Test image:\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# Testing event\n", - "cat_image_url = 'https://s3.amazonaws.com/iguazio-sample-data/images/catanddog/cat.102.jpg'\n", - "response = requests.get(cat_image_url)\n", - "cat_image = response.content\n", - "img = Image.open(BytesIO(cat_image))\n", - "\n", - "print('Test image:')\n", - "plt.imshow(img)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Define Function specifications" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import mlconf\n", - "import os\n", - "\n", - "# Model Server variables\n", - "model_class = 'TFModel'\n", - "model_name = 'cat_vs_dog_tfv1' # Define for later use in tests\n", - "models = {model_name: os.path.join(mlconf.artifact_path, 'tf1/cats_n_dogs.h5')}\n", - "\n", - "# Specific model variables\n", - "function_envs = {\n", - " 'IMAGE_HEIGHT': 128,\n", - " 'IMAGE_WIDTH': 128,\n", - " 'classes_map': os.path.join(mlconf.artifact_path, 'categories_map.json')\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Deploy the serving function to the cluster" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import new_model_server, mount_v3io" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-05-04 21:22:18,924 function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Setup the model server function\n", - "fn = new_model_server('tf1-serving', \n", - " model_class=model_class,\n", - " models=models)\n", - "fn.set_envs(function_envs)\n", - 
"fn.spec.description = \"tf1 image classification server\"\n", - "fn.metadata.categories = ['serving', 'dl']\n", - "fn.metadata.labels = {'author': 'yaronh'}\n", - "fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "if \"V3IO_HOME\" in list(os.environ):\n", - " from mlrun import mount_v3io\n", - " fn.apply(mount_v3io())\n", - "else:\n", - " # is you set up mlrun using the instructions at\n", - " # https://github.com/mlrun/mlrun/blob/master/hack/local/README.md\n", - " from mlrun.platforms import mount_pvc\n", - " fn.apply(mount_pvc('nfsvol', 'nfsvol', '/home/joyan/data'))" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-04-30 20:52:15,886 deploy started\n", - "[nuclio] 2020-04-30 20:53:46,385 (info) Build complete\n", - "[nuclio] 2020-04-30 20:53:56,566 (info) Function deploy complete\n", - "[nuclio] 2020-04-30 20:53:56,573 done updating tensorflow-v1-2layers, function address: 3.135.130.246:30961\n" - ] - } - ], - "source": [ - "# Deploy the model server\n", - "addr = fn.deploy(project='cat-and-dog-servers')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test the deployed function on the cluster" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Test the deployed function (with URL)" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Sending event: {\"data_url\": \"https://s3.amazonaws.com/iguazio-sample-data/images/catanddog/cat.102.jpg\"}\n" - ] - }, - { - "data": { - "text/plain": [ - "b'[0.0]'" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# URL event\n", - "event_body = json.dumps({\"data_url\": cat_image_url})\n", 
- "print(f'Sending event: {event_body}')\n", - "\n", - "headers = {'Content-type': 'application/json'}\n", - "response = requests.post(url=addr + f'/{model_name}/predict', data=event_body, headers=headers)\n", - "response.content" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Test the deployed function (with Jpeg Image)" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Sending image from https://s3.amazonaws.com/iguazio-sample-data/images/catanddog/cat.102.jpg\n" - ] - }, - { - "data": { - "text/plain": [ - "b'[0.0]'" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# URL event\n", - "event_body = cat_image\n", - "print(f'Sending image from {cat_image_url}')\n", - "plt.imshow(img)\n", - "\n", - "headers = {'Content-type': 'image/jpeg'}\n", - "response = requests.post(url=addr + f'/{model_name}/predict/', data=event_body, headers=headers)\n", - "response.content" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/tf1_serving/tf1_serving.py b/tf1_serving/tf1_serving.py deleted file mode 100644 index d9816c684..000000000 --- a/tf1_serving/tf1_serving.py +++ /dev/null @@ -1,87 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -# Generated by nuclio.export.NuclioExporter - -import warnings - -warnings.simplefilter(action="ignore", category=FutureWarning) - -import json -import numpy as np -import requests -from tensorflow import keras -from keras.models import load_model -from keras.preprocessing import image -from keras.preprocessing.image import load_img -from os import environ, path -from PIL import Image -from io import BytesIO -from urllib.request import urlopen -import mlrun - - -class TFModel(mlrun.runtimes.MLModelServer): - def __init__(self, name: str, model_dir: str): - super().__init__(name, model_dir) - - self.IMAGE_WIDTH = int(environ.get("IMAGE_WIDTH", "128")) - self.IMAGE_HEIGHT = int(environ.get("IMAGE_HEIGHT", "128")) - self.classes = None - try: - with open(environ["classes_map"], "r") as f: - self.classes = json.load(f) - except: - pass - - def load(self): - model_file, extra_data = self.get_model(".h5") - self.model = load_model(open(model_file, "rb")) - - def preprocess(self, body): - try: - output = {"instances": []} - instances = body.get("instances", []) - for byte_image in instances: - img = Image.open(byte_image) - img = img.resize((self.IMAGE_WIDTH, self.IMAGE_HEIGHT)) - - x = image.img_to_array(img) - x = np.expand_dims(x, axis=0) - output["instances"].append(x) - - output["instances"] = [np.vstack(output["instances"])] - return output - except: - raise Exception(f"received: {body}") - - def predict(self, data): - images = data.get("instances", []) - - predicted_probability = self.model.predict(images) - - return predicted_probability - - def postprocess(self, predicted_probability): - if self.classes: - predicted_classes = np.around(predicted_probability, 1).tolist()[0] - predicted_probabilities = predicted_probability.tolist()[0] - return { - "prediction": [ - self.classes[str(int(cls))] for cls in predicted_classes - ], - f'{self.classes["1"]}-probability': predicted_probabilities, - } - else: - return predicted_probability.tolist()[0] diff --git 
a/tf2_serving_v2/function.yaml b/tf2_serving_v2/function.yaml deleted file mode 100644 index fbe986dce..000000000 --- a/tf2_serving_v2/function.yaml +++ /dev/null @@ -1,45 +0,0 @@ -kind: serving -metadata: - name: tf2-serving-v2 - tag: '' - hash: 8748deb1d9804f9b436c913322c84d5b46c82bd9 - project: '' - labels: - author: yaronh - categories: - - model-serving - - machine-learning -spec: - command: '' - args: [] - image: mlrun/mlrun - description: tf2 image classification server v2 - min_replicas: 1 - max_replicas: 4 - env: [] - base_spec: - apiVersion: nuclio.io/v1 - kind: Function - metadata: - name: tf2-serving-v2 - labels: {} - annotations: - nuclio.io/generated_by: function generated from /home/kali/functions/tf2_serving_v2/tf2_serving_v2.py - spec: - runtime: python:3.9 - handler: tf2_serving_v2:handler - env: [] - volumes: [] - build: - commands: [] - noBaseImagesPull: true - functionSourceCode: IyBDb3B5cmlnaHQgMjAxOSBJZ3VhemlvCiMKIyBMaWNlbnNlZCB1bmRlciB0aGUgQXBhY2hlIExpY2Vuc2UsIFZlcnNpb24gMi4wICh0aGUgIkxpY2Vuc2UiKTsKIyB5b3UgbWF5IG5vdCB1c2UgdGhpcyBmaWxlIGV4Y2VwdCBpbiBjb21wbGlhbmNlIHdpdGggdGhlIExpY2Vuc2UuCiMgWW91IG1heSBvYnRhaW4gYSBjb3B5IG9mIHRoZSBMaWNlbnNlIGF0CiMKIyAgICAgaHR0cDovL3d3dy5hcGFjaGUub3JnL2xpY2Vuc2VzL0xJQ0VOU0UtMi4wCiMKIyBVbmxlc3MgcmVxdWlyZWQgYnkgYXBwbGljYWJsZSBsYXcgb3IgYWdyZWVkIHRvIGluIHdyaXRpbmcsIHNvZnR3YXJlCiMgZGlzdHJpYnV0ZWQgdW5kZXIgdGhlIExpY2Vuc2UgaXMgZGlzdHJpYnV0ZWQgb24gYW4gIkFTIElTIiBCQVNJUywKIyBXSVRIT1VUIFdBUlJBTlRJRVMgT1IgQ09ORElUSU9OUyBPRiBBTlkgS0lORCwgZWl0aGVyIGV4cHJlc3Mgb3IgaW1wbGllZC4KIyBTZWUgdGhlIExpY2Vuc2UgZm9yIHRoZSBzcGVjaWZpYyBsYW5ndWFnZSBnb3Zlcm5pbmcgcGVybWlzc2lvbnMgYW5kCiMgbGltaXRhdGlvbnMgdW5kZXIgdGhlIExpY2Vuc2UuCiMKIyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKaW1wb3J0IHdhcm5pbmdzCgp3YXJuaW5ncy5zaW1wbGVmaWx0ZXIoYWN0aW9uPSJpZ25vcmUiLCBjYXRlZ29yeT1GdXR1cmVXYXJuaW5nKQoKaW1wb3J0IGpzb24KaW1wb3J0IG51bXB5IGFzIG5wCmltcG9ydCByZXF1ZXN0cwpmcm9tIHRlbnNvcmZsb3cgaW1wb3J0IGtlcmFzCmZyb20gdGVuc29yZmxvdy5rZXJhcy5tb2RlbHMgaW1wb3J0IGxvYW
RfbW9kZWwKZnJvbSB0ZW5zb3JmbG93LmtlcmFzLnByZXByb2Nlc3NpbmcgaW1wb3J0IGltYWdlCmZyb20gdGVuc29yZmxvdy5rZXJhcy5wcmVwcm9jZXNzaW5nLmltYWdlIGltcG9ydCBsb2FkX2ltZwpmcm9tIG9zIGltcG9ydCBlbnZpcm9uLCBwYXRoCmZyb20gUElMIGltcG9ydCBJbWFnZQpmcm9tIGlvIGltcG9ydCBCeXRlc0lPCmZyb20gdXJsbGliLnJlcXVlc3QgaW1wb3J0IHVybG9wZW4KaW1wb3J0IG1scnVuCgoKY2xhc3MgVEZNb2RlbChtbHJ1bi5zZXJ2aW5nLlYyTW9kZWxTZXJ2ZXIpOgogICAgZGVmIGxvYWQoc2VsZik6CiAgICAgICAgc2VsZi5JTUFHRV9XSURUSCA9IGludChlbnZpcm9uLmdldCgiSU1BR0VfV0lEVEgiLCAiMTI4IikpCiAgICAgICAgc2VsZi5JTUFHRV9IRUlHSFQgPSBpbnQoZW52aXJvbi5nZXQoIklNQUdFX0hFSUdIVCIsICIxMjgiKSkKCiAgICAgICAgdHJ5OgogICAgICAgICAgICB3aXRoIG9wZW4oZW52aXJvblsiY2xhc3Nlc19tYXAiXSwgInIiKSBhcyBmOgogICAgICAgICAgICAgICAgc2VsZi5jbGFzc2VzID0ganNvbi5sb2FkKGYpCiAgICAgICAgZXhjZXB0OgogICAgICAgICAgICBzZWxmLmNsYXNzZXMgPSBOb25lCgogICAgICAgIG1vZGVsX2ZpbGUsIGV4dHJhX2RhdGEgPSBzZWxmLmdldF9tb2RlbCgiLmg1IikKICAgICAgICBzZWxmLm1vZGVsID0gbG9hZF9tb2RlbChtb2RlbF9maWxlKQoKICAgIGRlZiBwcmVwcm9jZXNzKHNlbGYsIGJvZHksIG9wZXJhdGlvbik6CiAgICAgICAgdHJ5OgogICAgICAgICAgICBvdXRwdXQgPSB7ImlucHV0cyI6IFtdfQogICAgICAgICAgICBpbnB1dHMgPSBib2R5LmdldCgiaW5wdXRzIiwgW10pCiAgICAgICAgICAgIGZvciBieXRlX2ltYWdlIGluIGlucHV0czoKICAgICAgICAgICAgICAgIGltZyA9IEltYWdlLm9wZW4oYnl0ZV9pbWFnZSkKICAgICAgICAgICAgICAgIGltZyA9IGltZy5yZXNpemUoKHNlbGYuSU1BR0VfV0lEVEgsIHNlbGYuSU1BR0VfSEVJR0hUKSkKCiAgICAgICAgICAgICAgICB4ID0gaW1hZ2UuaW1nX3RvX2FycmF5KGltZykKICAgICAgICAgICAgICAgIHggPSBucC5leHBhbmRfZGltcyh4LCBheGlzPTApCiAgICAgICAgICAgICAgICBvdXRwdXRbImlucHV0cyJdLmFwcGVuZCh4KQoKICAgICAgICAgICAgb3V0cHV0WyJpbnB1dHMiXSA9IFtucC52c3RhY2sob3V0cHV0WyJpbnB1dHMiXSldCiAgICAgICAgICAgIHJldHVybiBvdXRwdXQKICAgICAgICBleGNlcHQ6CiAgICAgICAgICAgIHJhaXNlIEV4Y2VwdGlvbihmInJlY2VpdmVkOiB7Ym9keX0iKQoKICAgIGRlZiBwcmVkaWN0KHNlbGYsIGRhdGEpOgogICAgICAgIGltYWdlcyA9IGRhdGEuZ2V0KCJpbnB1dHMiLCBbXSkKCiAgICAgICAgcHJlZGljdGVkX3Byb2JhYmlsaXR5ID0gc2VsZi5tb2RlbC5wcmVkaWN0KGltYWdlcykKCiAgICAgICAgcmV0dXJuIHByZWRpY3RlZF9wcm9iYWJpbGl0eS50b2xpc3QoKVswXQpmcm9tIG1scnVuLnJ1bnRpbWVzIGltcG9ydCBudWNsaW9faW5pdF9ob2
9rCmRlZiBpbml0X2NvbnRleHQoY29udGV4dCk6CiAgICBudWNsaW9faW5pdF9ob29rKGNvbnRleHQsIGdsb2JhbHMoKSwgJ3NlcnZpbmdfdjInKQoKZGVmIGhhbmRsZXIoY29udGV4dCwgZXZlbnQpOgogICAgcmV0dXJuIGNvbnRleHQubWxydW5faGFuZGxlcihjb250ZXh0LCBldmVudCkK - source: '' - function_kind: serving_v2 - build: - commands: - - python -m pip install requests pillow tensorflow>=2.1 - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/tf2_serving_v2/tf2_serving_v2.py - secret_sources: [] - affinity: null -verbose: false diff --git a/tf2_serving_v2/item.yaml b/tf2_serving_v2/item.yaml deleted file mode 100644 index 72d48b2f0..000000000 --- a/tf2_serving_v2/item.yaml +++ /dev/null @@ -1,28 +0,0 @@ -apiVersion: v1 -categories: -- model-serving -- machine-learning -description: tf2 image classification server v2 -doc: '' -example: tf2_serving_v2.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: yaronh -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: tf2-serving-v2 -platformVersion: 3.5.0 -spec: - filename: tf2_serving_v2.py - handler: handler - image: mlrun/mlrun - kind: serving - requirements: - - requests - - pillow - - tensorflow>=2.1 -url: '' -version: 1.2.0 diff --git a/tf2_serving_v2/requirements.txt b/tf2_serving_v2/requirements.txt deleted file mode 100644 index 8d3d19557..000000000 --- a/tf2_serving_v2/requirements.txt +++ /dev/null @@ -1,2 +0,0 @@ -pillow -tensorflow \ No newline at end of file diff --git a/tf2_serving_v2/tf2_serving_v2.ipynb b/tf2_serving_v2/tf2_serving_v2.ipynb deleted file mode 100644 index 6a15b11a4..000000000 --- a/tf2_serving_v2/tf2_serving_v2.ipynb +++ /dev/null @@ -1,545 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Image Classification Model - Serving Function\n", - "\n", - "This notebook demonstrates how to deploy a Tensorflow model using MLRun & Nuclio.\n", - "\n", - "**In this notebook you will:**\n", - "* Write 
a Tensorflow-Model class to load and predict on the incoming data\n", - "* Deploy the model as a serverless function\n", - "* Invoke the serving endpoint with data as:\n", - " * URLs to images hosted on S3\n", - " * Direct image send\n", - " \n", - "**Steps:** \n", - "* [Define Nuclio function](#Define-Nuclio-function) \n", - " * [Install dependencies and set config](#Install-dependencies-and-set-config) \n", - " * [Model serving class](#Model-Serving-Class) \n", - "* [Deploy the serving function to the cluster](#Deploy-the-serving-function-to-the-cluster) \n", - "* [Define test parameters](#Define-test-parameters)\n", - "* [Test the deployed function on the cluster](#Test-the-deployed-function-on-the-cluster)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Define Nuclio Function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To use the magic commands for deploying this jupyter notebook as a nuclio function we must first import nuclio \n", - "Since we do not want to import nuclio in the actual function, the comment annotation `nuclio: ignore` is used. This marks the cell for nuclio, telling it to ignore the cell's values when building the function." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The history saving thread hit an unexpected error (DatabaseError('database disk image is malformed')).History will not be written to the database.\n" - ] - } - ], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Install dependencies and set config\n", - "> Note: Since tensorflow is being pulled from the baseimage it is not directly installed as a build command.\n", - "If it is not installed on your system please uninstall and install using the line: `pip install tensorflow`" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'serving'\n", - "%nuclio: setting spec.build.baseImage to 'mlrun/mlrun'\n" - ] - } - ], - "source": [ - "%nuclio config kind=\"serving\"\n", - "\n", - "# tensorflow 2 use the default serving image (or the mlrun/ml-models for a faster build)\n", - "\n", - "%nuclio config spec.build.baseImage = \"mlrun/mlrun\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since we are using packages which are not surely installed on our baseimage, or want to verify that a specific version of the package will be installed we use the `%nuclio cmd` annotation. \n", - ">`%nuclio cmd` works both locally and during deployment by default, but can be set with `-c` flag to only run the commands while deploying or `-l` to set the variable for the local environment only." 
- ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "%%nuclio cmd -c\n", - "pip install tensorflow>=2.1\n", - "pip install requests pillow" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Function Code" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "import warnings\n", - "warnings.simplefilter(action=\"ignore\", category=FutureWarning)" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-01-29 23:47:50,165 [warning] Failed resolving version info. Ignoring and using defaults\n", - "> 2021-01-29 23:47:51,342 [warning] Unable to parse server or client version. Assuming compatible: {'server_version': '0.6.0-rc9', 'client_version': 'unstable'}\n" - ] - } - ], - "source": [ - "import json\n", - "import numpy as np\n", - "import requests\n", - "from tensorflow import keras\n", - "from tensorflow.keras.models import load_model\n", - "from tensorflow.keras.preprocessing import image\n", - "from tensorflow.keras.preprocessing.image import load_img\n", - "from os import environ, path\n", - "from PIL import Image\n", - "from io import BytesIO\n", - "from urllib.request import urlopen\n", - "import mlrun" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Model Serving Class" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We define the `TFModel` class which we will use to define data handling and prediction of our model. 
\n", - "\n", - "The class should consist of:\n", - "* `__init__(name, model_dir)` - Setup the internal parameters\n", - "* `load(self)` - How to load the model and broadcast it's ready for prediction\n", - "* `preprocess(self, body)` - How to handle the incoming event, forming the request to an `{'instances': []}` dictionary as requested by the protocol\n", - "* `predict(self, data)` - Receives and `{'instances': []}` and returns the model's prediction as a list\n", - "* `postprocess(self, data)` - Does any additional processing needed on the predictions." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "class TFModel(mlrun.serving.V2ModelServer):\n", - "\n", - " def load(self):\n", - " self.IMAGE_WIDTH = int(environ.get('IMAGE_WIDTH', '128'))\n", - " self.IMAGE_HEIGHT = int(environ.get('IMAGE_HEIGHT', '128'))\n", - " \n", - " try:\n", - " with open(environ['classes_map'], 'r') as f:\n", - " self.classes = json.load(f)\n", - " except:\n", - " self.classes = None\n", - " \n", - " model_file, extra_data = self.get_model('.h5')\n", - " self.model = load_model(model_file)\n", - " \n", - " def preprocess(self, body, operation):\n", - " try:\n", - " output = {'inputs': []}\n", - " inputs = body.get('inputs', [])\n", - " for byte_image in inputs:\n", - " img = Image.open(byte_image)\n", - " img = img.resize((self.IMAGE_WIDTH, self.IMAGE_HEIGHT))\n", - "\n", - " # Load image\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " output['inputs'].append(x)\n", - " \n", - " # Format inputs list\n", - " output['inputs'] = [np.vstack(output['inputs'])]\n", - " return output\n", - " except:\n", - " raise Exception(f'received: {body}')\n", - " \n", - "\n", - " def predict(self, data):\n", - " images = data.get('inputs', [])\n", - "\n", - " # Predict\n", - " predicted_probability = self.model.predict(images)\n", - "\n", - " # return prediction\n", - " return 
predicted_probability.tolist()[0]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To let our nuclio builder know that our function code ends at this point we will use the comment annotation `nuclio: end-code`. \n", - "\n", - "Any new cell from now on will be treated as if a `nuclio: ignore` comment was set, and will not be added to the funcion." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test the function locally" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Make sure your local TF / Keras version is the same as pulled in the nuclio image for accurate testing\n", - "\n", - "Set the served models and their file paths using: `SERVING_MODEL_ = `\n", - "\n", - "> Note: this notebook assumes the model and categories are under /User/mlrun/examples/" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "from PIL import Image\n", - "from io import BytesIO\n", - "import matplotlib.pyplot as plt\n", - "import os" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Define test parameters" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Test image:\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# Testing event\n", - "cat_image_url = 'https://s3.amazonaws.com/iguazio-sample-data/images/catanddog/cat.102.jpg'\n", - "response = requests.get(cat_image_url)\n", - "cat_image = response.content\n", - "img = Image.open(BytesIO(cat_image))\n", - "\n", - "print('Test image:')\n", - "plt.imshow(img)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Define Function specifications" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "from mlrun import mlconf\n", - "\n", - "# Specific model variables\n", - "function_envs = {\n", - " 'IMAGE_HEIGHT': 224,\n", - " 'IMAGE_WIDTH': 224,\n", - " 'classes_map': '/Userv3io/projects/cat-and-dog-servers/artifacts/categories_map.json',\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Deploy the serving function to the cluster" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import code_to_function, mount_v3io" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-01-29 23:47:54,881 [info] function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Setup the model server function\n", - "\n", - "fn = code_to_function('tf2-serving-v2', kind=\"serving\")\n", - "fn.spec.description = \"tf2 image classification server v2\"\n", - "fn.metadata.categories = ['serving', 'dl']\n", - "fn.metadata.labels = {'author': 'yaronh'}\n", - "fn.export(\"function.yaml\")\n", - "fn.set_envs(function_envs)\n", - "fn.add_model(key=\"model\",\n", - " 
model_path=\"/User/mlrun_repos/demos/image-classification-with-distributed-training/pipe/52f2145e-7a54-4137-8c7b-b6c20cc8b1fd/tfmodels/model.h5\",\n", - " class_name=\"TFModel\")" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "if \"V3IO_HOME\" in list(os.environ):\n", - " from mlrun import mount_v3io\n", - " fn.apply(mount_v3io())\n", - "else:\n", - " # is you set up mlrun using the instructions at\n", - " # https://github.com/mlrun/mlrun/blob/master/hack/local/README.md\n", - " from mlrun.platforms import mount_pvc\n", - " fn.apply(mount_pvc('nfsvol', 'nfsvol', '/home/joyan/data'))" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-01-29 23:47:54,893 [info] Starting remote function deploy\n", - "2021-01-29 23:47:55 (info) Deploying function\n", - "2021-01-29 23:47:55 (info) Building\n", - "2021-01-29 23:47:55 (info) Staging files and preparing base images\n", - "2021-01-29 23:47:56 (info) Building processor image\n", - "2021-01-29 23:47:57 (info) Build complete\n", - "2021-01-29 23:48:07 (info) Function deploy complete\n", - "> 2021-01-29 23:48:08,029 [info] function deployed, address=default-tenant.app.us-sales30-demo.iguazio-cd2.com:31946\n" - ] - } - ], - "source": [ - "# Deploy the model server\n", - "addr = fn.deploy(project='cat-and-dog-servers')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test the deployed function on the cluster" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Test the deployed function (with URL)" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "payload = json.dumps({\"data_url\" : cat_image_url})" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - 
"{'id': '38224902-a688-4985-9424-578ff9ccb4a5',\n", - " 'model_name': 'model',\n", - " 'outputs': [0.0]}" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.invoke(path='/v2/models/model/predict', body=payload)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Test the deployed function (with Jpeg Image)" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'id': '246c00fc-225c-44ec-b221-4e6c99f7bc5d',\n", - " 'model_name': 'model',\n", - " 'outputs': [0.0]}" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn.invoke(path='/v2/models/model/predict',\n", - " body=cat_image,\n", - " headers={'Content-type': 'image/jpeg'})" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/tf2_serving_v2/tf2_serving_v2.py b/tf2_serving_v2/tf2_serving_v2.py deleted file mode 100644 index 677280a86..000000000 --- a/tf2_serving_v2/tf2_serving_v2.py +++ /dev/null @@ -1,71 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -# -# Generated by nuclio.export.NuclioExporter - -import warnings - -warnings.simplefilter(action="ignore", category=FutureWarning) - -import json -import numpy as np -import requests -from tensorflow import keras -from tensorflow.keras.models import load_model -from tensorflow.keras.preprocessing import image -from tensorflow.keras.preprocessing.image import load_img -from os import environ, path -from PIL import Image -from io import BytesIO -from urllib.request import urlopen -import mlrun - - -class TFModel(mlrun.serving.V2ModelServer): - def load(self): - self.IMAGE_WIDTH = int(environ.get("IMAGE_WIDTH", "128")) - self.IMAGE_HEIGHT = int(environ.get("IMAGE_HEIGHT", "128")) - - try: - with open(environ["classes_map"], "r") as f: - self.classes = json.load(f) - except: - self.classes = None - - model_file, extra_data = self.get_model(".h5") - self.model = load_model(model_file) - - def preprocess(self, body, operation): - try: - output = {"inputs": []} - inputs = body.get("inputs", []) - for byte_image in inputs: - img = Image.open(byte_image) - img = img.resize((self.IMAGE_WIDTH, self.IMAGE_HEIGHT)) - - x = image.img_to_array(img) - x = np.expand_dims(x, axis=0) - output["inputs"].append(x) - - output["inputs"] = [np.vstack(output["inputs"])] - return output - except: - raise Exception(f"received: {body}") - - def predict(self, data): - images = data.get("inputs", []) - - predicted_probability = self.model.predict(images) - - return predicted_probability.tolist()[0] \ No newline at end of file diff --git a/virtual_drift/README.md b/virtual_drift/README.md deleted file mode 100644 index cd7383904..000000000 --- a/virtual_drift/README.md +++ /dev/null @@ -1,56 +0,0 @@ -# Drift Magnitude - -Concept drift and shift are major issues that greatly affect the accuracy and reliability of many real-world applications of machine learning. 
We can use the following Drift Magnitude metrics to understand how close the properties of the data used to train the models are to those of the data we currently receive. - - ## How to integrate - - The Virtual Drift function is built to receive two batches of data (as `DataItem` or `DataFrame`): base batch *t* and current batch *u*. - - ```markdown - :param context: MLRun context - :param t: Base dataset for the drift metrics - :param u: Test dataset for the drift metrics - :param label_col: Label column in t and u - :param prediction_col: Predictions column in t and u - :param discretizers: Dictionary of discretizers for the features if available - (Created automatically if not provided) - :param n_bins: Number of bins to be used for histogram creation from continuous variables - :param stream_name: Output stream to push metrics to - :param results_tsdb_container: TSDB table container to push metrics to - :param results_tsdb_table: TSDB table to push metrics to - ``` - - The function calculates the selected drift magnitude metrics and applies them to the **features**, **labels** and **predictions**. It then saves those metrics and exports them via Parquet and TSDB. Alerting could be added on top of the metrics via Grafana or a function. - - ## Metrics - - The drift magnitude metrics we calculate are: - - ### TVD - Total Variation Distance - - Provides a symmetric drift distance between two periods *t* and *u* - Z - vector of random variables - P*t* - Probability distribution over timespan *t* - - ![\sigma_{t, u}(Z)=\frac{1}{2}\sum_{\hat{z}\in{dom(Z)}}{|P_t(\hat{z})-P_u(\hat{z})|}]() - - ### Hellinger Distance - - Hellinger distance is an *f*-divergence measure, similar to the Kullback-Leibler (KL) divergence. However, unlike KL Divergence, the Hellinger distance is symmetric and bounded over a probability space. - - P, Q - Discrete probability distributions, P = (p*1*, ..., p*k*) and Q = (q*1*, ..., q*k*). 
- ![H(P,Q)=\frac{1}{\sqrt{2}}\sqrt{\sum_{i=1}^{k}{(\sqrt{p_i}-\sqrt{q_i})^2}}]() - - - ### KL Divergence - - KL Divergence (or relative entropy) is a measure of how one probability distribution differs from another. It is an asymmetric measure (and therefore not a metric), and it does not satisfy the triangle inequality. A KL Divergence of 0 indicates two identical distributions. - - ![D_{KL}(P||Q)=\sum_{x\in{X}}{(P(x)\log{\frac{P(x)}{Q(x)}})}]() - - ## Additional Resources - - Webb, Geoffrey I. et al. “[Characterizing Concept Drift.](https://arxiv.org/abs/1511.03816)” Data Mining and Knowledge Discovery 30.4 (2016): 964–994. Crossref. Web. - - [MLOps Live #4 - How to Detect & Remediate Drift in Production with MLOps Automation](https://www.youtube.com/watch?v=66_Q7mJZOSc&t=1296s) diff --git a/virtual_drift/function.yaml b/virtual_drift/function.yaml deleted file mode 100644 index 55dcec11c..000000000 --- a/virtual_drift/function.yaml +++ /dev/null @@ -1,129 +0,0 @@ -kind: job -metadata: - name: virtual-drift - tag: '' - hash: 8990fdd72fc550189a0c8b488b69997428b786c9 - project: '' - labels: - author: orz - categories: - - data-analysis - - machine-learning -spec: - command: '' - args: [] - image: mlrun/ml-models - env: [] - default_handler: drift_magnitude - entry_points: - to_observations: - name: to_observations - doc: '' - parameters: - - name: context - default: '' - - name: t - default: '' - - name: u - default: '' - - name: key - default: '' - outputs: - - default: '' - lineno: 16 - tvd: - name: tvd - doc: '' - parameters: - - name: t - default: '' - - name: u - default: '' - outputs: - - default: '' - lineno: 42 - helinger: - name: helinger - doc: '' - parameters: - - name: t - default: '' - - name: u - default: '' - outputs: - - default: '' - lineno: 46 - kl_divergence: - name: kl_divergence - doc: '' - parameters: - - name: t - default: '' - - name: u - default: '' - outputs: - - default: '' - lineno: 50 - all_metrics: - name: all_metrics - doc: '' - 
parameters: - - 
name: t - default: '' - - name: u - default: '' - outputs: - - default: '' - lineno: 56 - drift_magnitude: - name: drift_magnitude - doc: "Drift magnitude metrics\n Computes drift magnitude metrics between base\ - \ dataset t and dataset u.\n Metrics:\n - TVD (Total Variation Distance)\n\ - \ - Helinger\n - KL Divergence" - parameters: - - name: context - doc: MLRun context - default: '' - - name: t - type: DataFrame - doc: Base dataset for the drift metrics - default: '' - - name: u - type: DataFrame - doc: Test dataset for the drift metrics - default: '' - - name: label_col - doc: Label column in t and u - default: null - - name: prediction_col - doc: Predictions column in t and u - default: null - - name: discretizers - type: dict - default: null - - name: n_bins - doc: Number of bins to be used for histogram creation from continuous variables - default: 5 - - name: stream_name - type: str - doc: Output stream to push metrics to - default: some_stream - - name: results_tsdb_container - type: str - doc: TSDB table container to push metrics to - default: bigdata - - name: results_tsdb_table - type: str - doc: TSDB table to push metrics to - default: concept_drift/drift_magnitude - outputs: - - default: '' - lineno: 60 - description: Compute drift magnitude between Time-Samples T and U - build: - functionSourceCode: - commands: - - python -m pip install scikit-learn scipy v3io_frames - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/virtual_drift/virtual_drift.py - affinity: null -verbose: false diff --git a/virtual_drift/item.yaml b/virtual_drift/item.yaml deleted file mode 100644 index d66f9e9c1..000000000 --- a/virtual_drift/item.yaml +++ /dev/null @@ -1,28 +0,0 @@ -apiVersion: v1 -categories: -- data-analysis -- machine-learning -description: Compute drift magnitude between Time-Samples T and U -doc: '' -example: virtual_drift.ipynb -generationDate: 2022-08-28:17-25 -hidden: false 
-icon: 
'' -labels: - author: orz -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: virtual-drift -platformVersion: 3.5.0 -spec: - filename: virtual_drift.py - handler: drift_magnitude - image: mlrun/ml-models - kind: job - requirements: - - scikit-learn - - scipy - - v3io_frames -url: '' -version: 1.1.0 diff --git a/virtual_drift/virtual_drift.ipynb b/virtual_drift/virtual_drift.ipynb deleted file mode 100644 index 23b9ef432..000000000 --- a/virtual_drift/virtual_drift.ipynb +++ /dev/null @@ -1,935 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Virtual Drift\n", - "\n", - "Drift magnitude metrics\n", - " Computes drift magnitude metrics between base dataset t and dataset u. \n", - "\n", - "Metrics:\n", - "- TVD (Total Variation Distance)\n", - "- Helinger\n", - "- KL Divergence" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Steps**\n", - "\n", - "1. [Data exploration](#Data-exploration)\n", - "2. [Importing the function](#Importing-the-function)\n", - "3. [Running the function locally](#Running-the-function-locally)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Data exploration**" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - ".. 
_wine_dataset:\n", - "\n", - "Wine recognition dataset\n", - "------------------------\n", - "\n", - "**Data Set Characteristics:**\n", - "\n", - " :Number of Instances: 178 (50 in each of three classes)\n", - " :Number of Attributes: 13 numeric, predictive attributes and the class\n", - " :Attribute Information:\n", - " \t\t- Alcohol\n", - " \t\t- Malic acid\n", - " \t\t- Ash\n", - "\t\t- Alcalinity of ash \n", - " \t\t- Magnesium\n", - "\t\t- Total phenols\n", - " \t\t- Flavanoids\n", - " \t\t- Nonflavanoid phenols\n", - " \t\t- Proanthocyanins\n", - "\t\t- Color intensity\n", - " \t\t- Hue\n", - " \t\t- OD280/OD315 of diluted wines\n", - " \t\t- Proline\n", - "\n", - " - class:\n", - " - class_0\n", - " - class_1\n", - " - class_2\n", - "\t\t\n", - " :Summary Statistics:\n", - " \n", - " ============================= ==== ===== ======= =====\n", - " Min Max Mean SD\n", - " ============================= ==== ===== ======= =====\n", - " Alcohol: 11.0 14.8 13.0 0.8\n", - " Malic Acid: 0.74 5.80 2.34 1.12\n", - " Ash: 1.36 3.23 2.36 0.27\n", - " Alcalinity of Ash: 10.6 30.0 19.5 3.3\n", - " Magnesium: 70.0 162.0 99.7 14.3\n", - " Total Phenols: 0.98 3.88 2.29 0.63\n", - " Flavanoids: 0.34 5.08 2.03 1.00\n", - " Nonflavanoid Phenols: 0.13 0.66 0.36 0.12\n", - " Proanthocyanins: 0.41 3.58 1.59 0.57\n", - " Colour Intensity: 1.3 13.0 5.1 2.3\n", - " Hue: 0.48 1.71 0.96 0.23\n", - " OD280/OD315 of diluted wines: 1.27 4.00 2.61 0.71\n", - " Proline: 278 1680 746 315\n", - " ============================= ==== ===== ======= =====\n", - "\n", - " :Missing Attribute Values: None\n", - " :Class Distribution: class_0 (59), class_1 (71), class_2 (48)\n", - " :Creator: R.A. 
Fisher\n", - " :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)\n", - " :Date: July, 1988\n", - "\n", - "This is a copy of UCI ML Wine recognition datasets.\n", - "https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data\n", - "\n", - "The data is the results of a chemical analysis of wines grown in the same\n", - "region in Italy by three different cultivators. There are thirteen different\n", - "measurements taken for different constituents found in the three types of\n", - "wine.\n", - "\n", - "Original Owners: \n", - "\n", - "Forina, M. et al, PARVUS - \n", - "An Extendible Package for Data Exploration, Classification and Correlation. \n", - "Institute of Pharmaceutical and Food Analysis and Technologies,\n", - "Via Brigata Salerno, 16147 Genoa, Italy.\n", - "\n", - "Citation:\n", - "\n", - "Lichman, M. (2013). UCI Machine Learning Repository\n", - "[https://archive.ics.uci.edu/ml]. Irvine, CA: University of California,\n", - "School of Information and Computer Science. \n", - "\n", - ".. topic:: References\n", - "\n", - " (1) S. Aeberhard, D. Coomans and O. de Vel, \n", - " Comparison of Classifiers in High Dimensional Settings, \n", - " Tech. Rep. no. 92-02, (1992), Dept. of Computer Science and Dept. of \n", - " Mathematics and Statistics, James Cook University of North Queensland. \n", - " (Also submitted to Technometrics). \n", - "\n", - " The data was used with many others for comparing various \n", - " classifiers. The classes are separable, though only RDA \n", - " has achieved 100% correct classification. \n", - " (RDA : 100%, QDA 99.4%, LDA 98.9%, 1NN 96.1% (z-transformed data)) \n", - " (All results using the leave-one-out technique) \n", - "\n", - " (2) S. Aeberhard, D. Coomans and O. de Vel, \n", - " \"THE CLASSIFICATION PERFORMANCE OF RDA\" \n", - " Tech. Rep. no. 92-01, (1992), Dept. of Computer Science and Dept. of \n", - " Mathematics and Statistics, James Cook University of North Queensland. 
\n", - " (Also submitted to Journal of Chemometrics).\n", - "\n" - ] - } - ], - "source": [ - "# Scikit-learn's wine dataset\n", - "from sklearn.datasets import load_wine\n", - "\n", - "wine = load_wine()\n", - "print(wine[\"DESCR\"])" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "wine_t and wine_u are generated from the wine dataset, where wine_t is the entire dataset while wine_u is a sample (50%) of the entire dataset. \n", - "wine_t shape is 178 and wine_u shape is 89 \n", - "\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " alcohol malic_acid ash alcalinity_of_ash magnesium total_phenols \\\n", - "0 14.23 1.71 2.43 15.6 127.0 2.80 \n", - "1 13.20 1.78 2.14 11.2 100.0 2.65 \n", - "2 13.16 2.36 2.67 18.6 101.0 2.80 \n", - "3 14.37 1.95 2.50 16.8 113.0 3.85 \n", - "4 13.24 2.59 2.87 21.0 118.0 2.80 \n", - "\n", - " flavanoids nonflavanoid_phenols proanthocyanins color_intensity hue \\\n", - "0 3.06 0.28 2.29 5.64 1.04 \n", - "1 2.76 0.26 1.28 4.38 1.05 \n", - "2 3.24 0.30 2.81 5.68 1.03 \n", - "3 3.49 0.24 2.18 7.80 0.86 \n", - "4 2.69 0.39 1.82 4.32 1.04 \n", - "\n", - " od280/od315_of_diluted_wines proline y prediction \n", - "0 3.92 1065.0 0 0 \n", - "1 3.40 1050.0 0 0 \n", - "2 3.17 1185.0 0 0 \n", - "3 3.45 1480.0 0 0 \n", - "4 2.93 735.0 0 0 " - ] - }, - "execution_count": 24, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "wine_t_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/virtual_drift/wine_t.pq'\n", - "wine_u_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/virtual_drift/wine_u.pq'\n", - "wine_t=pd.read_parquet(wine_t_path)\n", - "wine_u=pd.read_parquet(wine_u_path)\n", - "print(f'wine_t and wine_u are generated from the wine dataset, where wine_t is the entire dataset while wine_u is a sample (50%) of the entire dataset. 
\\n\\\n", - "wine_t shape is {wine_t.shape[0]} and wine_u shape is {wine_u.shape[0]} \\n\\n')\n", - "wine_t.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Importing the function**" - ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-26 13:45:22,345 [info] created and saved project function-marketplace\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 25, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import mlrun\n", - "\n", - "# Importing the function\n", - "mlrun.set_environment(project='function-marketplace')\n", - "\n", - "fn = mlrun.import_function(\"hub://virtual_drift\")\n", - "fn.apply(mlrun.auto_mount())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### **Running the function locally**" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [], - "source": [ - "import os \n", - "\n", - "container = os.path.join('/',os.environ['V3IO_HOME'].split('/')[0])\n", - "user = os.environ[\"V3IO_USERNAME\"]\n", - "rel_path = os.getcwd()[6:] + '/artifacts'\n", - "tsdb_path = os.path.join(user,rel_path) + \"/output_tsdb\"" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-26 14:00:41,020 [info] starting run virtual-drift-drift_magnitude uid=28ec7f08ce7c4c528114e2590ff49325 DB=http://mlrun-api:8080\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Warning - Server version '0.8.14' is different from client version '0.9.4'. 
Some operations may not work as expected.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-26 14:00:43,469 [info] Fitting discretizer for alcohol\n", - "> 2021-10-26 14:00:43,471 [info] Fitting discretizer for malic_acid\n", - "> 2021-10-26 14:00:43,471 [info] Fitting discretizer for ash\n", - "> 2021-10-26 14:00:43,472 [info] Fitting discretizer for alcalinity_of_ash\n", - "> 2021-10-26 14:00:43,473 [info] Fitting discretizer for magnesium\n", - "> 2021-10-26 14:00:43,474 [info] Fitting discretizer for total_phenols\n", - "> 2021-10-26 14:00:43,475 [info] Fitting discretizer for flavanoids\n", - "> 2021-10-26 14:00:43,476 [info] Fitting discretizer for nonflavanoid_phenols\n", - "> 2021-10-26 14:00:43,477 [info] Fitting discretizer for proanthocyanins\n", - "> 2021-10-26 14:00:43,477 [info] Fitting discretizer for color_intensity\n", - "> 2021-10-26 14:00:43,478 [info] Fitting discretizer for hue\n", - "> 2021-10-26 14:00:43,479 [info] Fitting discretizer for od280/od315_of_diluted_wines\n", - "> 2021-10-26 14:00:43,480 [info] Fitting discretizer for proline\n", - "> 2021-10-26 14:00:43,531 [info] Discretizing featuers\n", - "> 2021-10-26 14:00:43,752 [info] Compute prior metrics\n", - "> 2021-10-26 14:00:43,889 [info] Compute class metrics\n", - "> 2021-10-26 14:00:44,000 [info] value: inf\n", - "> 2021-10-26 14:00:44,009 [info] Timestamp: 2021-10-26 14:00:44.008992\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "divide by zero encountered in log\n", - "casting datetime64[ns] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
project: function-marketplace | iter: 0 | start: Oct 26 14:00:41 | state: completed | name: virtual-drift-drift_magnitude
labels: v3io_user=dani, kind=, owner=dani, host=jupyter-dani-6bfbd76d96-zxx6f
inputs: t, u
parameters: label_col=y, results_tsdb_container=users, results_tsdb_table=dani/test/functions/virtual_drift/artifacts/output_tsdb
results: prior_tvd=0.5, prior_helinger=0.541, prior_kld=10, class_shift_tvd=0.017, class_shift_helinger=0.014, class_shift_kld=0.002
artifacts: discritizers, t_discrete, u_discrete, features_t_pdf, features_u_pdf, class_t_pdf, class_u_pdf
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/html": [ - " > to track results use the .show() or .logs() methods or
click here to open in UI" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "> 2021-10-26 14:00:44,153 [info] run executed, status=completed\n" - ] - } - ], - "source": [ - "virtual_drift_run=fn.run(params={'label_col': 'y',\n", - " 'results_tsdb_container': container[1:],\n", - " 'results_tsdb_table': tsdb_path},\n", - " inputs={'t': wine_t_path,\n", - " 'u': wine_u_path},\n", - " artifact_path=os.getcwd(),\n", - " local=True)" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " u\n", - "0 0.348315\n", - "1 0.382022\n", - "2 0.269663" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " t\n", - "0 0.331461\n", - "1 0.398876\n", - "2 0.269663" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "virtual_drift_run.artifact('class_u_pdf').show()\n", - "virtual_drift_run.artifact('class_t_pdf').show()" - ] - }, - { - "cell_type": "code", - "execution_count": 69, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Warning - Server version '0.8.14' is different from client version '0.9.4'. Some operations may not work as expected.\n" - ] - }, - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " class_shift_helinger class_shift_kld \\\n", - "time \n", - "2021-10-26 13:58:04.445000+00:00 0.01398 0.001564 \n", - "2021-10-26 14:00:44.008000+00:00 0.01398 0.001564 \n", - "\n", - " class_shift_tvd prior_helinger prior_kld \\\n", - "time \n", - "2021-10-26 13:58:04.445000+00:00 0.016854 0.541196 10.0 \n", - "2021-10-26 14:00:44.008000+00:00 0.016854 0.541196 10.0 \n", - "\n", - " prior_tvd stream \n", - "time \n", - "2021-10-26 13:58:04.445000+00:00 0.5 some_stream \n", - "2021-10-26 14:00:44.008000+00:00 0.5 some_stream " - ] - }, - "execution_count": 69, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import v3io_frames as v3f\n", - "client = v3f.Client(os.environ[\"V3IO_FRAMESD\"],container=container[1:])\n", - "client.read(backend='tsdb',table=tsdb_path)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "[Back to the top](#Virtual-Drift)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/virtual_drift/virtual_drift.py b/virtual_drift/virtual_drift.py deleted file mode 100644 index 71dcf7129..000000000 --- a/virtual_drift/virtual_drift.py +++ /dev/null @@ -1,206 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -# Generated by nuclio.export.NuclioExporter - -import os -import pandas as pd -import numpy as np -import scipy as sp -import pickle -import datetime - -import v3io_frames as v3f - -import matplotlib.pyplot as plt -from sklearn.preprocessing import KBinsDiscretizer - - -def to_observations(context, t, u, key): - t = ( - t.apply(lambda row: f"{'_'.join([str(row[val]) for val in t.columns])}", axis=1) - .value_counts() - .sort_index() - ) - u = ( - u.apply(lambda row: f"{'_'.join([str(row[val]) for val in u.columns])}", axis=1) - .value_counts() - .sort_index() - ) - - joined_uniques = pd.DataFrame([t, u]).T.fillna(0).sort_index() - joined_uniques.columns = ["t", "u"] - - t_obs = joined_uniques.loc[:, "t"] - u_obs = joined_uniques.loc[:, "u"] - - t_pdf = t_obs / t_obs.sum() - u_pdf = u_obs / u_obs.sum() - - context.log_dataset(f"{key}_t_pdf", pd.DataFrame(t_pdf), format="parquet") - context.log_dataset(f"{key}_u_pdf", pd.DataFrame(u_pdf), format="parquet") - return t_pdf, u_pdf - - -def tvd(t, u): - return sum(abs(t - u)) / 2 - - -def helinger(t, u): - return (np.sqrt(np.sum(np.power(np.sqrt(t) - np.sqrt(u), 2)))) / np.sqrt(2) - - -def kl_divergence(t, u): - t_u = np.sum(np.where(t != 0, t * np.log(t / u), 0)) - u_t = np.sum(np.where(u != 0, u * np.log(u / t), 0)) - return t_u + u_t - - -def all_metrics(t, u): - return tvd(t, u), helinger(t, u), kl_divergence(t, u) - - -def drift_magnitude( - context, - t: pd.DataFrame, - u: pd.DataFrame, - label_col=None, - prediction_col=None, - discretizers: dict = None, - n_bins=5, - stream_name: str = "some_stream", 
- results_tsdb_container: str = "bigdata", - results_tsdb_table: str = "concept_drift/drift_magnitude", -): - """Drift magnitude metrics - Computes drift magnitude metrics between base dataset t and dataset u. - Metrics: - - TVD (Total Variation Distance) - - Helinger - - KL Divergence - - :param context: MLRun context - :param t: Base dataset for the drift metrics - :param u: Test dataset for the drift metrics - :param label_col: Label column in t and u - :param prediction_col: Predictions column in t and u - :param discretizers: Dictionary of discretizers for the features if available - (Created automatically if not provided) - :param n_bins: Number of bins to be used for histogram creation from continuous variables - :param stream_name: Output stream to push metrics to - :param results_tsdb_container: TSDB table container to push metrics to - :param results_tsdb_table: TSDB table to push metrics to - """ - - v3io_client = v3f.Client("framesd:8081", container=results_tsdb_container) - try: - v3io_client.create("tsdb", results_tsdb_table, if_exists=1, rate="1/s") - except: - v3io_client.create( - "tsdb", results_tsdb_table, if_exists=1, attrs={"rate": "1/s"} - ) - - df_t = t.as_df() - df_u = u.as_df() - - drop_columns = [] - if label_col is not None: - drop_columns.append(label_col) - if prediction_col is not None: - drop_columns.append(prediction_col) - - continuous_features = df_t.select_dtypes(["float"]) - if discretizers is None: - discretizers = {} - for feature in continuous_features.columns: - context.logger.info(f"Fitting discretizer for {feature}") - discretizer = KBinsDiscretizer( - n_bins=n_bins, encode="ordinal", strategy="uniform" - ) - - discretizer.fit(continuous_features.loc[:, feature].values.reshape(-1, 1)) - discretizers[feature] = discretizer - os.makedirs(context.artifact_path, exist_ok=True) - discretizers_path = os.path.abspath(f"{context.artifact_path}/discritizer.pkl") - with open(discretizers_path, "wb") as f: - pickle.dump(discretizers, 
f) - context.log_artifact("discritizers", target_path=discretizers_path) - context.logger.info("Discretizing featuers") - for feature, discretizer in discretizers.items(): - df_t[feature] = discretizer.transform( - df_t.loc[:, feature].values.reshape(-1, 1) - ) - df_u[feature] = discretizer.transform( - df_u.loc[:, feature].values.reshape(-1, 1) - ) - df_t[feature] = df_t[feature].astype("int") - df_u[feature] = df_u[feature].astype("int") - context.log_dataset("t_discrete", df_t, format="parquet") - context.log_dataset("u_discrete", df_u, format="parquet") - - context.logger.info("Compute prior metrics") - - results = {} - t_prior, u_prior = to_observations( - context, - df_t.drop(drop_columns, axis=1), - df_u.drop(drop_columns, axis=1), - "features", - ) - results["prior_tvd"], results["prior_helinger"], results["prior_kld"] = all_metrics( - t_prior, u_prior - ) - - if prediction_col is not None: - context.logger.info("Compute prediction metrics") - t_predictions = pd.DataFrame(df_t.loc[:, prediction_col]) - u_predictions = pd.DataFrame(df_u.loc[:, prediction_col]) - t_class, u_class = to_observations( - context, t_predictions, u_predictions, "prediction" - ) - ( - results["prediction_shift_tvd"], - results["prediction_shift_helinger"], - results["prediction_shift_kld"], - ) = all_metrics(t_class, u_class) - - if label_col is not None: - context.logger.info("Compute class metrics") - t_labels = pd.DataFrame(df_t.loc[:, label_col]) - u_labels = pd.DataFrame(df_u.loc[:, label_col]) - t_class, u_class = to_observations(context, t_labels, u_labels, "class") - ( - results["class_shift_tvd"], - results["class_shift_helinger"], - results["class_shift_kld"], - ) = all_metrics(t_class, u_class) - - for key, value in results.items(): - if value == float("inf"): - context.logger.info(f"value: {value}") - results[key] = 10 - for key, result in results.items(): - context.log_result(key, round(result, 3)) - - now = pd.to_datetime(str(datetime.datetime.now())) - now - - 
results["timestamp"] = pd.to_datetime(str((datetime.datetime.now()))) - context.logger.info(f"Timestamp: {results['timestamp']}") - results["stream"] = stream_name - results_df = pd.DataFrame( - data=[list(results.values())], columns=list(results.keys()) - ) - results_df = results_df.set_index(["timestamp", "stream"]) - v3io_client.write("tsdb", results_tsdb_table, dfs=results_df) diff --git a/xgb_custom/function.yaml b/xgb_custom/function.yaml deleted file mode 100644 index 7c264c392..000000000 --- a/xgb_custom/function.yaml +++ /dev/null @@ -1,241 +0,0 @@ -kind: job -metadata: - name: xgb-custom - tag: '' - hash: 5a052481ac303bde0afeccef9d2c5257abc4b00e - project: '' - labels: - author: Daniel - categories: - - model-training - - machine-learning - - data-preparation -spec: - command: '' - args: [] - image: mlrun/mlrun - env: [] - default_handler: gen_outliers - entry_points: - gen_outliers: - name: gen_outliers - doc: simulate data with outliers - parameters: - - name: context - type: MLClientCtx - doc: the function's execution context - default: '' - - name: nrows - doc: (4096) number of data points - default: 4096 - - name: feats - doc: (16) number of features - default: 16 - - name: outs - doc: (64) number of outliers - default: 64 - - name: omax - doc: (10_100) max value of outliers - default: 10000 - - name: labels_col - doc: (labels) name of labels column - default: labels - - name: header - doc: () header for dataset, will default to `feat_` - default: [] - - name: label_type - doc: (int32) data type for the label column - default: int32 - - name: key - doc: key of datset in artifact store - default: xgb-outs - - name: local_path - doc: path in artifact store where data will be serialized - default: xgb_custom - outputs: - - default: '' - lineno: 22 - gradient: - name: gradient - doc: gradient of squared log error - parameters: - - name: predt - type: ndarray - default: '' - - name: dtrain - type: DMatrix - default: '' - outputs: - - default: '' - lineno: 
59 - hessian: - name: hessian - doc: hessian of squared log error - parameters: - - name: predt - type: ndarray - default: '' - - name: dtrain - type: DMatrix - default: '' - outputs: - - default: '' - lineno: 65 - squared_log: - name: squared_log - doc: 'squared log error objective - - - simplified version for RMSLE used as objective function' - parameters: - - name: predt - type: ndarray - default: '' - - name: dtrain - type: DMatrix - default: '' - outputs: - - default: '' - lineno: 72 - rmsle: - name: rmsle - doc: Root mean squared log error metric. - parameters: - - name: predt - type: ndarray - default: '' - - name: dtrain - type: DMatrix - default: '' - outputs: - - default: '' - lineno: 83 - learning_curves: - name: learning_curves - doc: 'plot xgb learning curves - - - this will also log a model''s learning curves' - parameters: - - name: context - type: MLClientCtx - default: '' - - name: results - type: dict - default: '' - - name: figsz - type: Tuple[int, int] - default: - - 10 - - 10 - - name: plots_dest - type: str - default: plots - outputs: - - default: '' - lineno: 92 - fit: - name: fit - doc: "low level xgboost train api\n\nfor the xgboost `train` params see:\nhttps://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.train\n\ - \nNote: the first parameter of xgboost's `train` method is a dict of parameters\n\ - \ supplied to the booster (engine). To modify one of those simply\n\ - \ add a task parameter (when running you supply an mlrun NewTask) with\ - \ the\n prefix \"XGB_\". So for example, to set the 'tree_method' parameter\ - \ to 'approx',\n add {\"XGB_tree_method\":\"approx\"} to the task params\ - \ key." 
- parameters: - - name: context - type: MLClientCtx - doc: the function context - default: '' - - name: dataset - type: DataItem - doc: the full data set, train, valid and test will be extracted and each converted - to a DMatrix for input to xgboost's `train` - default: '' - - name: num_boost_round - type: int - default: 10 - - name: evals - type: List[Tuple[DMatrix, str]] - default: [] - - name: obj - type: Union[Callable, str] - default: '' - - name: feval - type: Union[Callable, str] - default: null - - name: maximize - type: bool - default: false - - name: early_stopping_rounds - type: int - default: null - - name: evals_result - type: dict - default: {} - - name: verbose_eval - type: bool - default: true - - name: xgb_model - type: DataItem - default: null - - name: callbacks - type: List[Callable] - default: [] - - name: label_column - type: str - doc: ground-truth (y) labels - default: labels - - name: encode_cols - type: dict - doc: dictionary of names and prefixes for columns that are to hot be encoded. - default: {} - - name: sample - type: int - doc: Selects the first n rows, or select a sample starting from the first. - If negative <-1, select a random sample - default: <_ast.USub object at 0x7ff7bf99a7b8> - - name: test_size - type: float - doc: (0.05) test set size - default: 0.25 - - name: valid_size - type: float - doc: (0.75) Once the test set has been removed the training set gets this - proportion. 
- default: 0.75 - - name: random_state - type: int - doc: (1) sklearn rng seed - default: 1994 - - name: models_dest - type: str - doc: destination subfolder for model artifacts - default: models - - name: plots_dest - type: str - doc: destination subfolder for plot artifacts - default: plots - - name: file_ext - type: str - doc: format for test_set_key hold out data - default: csv - - name: test_set_key - type: str - doc: (test-set), key of held out data in artifact store - default: test-set - - name: gpus - type: bool - doc: (False), run on gpus - default: false - outputs: - - default: '' - lineno: 114 - description: simulate data with outliers. - build: - functionSourceCode:  - commands: [] - code_origin: https://github.com/daniels290813/functions.git#55a79c32be5d233cc11efcf40cd3edbe309bfdef:/home/kali/functions/xgb_custom/xgb_custom.py - affinity: null -verbose: false diff --git a/xgb_custom/item.yaml b/xgb_custom/item.yaml deleted file mode 100644 index 3decf0708..000000000 --- a/xgb_custom/item.yaml +++ /dev/null @@ -1,26 +0,0 @@ -apiVersion: v1 -categories: -- model-training -- machine-learning -- data-preparation -description: simulate data with outliers. 
-doc: '' -example: xgb_custom.ipynb -generationDate: 2022-08-28:17-25 -hidden: true -icon: '' -labels: - author: Daniel -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.1.0 -name: xgb_custom -platformVersion: 3.5.0 -spec: - filename: xgb_custom.py - handler: gen_outliers - image: mlrun/mlrun - kind: job - requirements: [] -url: '' -version: 1.1.0 diff --git a/xgb_custom/requirements.txt b/xgb_custom/requirements.txt deleted file mode 100644 index 4441bae23..000000000 --- a/xgb_custom/requirements.txt +++ /dev/null @@ -1,7 +0,0 @@ -pandas -typing -xgboost -matplotlib -scikit-learn -seaborn -scikit-plot \ No newline at end of file diff --git a/xgb_custom/test_xgb_custom.py b/xgb_custom/test_xgb_custom.py deleted file mode 100644 index 81b77a2e0..000000000 --- a/xgb_custom/test_xgb_custom.py +++ /dev/null @@ -1,50 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -from mlrun import import_function -import os - - -ARTIFACT_PATH = "artifacts" -FUNCTION_PATH = "functions" -PLOTS_PATH = "plots" -RUNS_PATH = "runs" -SCHEDULES_PATH = "schedules" - - -def test_local_xgb_custom(): - fn = import_function("function.yaml") - run = fn.run( - params={ - "nrows": 8192, - "label_type": "float", - "local_path": "./artifacts/inputs/xgb_custom", - }, - handler="gen_outliers", - local=True, - ) - - run = fn.run( - params={ - "num_boost_round": 40, - "verbose_eval": False, - "XGB_max_depth": 2, - "XGB_subsample": 0.9, - "test_set_key": "test-set", - }, - inputs={"dataset": run.artifact('xgb-outs').url}, - handler="fit", - local=True, - ) - assert run.artifact('learning-curves').get() diff --git a/xgb_custom/xgb_custom.ipynb b/xgb_custom/xgb_custom.ipynb deleted file mode 100644 index fe3882ab8..000000000 --- a/xgb_custom/xgb_custom.ipynb +++ /dev/null @@ -1,922 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Custom Objective and Evaluation Functions\n", - "\n", - "This demo was adapted from **[xgboost's custom metric tutorial](https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html)**. We demonstrate how to use a custom objective and a custom evaluation function using an xgboost trainer.\n", - "\n", - "This function differs from `xgb_trainer` by exposing the low-level xgboost python api, `train`. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "from os import path\n", - "import numpy as np\n", - "from numpy.random import randint, randn, seed\n", - "import pandas as pd\n", - "from xgboost import DMatrix, train\n", - "import matplotlib.pyplot as plt\n", - "from mlrun.execution import MLClientCtx\n", - "from mlrun.datastore import DataItem\n", - "from mlrun.artifacts import PlotArtifact\n", - "from mlrun.mlutils.data import get_splits, get_sample\n", - "\n", - "from cloudpickle import dumps\n", - "\n", - "from typing import (Tuple, Dict, List, Union, Callable)\n", - "\n", - "seed(seed=1994)\n", - "\n", - "## UNCOMMENT THIS LINE TO TEST CALCULATED VALUES\n", - "DEBUG_ERROR = 0 # this will be added to the custom eval function--set it to some value like 999 " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### generate data with outliers" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "def gen_outliers(context: MLClientCtx, nrows=4096, feats=16, \n", - " outs=64, omax=10_000, labels_col=\"labels\",\n", - " header=[], label_type=\"int32\", key=\"xgb-outs\",\n", - " local_path=\"xgb_custom\"):\n", - " \"\"\"simulate data with outliers\n", - " \n", - " :param context: the function's execution context\n", - " :param nrows: (4096) number of data points\n", - " :param feats: (16) number of features\n", - " :param outs: (64) number of outliers\n", - " :param omax: (10_100) max value of outliers\n", - " :param labels_col: (labels) name of labels column\n", - " :param header: () header for dataset, will default to\n", - " `feat_`\n", - " :param label_type: (int32) data type for the label column\n", - " :param key: key of datset in artifact store\n", - " :param 
local_path: path in artifact store where data will be\n", - " serialized\n", - " \"\"\"\n", - " x = randn(nrows, feats)\n", - " y = randn(nrows)\n", - " y += np.abs(np.min(y))\n", - "\n", - " for i in range(0, outs):\n", - " ind = randint(0, len(y)-1)\n", - " y[ind] += randint(0, omax)\n", - " \n", - " if not header:\n", - " header = [f\"feat_{j}\" for j in range(feats)]\n", - " header.append(labels_col)\n", - "\n", - " data = pd.DataFrame(data=np.concatenate((x,y.reshape(-1,1)),axis=-1),\n", - " columns=header)\n", - " data = data.astype({labels_col: label_type})\n", - " \n", - " context.log_dataset(key, df=data, local_path=local_path)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## custom objective and eval" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "toc-hr-collapsed": true, - "toc-nb-collapsed": true - }, - "source": [ - "The following code was adapted from xgboost's documentation **[Custom Objective and Evaluation Metric](https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html?highlight=tree_method#custom-objective-and-evaluation-metric)**." 
- ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "def gradient(predt: np.ndarray, dtrain: DMatrix) -> np.ndarray:\n", - " \"\"\"gradient of squared log error\"\"\"\n", - " y = dtrain.get_label()\n", - " return (np.log1p(predt) - np.log1p(y)) / (predt + 1)\n", - "\n", - "\n", - "def hessian(predt: np.ndarray, dtrain: DMatrix) -> np.ndarray:\n", - " \"\"\"hessian of squared log error\"\"\"\n", - " y = dtrain.get_label()\n", - " return ((-np.log1p(predt) + np.log1p(y) + 1) /\n", - " np.power(predt + 1, 2))\n", - "\n", - "\n", - "def squared_log(predt: np.ndarray, dtrain: DMatrix) -> Tuple[np.ndarray,\n", - " np.ndarray]:\n", - " \"\"\"squared log error objective\n", - "\n", - " simplified version for RMSLE used as objective function\n", - " \"\"\"\n", - " predt[predt < -1] = -1 + 1e-6\n", - " grad = gradient(predt, dtrain)\n", - " hess = hessian(predt, dtrain)\n", - " return grad, hess\n", - "\n", - "def rmsle(predt: np.ndarray, dtrain: DMatrix) -> Tuple[str, float]:\n", - " \"\"\" Root mean squared log error metric.\n", - " \"\"\"\n", - " y = dtrain.get_label()\n", - " predt[predt < -1] = -1 + 1e-6\n", - " elements = np.power(np.log1p(y) - np.log1p(predt), 2)\n", - " return \"my_rmsle\", float(np.sqrt(np.sum(elements) / len(y))) + DEBUG_ERROR" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## learning curves" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "def learning_curves(\n", - " context: MLClientCtx,\n", - " results: dict,\n", - " figsz: Tuple[int,int]=(10,10),\n", - " plots_dest: str = \"plots\"\n", - ") -> None:\n", - " \"\"\"plot xgb learning curves\n", - " \n", - " this will also log a model's learning curves\n", - " \"\"\"\n", - " plt.clf()\n", - " plt.figure(figsize=figsz)\n", - " plt.plot(results[\"train\"][\"my_rmsle\"], label=\"train-my-rmsle\")\n", - " plt.plot(results[\"valid\"][\"my_rmsle\"], 
label=\"valid-my-rmsle\")\n", - " plt.title(f\"learning curves\")\n", - " plt.legend()\n", - " \n", - " context.log_artifact(\n", - " PlotArtifact(f\"learning-curves\", body=plt.gcf()),\n", - " local_path=f\"{plots_dest}/learning-curves.html\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## fit" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [], - "source": [ - "def fit(\n", - " context: MLClientCtx,\n", - " dataset: DataItem,\n", - " num_boost_round: int = 10,\n", - " evals: List[Tuple[DMatrix, str]] = [],\n", - " obj: Union[Callable, str] = \"\",\n", - " feval: Union[Callable, str] = None,\n", - " maximize: bool = False,\n", - " early_stopping_rounds: int = None,\n", - " evals_result: dict = {},\n", - " verbose_eval: bool = True,\n", - " xgb_model: DataItem = None,\n", - " callbacks: List[Callable] = [],\n", - " label_column: str = \"labels\",\n", - " encode_cols: dict = {},\n", - " sample: int = -1,\n", - " test_size: float = 0.25,\n", - " valid_size: float = 0.75,\n", - " random_state: int = 1994,\n", - " models_dest: str = \"models\",\n", - " plots_dest: str = \"plots\",\n", - " file_ext: str = \"csv\",\n", - " test_set_key: str = \"test-set\",\n", - " gpus: bool = False\n", - ") -> None:\n", - " \"\"\"low level xgboost train api\n", - " \n", - " for the xgboost `train` params see:\n", - " https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.train\n", - "\n", - " Note: the first parameter of xgboost's `train` method is a dict of parameters\n", - " supplied to the booster (engine). To modify one of those simply\n", - " add a task parameter (when running you supply an mlrun NewTask) with the\n", - " prefix \"XGB_\". 
So for example, to set the 'tree_method' parameter to 'approx',\n", - " add {\"XGB_tree_method\":\"approx\"} to the task params key.\n", - " \n", - " :param context: the function context\n", - " :param dataset: the full data set, train, valid and test will be extracted and\n", - " each converted to a DMatrix for input to xgboost's `train`\n", - " :param label_column: ground-truth (y) labels\n", - " :param encode_cols: dictionary of names and prefixes for columns that are\n", - " to hot be encoded.\n", - " :param sample: Selects the first n rows, or select a sample\n", - " starting from the first. If negative <-1, select\n", - " a random sample\n", - " :param test_size: (0.05) test set size\n", - " :param valid_size: (0.75) Once the test set has been removed the\n", - " training set gets this proportion.\n", - " :param random_state: (1) sklearn rng seed\n", - " :param models_dest: destination subfolder for model artifacts\n", - " :param plots_dest: destination subfolder for plot artifacts\n", - " :param file_ext: format for test_set_key hold out data\n", - " :param test_set_key: (test-set), key of held out data in artifact store\n", - " :param gpus: (False), run on gpus\n", - " \"\"\"\n", - " raw, labels, header = get_sample(dataset, sample, label_column)\n", - " \n", - " # hot-encode\n", - " if encode_cols:\n", - " raw = pd.get_dummies(raw, \n", - " columns=list(encode_cols.keys()), \n", - " prefix=list(encode_cols.values()), \n", - " drop_first=True)\n", - " \n", - " # split the sample into train validate, test and calibration sets:\n", - " (xtrain, ytrain), (xvalid, yvalid), (xtest, ytest) = \\\n", - " get_splits(raw, labels, 3, test_size, valid_size, random_state)\n", - " \n", - " # save test data as regular dataframe as it may be used by other process\n", - " context.log_dataset(test_set_key, df=pd.concat([xtest, ytest], axis=1),\n", - " format=file_ext, index=False)\n", - " \n", - " # convert to xgboost DMatrix (todo - dask, gpu)\n", - " dtrain = 
DMatrix(xtrain, label=ytrain)\n", - " dvalid = DMatrix(xvalid, label=yvalid)\n", - " \n", - " boost_params = {\n", - " \"tree_method\": \"gpu_hist\" if gpus else \"hist\", \n", - " \"seed\": random_state,\n", - " \"disable_default_eval_metric\": 1,\n", - " \"objective\": \"reg:squaredlogerror\",\n", - " \"eval_metric\": \"rmsle\"}\n", - "\n", - " # enable user to customize `booster param` parameters\n", - " for k, v in context.parameters.items():\n", - " if k.startswith('XGB_'):\n", - " boost_params[k[4:]] = v\n", - " \n", - " # collect learning curves / training history\n", - " results = dict()\n", - " \n", - " booster = train(\n", - " boost_params,\n", - " dtrain=dtrain,\n", - " num_boost_round=num_boost_round,\n", - " evals=[(dtrain, \"train\"), (dvalid, \"valid\")],\n", - " evals_result=results,\n", - " obj=squared_log,\n", - " feval=rmsle,\n", - " maximize=maximize,\n", - " early_stopping_rounds=early_stopping_rounds,\n", - " verbose_eval=verbose_eval,\n", - " # xgb_model=xgb_model,\n", - " # callbacks: List[Callable] = []\n", - " )\n", - " \n", - " context.log_model(\"model\", \n", - " body=dumps(booster),\n", - " model_file = \"model.pkl\",\n", - " artifact_path='/User/artifacts/tttt')\n", - " \n", - " learning_curves(context, results)" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### run locally" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import NewTask, run_local" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-14 13:30:21,675 starting run gen_outliers uid=29ae7c944a184de881acc81206a92a48 -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-14 13:30:22,141 log artifact xgb-outs at 
/User/artifacts/xgb_custom.csv, size: 2762858, db: Y\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default |  | 0 | Jun 14 13:30:21 | completed | gen_outliers | v3io_user=admin, kind=handler, owner=admin, host=jupyter-7b44c8d958-kklf7 |  | nrows=8192, label_type=float |  | xgb-outs
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 29ae7c944a184de881acc81206a92a48 --project default , !mlrun logs 29ae7c944a184de881acc81206a92a48 --project default\n", - "[mlrun] 2020-06-14 13:30:22,218 run executed, status=completed\n" - ] - } - ], - "source": [ - "gen_outs_tsk = NewTask(name='gen_outliers',\n", - " handler=gen_outliers, \n", - " params={'nrows': 8192, 'label_type': 'float'})\n", - "\n", - "outliers_run = run_local(gen_outs_tsk)" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-14 13:30:23,011 starting run fit model uid=489d2d26007e444cb81ffe7b6bc9d4a2 -> http://mlrun-api:8080\n", - "[mlrun] 2020-06-14 13:30:23,405 log artifact test-set at /User/artifacts/test-set.csv, size: 689366, db: Y\n", - "[mlrun] 2020-06-14 13:30:23,545 log artifact model at /User/artifacts/tttt/, size: 17052, db: Y\n", - "[mlrun] 2020-06-14 13:30:23,712 log artifact learning-curves at /User/artifacts/plots/learning-curves.html, size: 31641, db: Y\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "\n", - "
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts
default |  | 0 | Jun 14 13:30:23 | completed | fit model | v3io_user=admin, kind=handler, owner=admin, host=jupyter-7b44c8d958-kklf7 | dataset | num_boost_round=40, verbose_eval=False, XGB_max_depth=2, XGB_subsample=0.9 |  | test-set, model, learning-curves
\n", - "
\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "to track results use .show() or .logs() or in CLI: \n", - "!mlrun get run 489d2d26007e444cb81ffe7b6bc9d4a2 --project default , !mlrun logs 489d2d26007e444cb81ffe7b6bc9d4a2 --project default\n", - "[mlrun] 2020-06-14 13:30:23,790 run executed, status=completed\n" - ] - }, - { - "data": { - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# THIS IS SETUP SO THAT YOU CAN COMPARE THE RESULTS AND SEE THAT THEY ARE EXACT. \n", - "# UNCOMMENT LINE `DEBUG_ERROR` AT TOP OF THIS NOTEBOOK TO TEST THAT THESE ARE \n", - "# TRULY CALCULATED VALUES\n", - "\n", - "fit_tsk = NewTask(\n", - " name='fit model', \n", - " handler=fit,\n", - " params={\"num_boost_round\" : 40, \n", - " \"verbose_eval\" : False,\n", - " \"XGB_max_depth\" : 2,\n", - " \"XGB_subsample\" : 0.9}\n", - ")\n", - "\n", - "fit_run = run_local(fit_tsk, inputs={\"dataset\":outliers_run.outputs[\"xgb-outs\"]})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# export" - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-14 13:30:50,033 function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from mlrun import code_to_function\n", - "from mlrun.platforms.other import auto_mount\n", - "\n", - "gpus = False\n", - "\n", - "# create job function object from notebook code\n", - "fn_params = {\n", - " \"name\" : \"xgb_custom\",\n", - " \"handler\" : \"fit\",\n", - " \"kind\" : \"job\",\n", - " \"image\" : \"mlrun/ml-models\" if not gpus else \"mlrun/ml-models-gpu\",\n", - " \"description\" : \"train an xgboost model using the low-level api\",\n", - " \"categories\" : [\"analysis\"],\n", - " \"labels\" : {\"author\": \"yjb\"}\n", - "}\n", - "\n", - "xgb_fn = code_to_function(**fn_params)\n", - "\n", - "xgb_fn.export(\"function.yaml\")\n", - "xgb_fn.apply(auto_mount())" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - 
"name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - }, - "toc-autonumbering": false, - "toc-showcode": false, - "toc-showmarkdowntxt": false, - "toc-showtags": false - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/xgb_custom/xgb_custom.py b/xgb_custom/xgb_custom.py deleted file mode 100644 index f80868e7e..000000000 --- a/xgb_custom/xgb_custom.py +++ /dev/null @@ -1,239 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -from os import path -import numpy as np -from numpy.random import randint, randn, seed -import pandas as pd -from xgboost import DMatrix, train -import matplotlib.pyplot as plt -from mlrun.execution import MLClientCtx -from mlrun.datastore import DataItem -from mlrun.artifacts import PlotArtifact -from mlrun.mlutils.data import get_splits, get_sample - -from cloudpickle import dumps - -from typing import (Tuple, Dict, List, Union, Callable) - -seed(seed=1994) - -## UNCOMMENT THIS LINE TO TEST CALCULATED VALUES -DEBUG_ERROR = 0 # this will be added to the custom eval function--set it to some value like 999 - - -def gen_outliers(context: MLClientCtx, nrows=4096, feats=16, - outs=64, omax=10_000, labels_col="labels", - header=[], label_type="int32", key="xgb-outs", - local_path="xgb_custom"): - """simulate data with outliers - - :param context: the function's execution context - :param nrows: (4096) number of data points - :param feats: (16) number of features - :param outs: (64) number of outliers - :param omax: (10_100) max value of outliers - :param labels_col: (labels) name of labels column - :param header: () header for dataset, will default to - `feat_` - :param label_type: (int32) data type for the label column - :param key: key of datset in artifact store - :param local_path: path in artifact store where data will be - serialized - """ - x = randn(nrows, feats) - y = randn(nrows) - y += np.abs(np.min(y)) - - for i in range(0, outs): - ind = randint(0, len(y) - 1) - y[ind] += randint(0, omax) - - if not header: - header = [f"feat_{j}" for j in range(feats)] - header.append(labels_col) - - data = pd.DataFrame(data=np.concatenate((x, y.reshape(-1, 1)), axis=-1), - columns=header) - data = data.astype({labels_col: label_type}) - - context.log_dataset(key, df=data, local_path=local_path) - -def gradient(predt: np.ndarray, dtrain: DMatrix) -> np.ndarray: - """gradient of squared log error""" - y = dtrain.get_label() - return (np.log1p(predt) - np.log1p(y)) / 
(predt + 1) - - -def hessian(predt: np.ndarray, dtrain: DMatrix) -> np.ndarray: - """hessian of squared log error""" - y = dtrain.get_label() - return ((-np.log1p(predt) + np.log1p(y) + 1) / - np.power(predt + 1, 2)) - - -def squared_log(predt: np.ndarray, dtrain: DMatrix) -> Tuple[np.ndarray, - np.ndarray]: - """squared log error objective - - simplified version for RMSLE used as objective function - """ - predt[predt < -1] = -1 + 1e-6 - grad = gradient(predt, dtrain) - hess = hessian(predt, dtrain) - return grad, hess - -def rmsle(predt: np.ndarray, dtrain: DMatrix) -> Tuple[str, float]: - """ Root mean squared log error metric. - """ - y = dtrain.get_label() - predt[predt < -1] = -1 + 1e-6 - elements = np.power(np.log1p(y) - np.log1p(predt), 2) - return "my_rmsle", float(np.sqrt(np.sum(elements) / len(y))) + DEBUG_ERROR - - -def learning_curves( - context: MLClientCtx, - results: dict, - figsz: Tuple[int, int] = (10, 10), - plots_dest: str = "plots" -) -> None: - """plot xgb learning curves - - this will also log a model's learning curves - """ - plt.clf() - plt.figure(figsize=figsz) - plt.plot(results["train"]["my_rmsle"], label="train-my-rmsle") - plt.plot(results["valid"]["my_rmsle"], label="valid-my-rmsle") - plt.title(f"learning curves") - plt.legend() - - context.log_artifact( - PlotArtifact(f"learning-curves", body=plt.gcf()), - local_path=f"{plots_dest}/learning-curves.html") - - -def fit( - context: MLClientCtx, - dataset: DataItem, - num_boost_round: int = 10, - evals: List[Tuple[DMatrix, str]] = [], - obj: Union[Callable, str] = "", - feval: Union[Callable, str] = None, - maximize: bool = False, - early_stopping_rounds: int = None, - evals_result: dict = {}, - verbose_eval: bool = True, - xgb_model: DataItem = None, - callbacks: List[Callable] = [], - label_column: str = "labels", - encode_cols: dict = {}, - sample: int = -1, - test_size: float = 0.25, - valid_size: float = 0.75, - random_state: int = 1994, - models_dest: str = "models", - plots_dest: 
str = "plots", - file_ext: str = "csv", - test_set_key: str = "test-set", - gpus: bool = False -) -> None: - """low level xgboost train api - - for the xgboost `train` params see: - https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.train - - Note: the first parameter of xgboost's `train` method is a dict of parameters - supplied to the booster (engine). To modify one of those simply - add a task parameter (when running you supply an mlrun NewTask) with the - prefix "XGB_". So for example, to set the 'tree_method' parameter to 'approx', - add {"XGB_tree_method":"approx"} to the task params key. - - :param context: the function context - :param dataset: the full data set, train, valid and test will be extracted and - each converted to a DMatrix for input to xgboost's `train` - :param label_column: ground-truth (y) labels - :param encode_cols: dictionary of names and prefixes for columns that are - to hot be encoded. - :param sample: Selects the first n rows, or select a sample - starting from the first. If negative <-1, select - a random sample - :param test_size: (0.05) test set size - :param valid_size: (0.75) Once the test set has been removed the - training set gets this proportion. 
- :param random_state: (1) sklearn rng seed - :param models_dest: destination subfolder for model artifacts - :param plots_dest: destination subfolder for plot artifacts - :param file_ext: format for test_set_key hold out data - :param test_set_key: (test-set), key of held out data in artifact store - :param gpus: (False), run on gpus - """ - raw, labels, header = get_sample(dataset, sample, label_column) - - # hot-encode - if encode_cols: - raw = pd.get_dummies(raw, - columns=list(encode_cols.keys()), - prefix=list(encode_cols.values()), - drop_first=True) - - # split the sample into train validate, test and calibration sets: - (xtrain, ytrain), (xvalid, yvalid), (xtest, ytest) = \ - get_splits(raw, labels, 3, test_size, valid_size, random_state) - - # save test data as regular dataframe as it may be used by other process - context.log_dataset(test_set_key, df=pd.concat([xtest, ytest], axis=1), - format=file_ext, index=False) - - # convert to xgboost DMatrix (todo - dask, gpu) - dtrain = DMatrix(xtrain, label=ytrain) - dvalid = DMatrix(xvalid, label=yvalid) - - boost_params = { - "tree_method": "gpu_hist" if gpus else "hist", - "seed": random_state, - "disable_default_eval_metric": 1, - "objective": "reg:squaredlogerror", - "eval_metric": "rmsle"} - - # enable user to customize `booster param` parameters - for k, v in context.parameters.items(): - if k.startswith('XGB_'): - boost_params[k[4:]] = v - - # collect learning curves / training history - results = dict() - - booster = train( - boost_params, - dtrain=dtrain, - num_boost_round=num_boost_round, - evals=[(dtrain, "train"), (dvalid, "valid")], - evals_result=results, - obj=squared_log, - feval=rmsle, - maximize=maximize, - early_stopping_rounds=early_stopping_rounds, - verbose_eval=verbose_eval, - # xgb_model=xgb_model, - # callbacks: List[Callable] = [] - ) - - context.log_model("model", - body=dumps(booster), - model_file="model.pkl", - artifact_path='artifacts/') - - learning_curves(context, results) \ No 
newline at end of file diff --git a/xgb_serving/function.yaml b/xgb_serving/function.yaml deleted file mode 100644 index 7073d8ba6..000000000 --- a/xgb_serving/function.yaml +++ /dev/null @@ -1,40 +0,0 @@ -kind: serving -metadata: - name: xgb-serving - tag: '' - hash: 200148a9a4815d8b0394038d973b59eda1776d36 - project: '' - labels: - author: Daniel - categories: - - model-serving - - machine-learning -spec: - command: '' - args: [] - image: mlrun/mlrun - build: - functionSourceCode: IyBDb3B5cmlnaHQgMjAxOSBJZ3VhemlvCiMKIyBMaWNlbnNlZCB1bmRlciB0aGUgQXBhY2hlIExpY2Vuc2UsIFZlcnNpb24gMi4wICh0aGUgIkxpY2Vuc2UiKTsKIyB5b3UgbWF5IG5vdCB1c2UgdGhpcyBmaWxlIGV4Y2VwdCBpbiBjb21wbGlhbmNlIHdpdGggdGhlIExpY2Vuc2UuCiMgWW91IG1heSBvYnRhaW4gYSBjb3B5IG9mIHRoZSBMaWNlbnNlIGF0CiMKIyAgICAgaHR0cDovL3d3dy5hcGFjaGUub3JnL2xpY2Vuc2VzL0xJQ0VOU0UtMi4wCiMKIyBVbmxlc3MgcmVxdWlyZWQgYnkgYXBwbGljYWJsZSBsYXcgb3IgYWdyZWVkIHRvIGluIHdyaXRpbmcsIHNvZnR3YXJlCiMgZGlzdHJpYnV0ZWQgdW5kZXIgdGhlIExpY2Vuc2UgaXMgZGlzdHJpYnV0ZWQgb24gYW4gIkFTIElTIiBCQVNJUywKIyBXSVRIT1VUIFdBUlJBTlRJRVMgT1IgQ09ORElUSU9OUyBPRiBBTlkgS0lORCwgZWl0aGVyIGV4cHJlc3Mgb3IgaW1wbGllZC4KIyBTZWUgdGhlIExpY2Vuc2UgZm9yIHRoZSBzcGVjaWZpYyBsYW5ndWFnZSBnb3Zlcm5pbmcgcGVybWlzc2lvbnMgYW5kCiMgbGltaXRhdGlvbnMgdW5kZXIgdGhlIExpY2Vuc2UuCiMKaW1wb3J0IG9zCmltcG9ydCBqc29uCmltcG9ydCBudW1weSBhcyBucApmcm9tIGNsb3VkcGlja2xlIGltcG9ydCBsb2FkCmltcG9ydCBtbHJ1bgoKCmNsYXNzIFhHQm9vc3RNb2RlbChtbHJ1bi5zZXJ2aW5nLlYyTW9kZWxTZXJ2ZXIpOgogICAgZGVmIGxvYWQoc2VsZik6CiAgICAgICAgbW9kZWxfZmlsZSwgZXh0cmFfZGF0YSA9IHNlbGYuZ2V0X21vZGVsKCIucGtsIikKICAgICAgICBzZWxmLm1vZGVsID0gbG9hZChvcGVuKHN0cihtb2RlbF9maWxlKSwgInJiIikpCgogICAgZGVmIHByZWRpY3Qoc2VsZiwgYm9keSk6CiAgICAgICAgdHJ5OgogICAgICAgICAgICBmZWF0cyA9IG5wLmFzYXJyYXkoYm9keVsiaW5wdXRzIl0sIGR0eXBlPW5wLmZsb2F0MzIpLnJlc2hhcGUoLTEsIDUpCiAgICAgICAgICAgIHJlc3VsdCA9IHNlbGYubW9kZWwucHJlZGljdChmZWF0cywgdmFsaWRhdGVfZmVhdHVyZXM9RmFsc2UpCiAgICAgICAgICAgIHJldHVybiByZXN1bHQudG9saXN0KCkKICAgICAgICBleGNlcHQgRXhjZXB0aW9uIGFzIGU6CiAgICAgICAgICAgIHJhaXNlIEV4Y2VwdGlvbigiRmFpbGVk
IHRvIHByZWRpY3QgJXMiICUgZSkKZnJvbSBtbHJ1bi5ydW50aW1lcyBpbXBvcnQgbnVjbGlvX2luaXRfaG9vawpkZWYgaW5pdF9jb250ZXh0KGNvbnRleHQpOgogICAgbnVjbGlvX2luaXRfaG9vayhjb250ZXh0LCBnbG9iYWxzKCksICdzZXJ2aW5nX3YyJykKCmRlZiBoYW5kbGVyKGNvbnRleHQsIGV2ZW50KToKICAgIHJldHVybiBjb250ZXh0Lm1scnVuX2hhbmRsZXIoY29udGV4dCwgZXZlbnQpCg== - commands: [] - code_origin: https://github.com/daniels290813/functions.git#2675b0d235d93571a696296c93cfb2103cbf261f:/Users/Daniel_Sabba/functions/xgb_serving/xgb_serving.py - origin_filename: /Users/Daniel_Sabba/functions/xgb_serving/xgb_serving.py - requirements: [] - description: deploy an XGBoost model server. - default_handler: '' - disable_auto_mount: false - clone_target_dir: '' - env: [] - priority_class_name: '' - preemption_mode: prevent - min_replicas: 1 - max_replicas: 4 - source: '' - function_kind: serving_v2 - function_handler: xgb_serving:handler - base_image_pull: false - default_class: ClassifierModel - secret_sources: [] - affinity: null - tolerations: null - security_context: {} -verbose: false diff --git a/xgb_serving/item.yaml b/xgb_serving/item.yaml deleted file mode 100644 index 413e26bfa..000000000 --- a/xgb_serving/item.yaml +++ /dev/null @@ -1,29 +0,0 @@ -apiVersion: v1 -categories: -- model-serving -- machine-learning -description: deploy an XGBoost model server. 
-doc: '' -example: xgb_serving.ipynb -generationDate: 2022-08-28:17-25 -hidden: false -icon: '' -labels: - author: Daniel -maintainers: [] -marketplaceType: '' -mlrunVersion: 1.4.1 -name: xgb_serving -platformVersion: 3.5.3 -spec: - customFields: - default_class: ClassifierModel - filename: xgb_serving.py - handler: handler - image: mlrun/mlrun - kind: serving - requirements: [] -url: '' -version: 1.1.2 - - diff --git a/xgb_serving/requirements.txt b/xgb_serving/requirements.txt deleted file mode 100644 index a5bbcdde3..000000000 --- a/xgb_serving/requirements.txt +++ /dev/null @@ -1,7 +0,0 @@ -pandas -xgboost -cloudpickle -pygit2 -scikit-learn==1.0.2 -scikit-plot -seaborn diff --git a/xgb_serving/test_xgb_serving.py b/xgb_serving/test_xgb_serving.py deleted file mode 100644 index 52f6ccb6d..000000000 --- a/xgb_serving/test_xgb_serving.py +++ /dev/null @@ -1,67 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import mlrun -import os -import pandas as pd -from xgb_serving import XGBoostModel - - -def get_class_data(): - fn = mlrun.import_function('../gen_class_data/function.yaml') - run = fn.run(params={'key': 'classifier-data', - 'n_samples': 10_000, - 'm_features': 5, - 'k_classes': 2, - 'header': None, - 'weight': [0.5, 0.5], - 'sk_params': {'n_informative': 2}, - 'file_ext': 'csv'}, local=True, artifact_path="./artifacts") - return run - - -def xgb_trainer(): - # running data preparation function locally - gen_data_run = get_class_data() - - fn = mlrun.import_function('../xgb_trainer/function.yaml') - run = fn.run(params={'model_type': 'classifier', - 'CLASS_tree_method': 'hist', - 'CLASS_objective': 'binary:logistic', - 'CLASS_booster': 'gbtree', - 'FIT_verbose': 0, - 'label_column': 'labels'}, - local=True, inputs={'dataset': gen_data_run.status.artifacts[0]['spec']['target_path']}) - - for artifact in run.status.artifacts: - if artifact['kind'] == 'model': - assert os.path.exists(artifact['spec']['target_path']), "Failed locating model file" # validating model exists - return artifact['spec']['target_path'] + artifact['spec']['model_file'], gen_data_run.status.artifacts[0]['spec']['target_path'] - assert False, "Failed creating model" - - -def test_local_xgb_serving(): - model_path, dataset_path = xgb_trainer() - fn = mlrun.import_function('function.yaml') - - fn.add_model(key='my_model', model_path=model_path, class_name='XGBoostModel') - server = fn.to_mock_server() - - # Testing the model - df = pd.read_csv(dataset_path) - x = df.drop(['labels'], axis=1).iloc[0].tolist() - y_true = df['labels'][0] - - y_pred = server.test(path='/v2/models/my_model/predict', body={"inputs": x})['outputs'][0] - assert y_true == y_pred diff --git a/xgb_serving/xgb_serving.ipynb b/xgb_serving/xgb_serving.ipynb deleted file mode 100644 index 6c605367e..000000000 --- a/xgb_serving/xgb_serving.ipynb +++ /dev/null @@ -1,421 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - 
"metadata": {}, - "source": [ - "# Deploy a Serverless XGBoost Model Server\n", - " --------------------------------------------------------------------\n", - "\n", - "The following notebook demonstrates how to deploy an XGBoost model server (a.k.a Nuclio-serving)\n", - "\n", - "#### **notebook how-to's**\n", - "* Write and test model serving class in a notebook.\n", - "* Deploy the model server function.\n", - "* Invoke and test the serving function." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "#### **steps**\n", - "**[define a new function and its dependencies](#define-function)**
\n", - "**[test the model serving class locally](#test-locally)**
\n", - "**[deploy our serving class using as a serverless function](#deploy)**
\n", - "**[test our model server using HTTP request](#test-model-server)**
" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: ignore\n", - "import nuclio " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### **define a new function and its dependencies**" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "%nuclio: setting kind to 'nuclio:serving'\n", - "%nuclio: setting 'MODEL_CLASS' environment variable\n", - "%nuclio: setting spec.build.baseImage to 'mlrun/ml-models'\n" - ] - } - ], - "source": [ - "%nuclio config kind=\"nuclio:serving\"\n", - "%nuclio env MODEL_CLASS=XGBoostModel\n", - "\n", - "%nuclio config spec.build.baseImage = \"mlrun/ml-models\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Function Code" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "# import kfserving\n", - "import os\n", - "import json\n", - "import numpy as np\n", - "import xgboost as xgb\n", - "from cloudpickle import load\n", - "\n", - "### Model Serving Class\n", - "\n", - "import mlrun\n", - "class XGBoostModel(mlrun.runtimes.MLModelServer):\n", - " def load(self):\n", - " model_file, extra_data = self.get_model(\".pkl\")\n", - " self.model = load(open(str(model_file), \"rb\"))\n", - " \n", - "\n", - " def predict(self, body):\n", - " try:\n", - " feats = np.asarray(body[\"instances\"], dtype=np.float32).reshape(-1, 5)\n", - " result = self.model.predict(feats, validate_features=False)\n", - " return result.tolist()\n", - " except Exception as e:\n", - " raise Exception(\"Failed to predict %s\" % e)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The following end-code annotation tells ```nuclio``` to stop parsing the notebook from this cell. 
_**Please do not remove this cell**_:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "# nuclio: end-code" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "## Test the function locally\n", - "\n", - "The class above can be tested locally. Just instantiate the class, `.load()` will load the model to a local dir.\n", - "\n", - "> **Verify there is a model file in the model_dir path (generated by the training notebook)**" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import mlconf\n", - "\n", - "model_dir = os.path.join(mlconf.artifact_path, \"xgb/models\")\n", - "\n", - "my_server = XGBoostModel(\"my-model\", model_dir=model_dir)\n", - "my_server.load()" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [], - "source": [ - "DATA_PATH = mlconf.artifact_path + \"/xgb/classifier-data.csv\"\n", - "MODEL_PATH = mlconf.artifact_path + \"/xgb/models/xgb_test\"" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "xtest = pd.read_csv(DATA_PATH)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can use the `.predict(body)` method to test the model." 
- ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [], - "source": [ - "import json, numpy as np\n", - "preds = my_server.predict({\"instances\":xtest.values[:10,:-1].tolist()})" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "predicted class: [1, 0, 0, 0, 0, 0, 1, 1, 0, 1]\n" - ] - } - ], - "source": [ - "print(\"predicted class:\", preds)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### **deploy our serving class using as a serverless function**\n", - "in the following section we create a new model serving function which wraps our class , and specify model and other resources.\n", - "\n", - "the `models` dict store model names and the assosiated model **dir** URL (the URL can start with `S3://` and other blob store options), the faster way is to use a shared file volume, we use `.apply(mount_v3io())` to attach a v3io (iguazio data fabric) volume to our function. 
By default v3io will mount the current user home into the `\\User` function path.\n", - "\n", - "**verify the model dir does contain a valid `model.bst` file**" - ] - }, - { - "cell_type": "code", - "execution_count": 41, - "metadata": {}, - "outputs": [], - "source": [ - "from mlrun import new_model_server, mount_v3io\n", - "import requests" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-14 12:49:05,013 function spec saved to path: function.yaml\n" - ] - }, - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 42, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fn = new_model_server(\"xgb-serving\",\n", - " model_class=\"XGBoostModel\",\n", - " models={\"xgb_serving_v2\": f\"{model_dir}\"})\n", - "fn.spec.description = \"xgboost test data classification server\"\n", - "fn.metadata.categories = [\"serving\", \"ml\"]\n", - "fn.metadata.labels = {\"author\": \"yaronh\", \"framework\": \"xgboost\"}\n", - "\n", - "fn.export(\"function.yaml\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## tests" - ] - }, - { - "cell_type": "code", - "execution_count": 43, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 43, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from mlrun.platforms.other import auto_mount\n", - "fn.apply(auto_mount())" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[mlrun] 2020-06-14 12:49:18,128 deploy started\n", - "[nuclio] 2020-06-14 12:49:19,213 (info) Build complete\n", - "[nuclio] 2020-06-14 12:49:27,347 (info) Function deploy complete\n", - "[nuclio] 2020-06-14 12:49:27,354 done updating default-xgb-test, function address: 3.23.82.202:30104\n" - ] - } - ], 
- "source": [ - "addr = fn.deploy()" - ] - }, - { - "cell_type": "code", - "execution_count": 45, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'http://3.23.82.202:30104'" - ] - }, - "execution_count": 45, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "addr" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "### **test our model server using HTTP request**\n", - "\n", - "\n", - "We invoke our model serving function using test data, the data vector is specified in the `instances` attribute." - ] - }, - { - "cell_type": "code", - "execution_count": 46, - "metadata": {}, - "outputs": [], - "source": [ - "# KFServing protocol event\n", - "event_data = {\"instances\": xtest.values[:10,:-1].tolist()}" - ] - }, - { - "cell_type": "code", - "execution_count": 47, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'[1, 0, 0, 0, 0, 0, 1, 1, 0, 1]'" - ] - }, - "execution_count": 47, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import json\n", - "resp = requests.put(addr + \"/xgb_serving_v2/predict\", json=json.dumps(event_data))\n", - "resp.text" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[1, 0, 0, 0, 0, 0, 1, 1, 0, 1]" - ] - }, - "execution_count": 26, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "preds" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**[back to top](#top)**" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} 
diff --git a/xgb_serving/xgb_serving.py b/xgb_serving/xgb_serving.py deleted file mode 100644 index a4d095e57..000000000 --- a/xgb_serving/xgb_serving.py +++ /dev/null @@ -1,33 +0,0 @@ -# Copyright 2019 Iguazio -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -import os -import json -import numpy as np -from cloudpickle import load -import mlrun - - -class XGBoostModel(mlrun.serving.V2ModelServer): - def load(self): - model_file, extra_data = self.get_model(".pkl") - self.model = load(open(str(model_file), "rb")) - - def predict(self, body): - try: - feats = np.asarray(body["inputs"], dtype=np.float32).reshape(-1, 5) - result = self.model.predict(feats, validate_features=False) - return result.tolist() - except Exception as e: - raise Exception("Failed to predict %s" % e) \ No newline at end of file