[Docs] Add Change log and other edits (mlrun#2906)
jillnogold authored Jan 16, 2023
1 parent f3438a3 commit 02236e0
Showing 12 changed files with 387 additions and 24 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -27,4 +27,6 @@ tests/system/env.yml
# pyenv file for working with several python versions
.python-version
*.bak
docs/contributing.md
docs/CONTRIBUTING.md
docs/tutorial/colab/01-mlrun-basics-colab.ipynb

Binary file modified docs/_static/images/marketplace-ui.png
337 changes: 337 additions & 0 deletions docs/change-log/_index.md

Large diffs are not rendered by default.

8 changes: 7 additions & 1 deletion docs/contents.rst
@@ -38,4 +38,10 @@ Table of Contents
genindex
api/index
cli
glossary
glossary

.. toctree::
:maxdepth: 1
:caption: Change log

change-log/_index
13 changes: 9 additions & 4 deletions docs/data-prep/ingest-data-fs.md
@@ -32,12 +32,12 @@ also general limitations in [Attribute name restrictions](https://www.iguazio.co

## Inferring data

There are two types of inferring:
There are two types of infer options:
- Metadata/schema: This is responsible for describing the dataset and generating its meta-data, such as deducing the
data-types of the features and listing the entities that are involved. Options belonging to this type are
`Entities`, `Features` and `Index`. The `InferOptions` class has the `InferOptions.schema()` function which returns a value
containing all the options of this type.
- Stats/preview: This related to calculating statistics and generating a preview of the actual data in the dataset.
- Stats/preview: This relates to calculating statistics and generating a preview of the actual data in the dataset.
Options of this type are `Stats`, `Histogram` and `Preview`.

The `InferOptions` class has the following values:<br>
@@ -54,7 +54,6 @@ The `InferOptions class` basically translates to a value that can be a combinati

When simultaneously ingesting data and requesting infer options, part of the data might be ingested twice: once for inferring metadata/stats and once for the actual ingest. This is normal behavior.
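
For example, a minimal sketch of passing infer options to an ingest call. The `quotes_set` feature set and `quotes_df` DataFrame are hypothetical placeholders, and the bitwise-OR combination assumes the option values are flags, as the text above implies:

```python
import mlrun.feature_store as fstore
from mlrun.feature_store import InferOptions

# Infer only the schema-type options (Entities, Features, Index):
fstore.ingest(quotes_set, quotes_df, infer_options=InferOptions.schema())

# Combine options of both types into a single value:
fstore.ingest(
    quotes_set, quotes_df,
    infer_options=InferOptions.schema() | InferOptions.Stats,
)
```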


## Ingest data locally

Use a Feature Set to create the basic feature-set definition and then an ingest method to run a simple ingestion "locally" in the Jupyter Notebook pod.
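
A minimal sketch of this flow (assuming a pandas DataFrame `stocks_df` already loaded in the notebook):

```python
import mlrun.feature_store as fstore

# Basic feature-set definition, keyed on the "ticker" entity:
stocks_set = fstore.FeatureSet("stocks", entities=[fstore.Entity("ticker")])

# Run a simple ingestion "locally", i.e. in the notebook process:
fstore.ingest(stocks_set, stocks_df)
```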
@@ -137,14 +136,20 @@ When defining a source, it maps to nuclio event triggers. <br>
You can also create a custom `source` to access various databases or data sources.

## Target stores

By default, the feature sets are saved in parquet and the Iguazio NoSQL DB ({py:class}`~mlrun.datastore.NoSqlTarget`). <br>
The parquet file is ideal for fetching large sets of data for training, while the key-value store is ideal for an online application since it supports low-latency data retrieval based on key access.

```{admonition} Note
When working with the Iguazio MLOps platform, the default feature set storage location is under the "Projects" container, in the `<project name>/fs/..` folder.
The default location can be modified in mlrun config or specified per ingest operation. The parquet/csv files can be stored in NFS, S3, Azure blob storage, Redis, and on Iguazio DB/FS.
```
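
A sketch of overriding the default targets for a single ingest operation (reusing the hypothetical `stocks_set` and `stocks_df` from the sketch above):

```python
import mlrun.feature_store as fstore
from mlrun.datastore.targets import NoSqlTarget, ParquetTarget

# Write the results of this ingest to an explicit list of targets
# instead of the defaults:
fstore.ingest(stocks_set, stocks_df, targets=[ParquetTarget(), NoSqlTarget()])
```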
### Redis target store

### Redis target store

```{admonition} Tech preview
```

In MLRun, the Redis online target is called `RedisNoSqlTarget`. The functionality of the `RedisNoSqlTarget` is identical to the `NoSqlTarget` except for:
- The `RedisNoSqlTarget` does not support the Spark engine (it supports only the storey engine).
- The `RedisNoSqlTarget` accepts a path parameter in the form `<redis|rediss>://[<username>]:[<password>]@<host>[:port]`<br>
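A minimal connection sketch (the host and password are hypothetical placeholders):

```python
from mlrun.datastore.targets import RedisNoSqlTarget

# Online target backed by Redis; the path follows the
# <redis|rediss>://[<username>]:[<password>]@<host>[:port] form:
target = RedisNoSqlTarget(path="redis://:mypassword@redis.default.svc:6379")
```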
6 changes: 1 addition & 5 deletions docs/data-prep/ingesting_data.md
@@ -85,11 +85,7 @@ facilities for executing pySpark code using a Spark service (which can be deploy
as part of an Iguazio system) or through submitting the processing task to Spark-operator. The following page provides
additional details and code-samples:

<!---
TODO - add this once we have Spark service documentation.
1. [Spark service](???) - **do we have a page for this? Are we documenting it?**
-->
1. [Spark operator](../runtimes/spark-operator.html)
- [Spark operator](../runtimes/spark-operator.html)

In a similar manner, Dask can be used for parallel processing of the data. To read data as a Dask `DataFrame`, use the
following code:
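
A minimal sketch of that pattern (the data path is a hypothetical placeholder):

```python
import dask.dataframe as dd
import mlrun

# Read the source data as a Dask DataFrame instead of pandas:
data_item = mlrun.get_dataitem("s3://my-bucket/data/transactions.csv")
df = data_item.as_df(df_module=dd)
```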
6 changes: 5 additions & 1 deletion docs/ecosystem.md
@@ -66,4 +66,8 @@ This section lists the data stores, development tools, services, platforms, etc.
- Jenkins
- Github Actions
- Gitlab CI/CD
- KFP
- KFP

## Browser

MLRun runs on Chrome and Firefox.
1 change: 1 addition & 0 deletions docs/feature-store/using-spark-engine.md
@@ -1,3 +1,4 @@
(ingest-features-spark)=
# Ingest features with Spark

The feature store supports using Spark for ingesting, transforming, and writing results to data targets. When
12 changes: 6 additions & 6 deletions docs/index.md
@@ -5,9 +5,10 @@
MLRun is an open MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications. MLRun significantly reduces engineering efforts, time to production, and computation resources.
With MLRun, you can choose any IDE on your local machine or on the cloud. MLRun breaks the silos between data, ML, software, and DevOps/MLOps teams, enabling collaboration and fast continuous improvements.
Get started with MLRun **{ref}`Tutorials and examples <tutorial>`**, **{ref}`Installation and setup guide <install-setup-guide>`**, or read about **{ref}`architecture`**.
Get started with MLRun **{ref}`Tutorials and examples <tutorial>`**, **{ref}`Installation and setup guide <install-setup-guide>`**,
This page explains how MLRun addresses the [**MLOps tasks**](#mlops-tasks) and presents the [**MLRun core components**](#core-components).
This page explains how MLRun addresses the [**MLOps tasks**](#mlops-tasks), and presents the [**MLRun core components**](#core-components).
See the supported data stores, development tools, services, platforms, etc., supported by MLRun's open architecture in **{ref}`ecosystem`**.
@@ -49,7 +50,6 @@ See the supported data stores, development tools, services, platforms, etc., sup
```
````
`````
The [**MLOps development workflow**](./mlops-dev-flow.html) section describes the different tasks and stages in detail.
MLRun can be used to automate and orchestrate all the different tasks or just specific tasks (and integrate them with what you have already deployed).
@@ -79,7 +79,7 @@ In addition, the MLRun [**Feature store**](./feature-store/feature-store.html) a
{octicon}`mortar-board` **Docs:**
{bdg-link-info}`Feature store <./feature-store/feature-store.html>`
{bdg-link-info}`Data & artifacts <./concepts/data-feature-store.html>`
{bdg-link-info}`Data & artifacts <./concepts/data.html>`
, {octicon}`code-square` **Tutorials:**
{bdg-link-primary}`quick start <./tutorial/01-mlrun-basics.html>`
{bdg-link-primary}`Feature store <./feature-store/basic-demo.html>`
@@ -106,7 +106,7 @@ MLRun rapidly deploys and manages production-grade real-time or batch applicatio
{octicon}`mortar-board` **Docs:**
{bdg-link-info}`Realtime pipelines <./serving/serving-graph.html>`
{bdg-link-info}`Batch inference <./concepts/TBD.html>`
{bdg-link-info}`Batch inference <./deployment/batch_inference.html>`
, {octicon}`code-square` **Tutorials:**
{bdg-link-primary}`Realtime serving <./tutorial/03-model-serving.html>`
{bdg-link-primary}`Batch inference <./tutorial/07-batch-infer.html>`
@@ -124,7 +124,7 @@ Observability is built into the different MLRun objects (data, functions, jobs,
, {octicon}`code-square` **Tutorials:**
{bdg-link-primary}`Model monitoring & drift detection <./tutorial/05-model-monitoring.html>`

`````

<a id="core-components"></a>
## MLRun core components
18 changes: 15 additions & 3 deletions docs/monitoring/initial-setup-configuration.ipynb
@@ -8,6 +8,7 @@
}
},
"source": [
"(enable-model-monitoring)=\n",
"# Enable model monitoring\n",
"\n",
"```{note}\n",
@@ -16,17 +17,28 @@
"\n",
"To see tracking results, model monitoring needs to be enabled in each model.\n",
"\n",
"To enable model monitoring, include `serving_fn.set_tracking()` in the model server.\n",
"\n",
"To utilize drift measurement, supply the train set in the training step.\n",
"\n",
"**In this section**\n",
"- [Enabling model monitoring](#enabling-model-monitoring)\n",
"- [Model monitoring demo](#model-monitoring-demo)\n",
" - [Deploy model servers](#deploy-model-servers)\n",
" - [Simulating requests](#simulating-requests)\n",
"\n",
"## Enabling model monitoring\n",
"\n",
"Model activities can be tracked into a real-time stream and time-series DB. The monitoring data\n",
"is used to create real-time dashboards and track model accuracy and drift. \n",
"To set the tracking stream options, specify the following function spec attributes:\n",
" \n",
" `fn.set_tracking(stream_path, batch, sample)`\n",
" \n",
"- **stream_path** &mdash; the v3io stream path (e.g. `v3io:///users/..`)\n",
"- **sample** &mdash; optional, sample every N requests\n",
"- **batch** &mdash; optional, send micro-batches every N requests\n",
" \n",
"## Model monitoring demo\n",
"Use the following code blocks to test and explore model monitoring."
"Use the following code to test and explore model monitoring."
]
},
{
4 changes: 2 additions & 2 deletions docs/monitoring/model-monitoring-deployment.ipynb
@@ -80,8 +80,8 @@
"* **Model** &mdash; user defined name for the model\n",
"* **Labels** &mdash; user configurable tags that are searchable\n",
"* **Uptime** &mdash; first request for production data\n",
"* **Last Prediction **&mdash; most recent request for production data\n",
"* **Error Count **&mdash; includes prediction process errors such as operational issues (For example, a function in a failed state), as well as data processing errors\n",
"* **Last Prediction** &mdash; most recent request for production data\n",
"* **Error Count** &mdash; includes prediction process errors such as operational issues (For example, a function in a failed state), as well as data processing errors\n",
"(For example, invalid timestamps, request ids, type mismatches etc.)\n",
"* **Drift** &mdash; indication of drift status (no drift (green), possible drift (yellow), drift detected (red))\n",
"* **Accuracy** &mdash; a numeric value representing the accuracy of model predictions (N/A)\n",
Expand Down
2 changes: 1 addition & 1 deletion docs/runtimes/images.md
@@ -26,5 +26,5 @@ These characteristics are great when you’re working in a POC or development en

### Working with images in production
For production you should create your own images to ensure that the image is fixed.
- Pin the image tag, e.g. `image="mlrun/mlrun:1.2.0"`. This maintains the image tag at 1.1.0 even when the client is upgraded. Otherwise, an upgrade of the client would also upgrade the image. (If you specify an external (not MLRun images) docker image, like python, the result is the docker/k8s default behavior, which defaults to `latest` when the tag is not provided.)
- Pin the image tag, e.g. `image="mlrun/mlrun:1.2.0"`. This maintains the image tag at the version you specified, even when the client is upgraded. Otherwise, an upgrade of the client would also upgrade the image. (If you specify an external (not MLRun images) docker image, like python, the result is the docker/k8s default behavior, which defaults to `latest` when the tag is not provided.)
- Pin the versions of requirements, again to avoid breakages, e.g. `pandas==1.4.0`. (If you only specify the package name, e.g. pandas, then pip/conda (python's package managers) just pick up the latest version.)
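
A sketch combining both practices (the function name and source file are hypothetical):

```python
import mlrun

# Pin the image tag and the package versions the function depends on:
fn = mlrun.code_to_function(
    name="trainer",
    filename="trainer.py",
    kind="job",
    image="mlrun/mlrun:1.2.0",        # pinned image tag
    requirements=["pandas==1.4.0"],   # pinned requirement version
)
```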
