diff --git a/.gitignore b/.gitignore index e34b766beb..319356af9f 100644 --- a/.gitignore +++ b/.gitignore @@ -27,4 +27,6 @@ tests/system/env.yml # pyenv file for working with several python versions .python-version *.bak -docs/contributing.md +docs/CONTRIBUTING.md +docs/tutorial/colab/01-mlrun-basics-colab.ipynb + diff --git a/docs/_static/images/marketplace-ui.png b/docs/_static/images/marketplace-ui.png index 9e2b8221d5..8716cb5354 100644 Binary files a/docs/_static/images/marketplace-ui.png and b/docs/_static/images/marketplace-ui.png differ diff --git a/docs/change-log/_index.md b/docs/change-log/_index.md new file mode 100644 index 0000000000..13677a6c03 --- /dev/null +++ b/docs/change-log/_index.md @@ -0,0 +1,337 @@ +(change-log)= +# Change log +- [v1.2.1](#v1-2-1) +- [v1.2.0](#v1-2-0) +- [v1.1.3](#1-1-3) +- [v1.0.6](#v1-0-6) +- [v1.0.5](#v1-0-5) +- [v1.0.4](#v1-0-4) +- [v1.0.3](#v1-0-3) +- [v1.0.2](#v1-0-2) +- [v1.0.0](#v1-0-0) +- [Open issues](#open-issues) +- [Limitations](#limitations) +- [Deprecations](#deprecations) + +## v1.2.1 + +### New and updated features + +#### Feature store +- Supports ingesting Avro-encoded Kafka records. [View in Git](https://github.com/mlrun/mlrun/issues/2649). + +### Closed issues + +- Fix: the **Projects | Jobs | Monitor Workflows** view is now accurate when filtering for > 1 hour. [View in Git](https://github.com/mlrun/mlrun/pull/2786). +- The Kubernetes **Pods** tab in **Monitor Workflows** now shows the complete pod details. [View in Git](https://github.com/mlrun/mlrun/pull/1576). +- Update the tooltips in **Projects | Jobs | Schedule** to explain that day 0 (for cron jobs) is Monday, and not Sunday. +[View in Git](https://github.com/mlrun/ui/pull/1571). +- Fix UI crash when selecting **All** in the **Tag** dropdown list of the **Projects | Feature Store | Feature Vectors** tab. [View in Git](https://github.com/mlrun/ui/pull/1549). +- Fix: now updates `next_run_time` when skipping scheduling due to concurrent runs. [View in Git](https://github.com/mlrun/mlrun/pull/2862). +- When creating a project, the error `NotImplementedError` was updated to explain that MLRun does not have a +DB to connect to. [View in Git](https://github.com/mlrun/mlrun/pull/2856). +- When previewing a **DirArtifact** in the UI, it now returns the requested directory. Previously it was returning the directory list from the root of the container. [View in Git](https://github.com/mlrun/mlrun/pull/2592). +- Load source at runtime or build time now fully supports .zip files, which were not fully supported previously. + +### See more +- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.2.1) +- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.2.1) + + +## v1.2.0 + +### New and updated features + +#### Artifacts +- Support for artifact tagging: + - SDK: Add `tag_artifacts` and `delete_artifacts_tags` that can be used to modify existing artifacts tags and have + more than one version for an artifact. + - UI: You can add and edit artifact tags in the UI. + - API: Introduce new endpoints in `/projects//tags`. + +#### Auth +- Support S3 profile and assume-role when using `fsspec`. +- Support GitHub fine grained tokens. + +#### Documentation +- Restructured, and added new content. + +#### Feature store +- Support Redis as an online feature set for storey engine only. (See [Redis target store](../data-prep/ingest-data-fs.html#redis-target-store).) 
+- Fully supports ingesting with pandas engine, now equivalent to ingestion with `storey` engine (TechPreview): + - Support DataFrame with multi-index. + - Support mlrun steps when using pandas engine: `OneHotEncoder` , `DateExtractor`, `MapValue`, `Imputer` and `FeatureValidation`. +- Add new step: `DropFeature` for pandas and storey engines. (TechPreview) +- Add param query for `get_offline_feature` for filtering the output. + +#### Frameworks +- Add `HuggingFaceModelServer` to `mlrun.frameworks` at `mlrun.frameworks.huggingface` to serve `HuggingFace` models. + +#### Functions +- Add `function.with_annotations({"framework":"tensorflow"})` to user-created functions. +- Add `overwrite_build_params` to `project.build_function()` so the user can choose whether or not to keep the +build params that were used in previous function builds. +- `deploy_function` has a new option of mock deployment that allows running the function locally. + +#### Installation +- New option to install `google-cloud` requirements using `mlrun[google-cloud]`: when installing MLRun for integration +with GCP clients, only compatible packages are installed. + +#### Models +- The Labels in the **Models > Overview** tab can be edited. + +#### Third party integrations +- Supports Confluent Kafka (Tech Preview). + +#### Internal +- Refactor artifacts endpoints to follow the MLRun convention of `/projects//artifacts/...`. (The previous API will be deprecated in a future release.) +- Add `/api/_internal/memory-reports/` endpoints for memory related metrics to better understand the memory consumption of the API. +- Improve the HTTP retry mechanism. +- Support a new lightweight mechanism for KFP pods to pull the run state they triggered. Default behavior is legacy, +which pulls the logs of the run to figure out the run state. +The new behavior can be enabled using a feature flag configured in the API. + +### Breaking changes + +- Feature store: Ingestion using pandas now takes the dataframe and creates indices out of the entity column +(and removes it as a column in this df). This could cause breakage for existing custom steps when using a pandas engine. + +### Closed issues + +- Support logging artifacts larger than 5GB to V3IO. [View in Git](https://github.com/mlrun/mlrun/issues/2455). +- Limit KFP to kfp~=1.8.0, <1.8.14 due to non-backwards changes done in 1.8.14 for ParallelFor, which isn’t compatible with the MLRun managed KFP server (1.8.1). [View in Git](https://github.com/mlrun/mlrun/issues/2516). +- Add `artifact_path` enrichment from project `artifact_path`. Previously, the parameter wasn't applied to project runs when defining `project.artifact_path`. [View in Git](https://github.com/mlrun/mlrun/issues/2507). +- Align timeouts for requests that are getting re-routed from worker to chief (for projects/background related endpoints). [View in Git](https://github.com/mlrun/mlrun/issues/2565). +- Fix legacy artifacts load when loading a project. Fixed corner cases when legacy artifacts were saved to yaml and loaded back into the system using `load_project()`. [View in Git](https://github.com/mlrun/mlrun/issues/2584). +- Fix artifact `latest` tag enrichment to happen also when user defined a specific tag. [View in Git](https://github.com/mlrun/mlrun/issues/2572). +- Fix zip source extraction during function build. [View in Git](https://github.com/mlrun/mlrun/issues/2588). 
+- Fix Docker compose deployment so Nuclio is configured properly with a platformConfig file that sets proper mounts and network configuration for Nuclio functions, meaning that they run in the same network as MLRun. [View in Git](https://github.com/mlrun/mlrun/issues/2601).
+- Workaround for background tasks getting cancelled prematurely due to the current FastAPI version, which has a bug in the starlette package it uses. The bug caused the task to get cancelled if the client's HTTP connection was closed before the task was done. [View in Git](https://github.com/mlrun/mlrun/issues/2618).
+- Fix run failure after deploying a function without a defined image. [View in Git](https://github.com/mlrun/mlrun/pull/2530).
+- Fix scheduled jobs failing on GKE with a resource quota error. [View in Git](https://github.com/mlrun/mlrun/pull/2520).
+- Can now delete a model via tag. [View in Git](https://github.com/mlrun/mlrun/pull/2433).
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.2.0)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.2.0)
+
+
+## v1.1.3
+
+### Closed issues
+
+- The CLI supports overwriting the schedule when creating a scheduling workflow. [View in Git](https://github.com/mlrun/mlrun/pull/2651).
+- Slack now notifies when a project fails in `load_and_run()`. [View in Git](https://github.com/mlrun/mlrun/pull/2794).
+- Timeout is executed properly when running a pipeline in the CLI. [View in Git](https://github.com/mlrun/mlrun/pull/2635).
+- Uvicorn Keep Alive Timeout (`http_connection_timeout_keep_alive`) is now configurable, with default=11. This maintains API-client connections. [View in Git](https://github.com/mlrun/mlrun/pull/2613).
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.1.3)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.1.3)
+
+## v1.1.2
+
+### New and updated features
+
+**V3IO**
+- v3io-py bumped to 0.5.19.
+- v3io-fs bumped to 0.1.15.
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.1.2)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.1.2-rc3)
+
+## v1.1.1
+
+### New and updated features
+
+#### API
+- Supports workflow scheduling.
+
+#### UI
+- Projects: Supports editing model labels.
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.1.1)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.1.1)
+
+
+## v1.1.0
+
+### New and updated features
+
+#### API
+- MLRun scalability: Workers are used to handle the connection to the MLRun database and can be increased to improve handling of high workloads against the MLRun DB. You can configure the number of workers for an MLRun service, which is applied to the service's user-created pods. The default is 2.
+  - v1.1.0 cannot run on top of 3.0.x.
+  - For Iguazio versions earlier than 3.5.0, the number of workers is set to 1 by default. To change this number, contact support (a helm-chart change is required).
+  - Multi-instance is not supported for MLRun running on SQLite.
+- Supports pipeline scheduling.
+
+#### Documentation
+- Added Azure and S3 examples to {ref}`ingest-features-spark`.
+
+#### Feature store
+- Supports S3, Azure, GCS targets when using Spark as an engine for the feature store.
+- Snowflake as a datasource has a connector ID: `iguazio_platform`.
+- You can add a time-based filter condition when running `get_offline_feature` with a given vector, as illustrated in the sketch below.
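As a quick illustration of that time-based filter, here is a minimal, hedged sketch; the feature-vector name and timestamp column are hypothetical, and it assumes the standard `mlrun.feature_store.get_offline_features` call with `start_time`/`end_time` arguments:

```python
from datetime import datetime

import mlrun.feature_store as fstore

# Hypothetical <project>/<vector> name; start_time/end_time restrict the offline
# result to the given window, based on the named timestamp column.
resp = fstore.get_offline_features(
    "fraud-demo/transactions-vector",      # hypothetical feature vector
    entity_timestamp_column="timestamp",   # hypothetical timestamp column
    start_time=datetime(2022, 1, 1),
    end_time=datetime(2022, 6, 30),
)
df = resp.to_dataframe()
```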
+
+#### Storey
+- MLRun can write to parquet with a flexible schema per batch for ParquetTarget: useful for inconsistent or unknown schemas.
+
+#### UI
+
+- The **Projects** home page now has three tiles (Data, Jobs and Workflows, Deployment) that guide you through key capabilities of Iguazio, and provide quick access to common tasks.
+- The **Projects | Jobs | Monitor Jobs** tab now displays the Spark UI URL.
+- The information from the Drift Analysis tab is now displayed in the Model Overview.
+- If there is an error, the error messages are now displayed in the **Projects | Jobs | Monitor Jobs** tab.
+
+#### Workflows
+- The steps in **Workflows** are color-coded to identify their status: blue=running; green=completed; red=error.
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.1.0)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.1.0)
+
+## v1.0.6
+
+### Closed issues
+- Import from mlrun fails with "ImportError: cannot import name dataclass_transform".
+  Workaround for previous releases: install `pip install pydantic==1.9.2` after running `align_mlrun.sh`.
+- MLRun FeatureSet was not being enriched with the security context when running from the UI. [View in Git](https://github.com/mlrun/mlrun/pull/2250).
+- MLRun access key presented as cleartext in the mlrun yaml when the mlrun function was created by a feature set request from the UI. [View in Git](https://github.com/mlrun/mlrun/pull/2250).
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.0.6)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.0.6)
+
+## v1.0.5
+
+### Closed issues
+- MLRun: remove root permissions. [View in Git](https://github.com/mlrun/mlrun/pull/).
+- Users running a pipeline via CLI project run (watch=true) can now set the timeout (previously it was fixed at 1 hour). [View in Git](https://github.com/mlrun/mlrun/pull/).
+- MLRun: Supports pushing images to ECR. [View in Git](https://github.com/mlrun/mlrun/pull/).
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.0.5)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.0.5)
+
+## v1.0.4
+
+### New and updated features
+- Bump storey to 1.0.6.
+- Add typing-extensions explicitly.
+- Add vulnerability check to CI and fix vulnerabilities.
+
+### Closed issues
+- Limit Azure transitive dependency to avoid a new bug. [View in Git](https://github.com/mlrun/mlrun/pull/2034).
+- Fix GPU image to have new signing keys. [View in Git](https://github.com/mlrun/mlrun/pull/2030).
+- Spark: Allow mounting v3io on the driver but not on executors. [View in Git](https://github.com/mlrun/mlrun/pull/2023).
+- Tests: Send only string headers to align with the new requests limitation. [View in Git](https://github.com/mlrun/mlrun/pull/2039).
+
+
+### See more
+- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.0.4)
+- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.0.4)
+
+## v1.0.3
+
+### New and updated features
+- Jupyter Image: Relax `artifact_path` settings and add a README notebook. [View in Git](https://github.com/mlrun/mlrun/pull/2011).
+- Images: Fix security vulnerabilities. [View in Git](https://github.com/mlrun/mlrun/pull/1997).
+
+### Closed issues
+
+- API: Fix projects leader to sync enrichment to followers. [View in Git](https://github.com/mlrun/mlrun/pull/2009).
+- Projects: Fixes and usability improvements for working with archives.
[View in Git](https://github.com/mlrun/mlrun/pull/2006). + +### See more +- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.0.3) +- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.0.3) + +## v1.0.2 + +### New and updated features + +- Runtimes: Add java options to Spark job parameters. [View in Git](https://github.com/mlrun/mlrun/pull/1968). +- Spark: Allow setting executor and driver core parameter in Spark operator. [View in Git](https://github.com/mlrun/mlrun/pull/1973). +- API: Block unauthorized paths on files endpoints. [View in Git](https://github.com/mlrun/mlrun/pull/1967). +- Documentation: New quick start guide and updated docker install section. [View in Git](https://github.com/mlrun/mlrun/pull/1948). + +### Closed issues +- Frameworks: Fix to logging the target columns in favor of model monitoring. [View in Git](https://github.com/mlrun/mlrun/pull/1929). +- Projects: Fix/support archives with project run/build/deploy methods. [View in Git](https://github.com/mlrun/mlrun/pull/1966). +- Runtimes: Fix jobs stuck in non-terminal state after node drain/preemption. [View in Git](https://github.com/mlrun/mlrun/pull/1964). +- Requirements: Fix ImportError on ingest to Azure. [View in Git](https://github.com/mlrun/mlrun/pull/1949). + +### See more +- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.0.2) +- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.0.2) + +## v1.0.0 + +### New and updated features + +#### Feature store +- Supports snowflake as a datasource for the feature store. + +#### Graph +- A new tab under **Projects | Models** named **Real-time pipelines** displays the real time pipeline graph, +with a drill-down to view the steps and their details. [Tech Preview] + +#### Projects +- Setting owner and members are in a dedicated **Project Settings** section. +- The **Project Monitoring** report has a new tile named **Consumer groups (v3io streams)** that shows the total number + of consumer groups, with drill-down capabilities for more details. + +#### Resource management +- Supports preemptible nodes. +- Supports configuring CPU, GPU, and memory default limits for user jobs. + +#### UI +- Supports configuring pod priority. +- Enhanced masking of sensitive data. +- The dataset tab is now in the **Projects** main menu (was previously under the Feature store). + +### See more +- [MLRun change log in GitHub](https://github.com/mlrun/mlrun/releases/tag/v1.0.0) +- [UI change log in GitHub](https://github.com/mlrun/ui/releases/tag/v1.0.0) + + +## Open issues + +| ID | Description | Workaround | Opened | +| ---- | -------------------------------------------------------| --------------------------------------------- | ------ | +| 2489 | Cannot pickle a class inside an mlrun function. | Use cloudpickle instead of pickle | 1.2.0 | +| 2223 | Cannot deploy a function when notebook names contain "." (ModuleNotFoundError) | Do not use "." in notebook name | 1.0.0 | +| 2199 | Spark operator job fails with default requests args. | NA | 1.0.0 | +| 1584 | Cannot run `code_to_function` when filename contains special characters | Do not use special characters in filenames | 1.0.0 | +| [2621](https://github.com/mlrun/mlrun/issues/2621) | Running a workflow whose project has `init_git=True`, results in Project error | Run `git config --global --add safe.directory '*'` (can substitute specific directory for *). | 1.1.0 | +| 2407 | Kafka ingestion service on empty feature set returns an error. 
| Ingest a sample of the data manually. This creates the schema for the feature set and then the ingestion service accepts new records. | 1.1.0 | + +## Limitations + + +| ID | Description | Workaround | Opened | +| ---- | -------------------------------------------------------------- | ------------------------------------ | ----------| +| 2014 | Model deployment returns ResourceNotFoundException (Nuclio error that Service is invalid.) | Verify that all `metadata.labels` values are 63 characters or less (Kubernetes limitation). | v1.0.0 | + + + +## Deprecations + + +| In v. | ID |Description | +|------ | ---- | --------------------------------------------------------------------| +| 1.0.0 | | MLRun / Nuclio do not support python 3.6 | \ No newline at end of file diff --git a/docs/contents.rst b/docs/contents.rst index a1559671d2..2b1b386860 100644 --- a/docs/contents.rst +++ b/docs/contents.rst @@ -38,4 +38,10 @@ Table of Contents genindex api/index cli - glossary \ No newline at end of file + glossary + +.. toctree:: + :maxdepth: 1 + :caption: Change log + + change-log/_index diff --git a/docs/data-prep/ingest-data-fs.md b/docs/data-prep/ingest-data-fs.md index ef5e11b9c0..f683992c0d 100644 --- a/docs/data-prep/ingest-data-fs.md +++ b/docs/data-prep/ingest-data-fs.md @@ -32,12 +32,12 @@ also general limitations in [Attribute name restrictions](https://www.iguazio.co ## Inferring data -There are two types of inferring: +There are 2 types of infer options: - Metadata/schema: This is responsible for describing the dataset and generating its meta-data, such as deducing the data-types of the features and listing the entities that are involved. Options belonging to this type are `Entities`, `Features` and `Index`. The `InferOptions` class has the `InferOptions.schema()` function which returns a value containing all the options of this type. -- Stats/preview: This related to calculating statistics and generating a preview of the actual data in the dataset. +- Stats/preview: This relates to calculating statistics and generating a preview of the actual data in the dataset. Options of this type are `Stats`, `Histogram` and `Preview`. The `InferOptions class` has the following values:
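The value list itself sits outside this diff hunk. As a quick, hedged illustration of how these options are typically passed to an ingest call (assuming `InferOptions` is exposed on `mlrun.feature_store`; the feature set and data below are made up):

```python
import pandas as pd
import mlrun.feature_store as fstore

# Made-up sample data and feature set, for illustration only
df = pd.DataFrame({"patient_id": [1, 2], "hr": [62, 80]})
measurements = fstore.FeatureSet("measurements", entities=[fstore.Entity("patient_id")])

# Metadata/schema inference only (entities, features, index), without stats/preview;
# pass e.g. fstore.InferOptions.default() to also compute stats, histogram, and preview.
fstore.ingest(measurements, df, infer_options=fstore.InferOptions.schema())
```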
@@ -54,7 +54,6 @@ The `InferOptions class` basically translates to a value that can be a combinati When simultaneously ingesting data and requesting infer options, part of the data might be ingested twice: once for inferring metadata/stats and once for the actual ingest. This is normal behavior. - ## Ingest data locally Use a Feature Set to create the basic feature-set definition and then an ingest method to run a simple ingestion "locally" in the Jupyter Notebook pod. @@ -137,6 +136,7 @@ When defining a source, it maps to nuclio event triggers.
You can also create a custom `source` to access various databases or data sources. ## Target stores + By default, the feature sets are saved in parquet and the Iguazio NoSQL DB ({py:class}`~mlrun.datastore.NoSqlTarget`).
The parquet file is ideal for fetching large set of data for training while the key value is ideal for an online application since it supports low latency data retrieval based on key access. @@ -144,7 +144,12 @@ The parquet file is ideal for fetching large set of data for training while the When working with the Iguazio MLOps platform the default feature set storage location is under the "Projects" container: `/fs/..` folder. The default location can be modified in mlrun config or specified per ingest operation. The parquet/csv files can be stored in NFS, S3, Azure blob storage, Redis, and on Iguazio DB/FS. ``` -### Redis target store + +### Redis target store + +```{admonition} Tech preview +``` + The Redis online target is called, in MLRun, `RedisNoSqlTarget`. The functionality of the `RedisNoSqlTarget` is identical to the `NoSqlTarget` except for: - The `RedisNoSqlTarget` does not support the spark engine, (only supports the storey engine). - The `RedisNoSqlTarget` accepts path parameter in the form `://[]:[]@[:port]`
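To ground the Redis target description above, here is a minimal, hedged sketch of pointing a feature set at a Redis online target; the feature-set definition, sample data, and Redis address are made up, and it assumes `RedisNoSqlTarget` is importable from `mlrun.datastore.targets` and is used with the storey engine:

```python
import pandas as pd
import mlrun.feature_store as fstore
from mlrun.datastore.targets import ParquetTarget, RedisNoSqlTarget

# Made-up sample data and feature set
df = pd.DataFrame({"ticker": ["ACME"], "price": [101.3]})
stocks = fstore.FeatureSet("stocks", entities=[fstore.Entity("ticker")])

# Replace the default NoSQL online target with Redis; keep Parquet as the offline target
stocks.set_targets(
    targets=[ParquetTarget(), RedisNoSqlTarget(path="redis://localhost:6379")],  # hypothetical address
    with_defaults=False,
)
fstore.ingest(stocks, df)
```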
diff --git a/docs/data-prep/ingesting_data.md b/docs/data-prep/ingesting_data.md index e701c90abf..8510c74a9e 100644 --- a/docs/data-prep/ingesting_data.md +++ b/docs/data-prep/ingesting_data.md @@ -85,11 +85,7 @@ facilities for executing pySpark code using a Spark service (which can be deploy as part of an Iguazio system) or through submitting the processing task to Spark-operator. The following page provides additional details and code-samples: - -1. [Spark operator](../runtimes/spark-operator.html) +- [Spark operator](../runtimes/spark-operator.html) In a similar manner, Dask can be used for parallel processing of the data. To read data as a Dask `DataFrame`, use the following code: diff --git a/docs/ecosystem.md b/docs/ecosystem.md index c8166daab6..03cfce651c 100644 --- a/docs/ecosystem.md +++ b/docs/ecosystem.md @@ -66,4 +66,8 @@ This section lists the data stores, development tools, services, platforms, etc. - Jenkins - Github Actions - Gitlab CI/CD -- KFP \ No newline at end of file +- KFP + +## Browser + +MLRun runs on Chrome and Firefox. \ No newline at end of file diff --git a/docs/feature-store/using-spark-engine.md b/docs/feature-store/using-spark-engine.md index 0ce06d1860..ae19d51b45 100644 --- a/docs/feature-store/using-spark-engine.md +++ b/docs/feature-store/using-spark-engine.md @@ -1,3 +1,4 @@ +(ingest-features-spark)= # Ingest features with Spark The feature store supports using Spark for ingesting, transforming, and writing results to data targets. When diff --git a/docs/index.md b/docs/index.md index d114578e38..4a120ec422 100644 --- a/docs/index.md +++ b/docs/index.md @@ -5,9 +5,10 @@ MLRun is an open MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications. MLRun significantly reduces engineering efforts, time to production, and computation resources. With MLRun, you can choose any IDE on your local machine or on the cloud. MLRun breaks the silos between data, ML, software, and DevOps/MLOps teams, enabling collaboration and fast continuous improvements. -Get started with MLRun **{ref}`Tutorials and examples `**, **{ref}`Installation and setup guide `**, or read about **{ref}`architecture`**. +Get started with MLRun **{ref}`Tutorials and examples `**, **{ref}`Installation and setup guide `**, -This page explains how MLRun addresses the [**MLOps tasks**](#mlops-tasks) and presents the [**MLRun core components**](#core-components). + +This page explains how MLRun addresses the [**MLOps tasks**](#mlops-tasks), and presents the [**MLRun core components**](#core-components). See the supported data stores, development tools, services, platforms, etc., supported by MLRun's open architecture in **{ref}`ecosystem`**. @@ -49,7 +50,6 @@ See the supported data stores, development tools, services, platforms, etc., sup ``` ```` -````` The [**MLOps development workflow**](./mlops-dev-flow.html) section describes the different tasks and stages in detail. MLRun can be used to automate and orchestrate all the different tasks or just specific tasks (and integrate them with what you have already deployed). 
@@ -79,7 +79,7 @@ In addition, the MLRun [**Feature store**](./feature-store/feature-store.html) a {octicon}`mortar-board` **Docs:** {bdg-link-info}`Feature store <./feature-store/feature-store.html>` -{bdg-link-info}`Data & artifacts <./concepts/data-feature-store.html>` +{bdg-link-info}`Data & artifacts <./concepts/data.html>` , {octicon}`code-square` **Tutorials:** {bdg-link-primary}`quick start <./tutorial/01-mlrun-basics.html>` {bdg-link-primary}`Feature store <./feature-store/basic-demo.html>` @@ -106,7 +106,7 @@ MLRun rapidly deploys and manages production-grade real-time or batch applicatio {octicon}`mortar-board` **Docs:** {bdg-link-info}`Realtime pipelines <./serving/serving-graph.html>` -{bdg-link-info}`Batch inference <./concepts/TBD.html>` +{bdg-link-info}`Batch inference <./deployment/batch_inference.html>` , {octicon}`code-square` **Tutorials:** {bdg-link-primary}`Realtime serving <./tutorial/03-model-serving.html>` {bdg-link-primary}`Batch inference <./tutorial/07-batch-infer.html>` @@ -124,7 +124,7 @@ Observability is built into the different MLRun objects (data, functions, jobs, , {octicon}`code-square` **Tutorials:** {bdg-link-primary}`Model monitoring & drift detection <./tutorial/05-model-monitoring.html>` - +````` ## MLRun core components diff --git a/docs/monitoring/initial-setup-configuration.ipynb b/docs/monitoring/initial-setup-configuration.ipynb index 1a23111664..b25fd9dcab 100644 --- a/docs/monitoring/initial-setup-configuration.ipynb +++ b/docs/monitoring/initial-setup-configuration.ipynb @@ -8,6 +8,7 @@ } }, "source": [ + "(enable-model-monitoring)=\n", "# Enable model monitoring\n", "\n", "```{note}\n", @@ -16,17 +17,28 @@ "\n", "To see tracking results, model monitoring needs to be enabled in each model.\n", "\n", - "To enable model monitoring, include `serving_fn.set_tracking()` in the model server.\n", - "\n", "To utilize drift measurement, supply the train set in the training step.\n", "\n", "**In this section**\n", + "- [Enabling model monitoring](#enabling-model-monitoring)\n", "- [Model monitoring demo](#model-monitoring-demo)\n", " - [Deploy model servers](#deploy-model-servers)\n", " - [Simulating requests](#simulating-requests)\n", "\n", + "## Enabling model monitoring\n", + "\n", + "Model activities can be tracked into a real-time stream and time-series DB. The monitoring data\n", + "is used to create real-time dashboards and track model accuracy and drift. \n", + "To set the tracking stream options, specify the following function spec attributes:\n", + " \n", + " `fn.set_tracking(stream_path, batch, sample)`\n", + " \n", + "- **stream_path** — the v3io stream path (e.g. `v3io:///users/..`)\n", + "- **sample** — optional, sample every N requests\n", + "- **batch** — optional, send micro-batches every N requests\n", + " \n", "## Model monitoring demo\n", - "Use the following code blocks to test and explore model monitoring." + "Use the following code to test and explore model monitoring." 
] }, { diff --git a/docs/monitoring/model-monitoring-deployment.ipynb b/docs/monitoring/model-monitoring-deployment.ipynb index 6547f1d4e8..9c45303409 100644 --- a/docs/monitoring/model-monitoring-deployment.ipynb +++ b/docs/monitoring/model-monitoring-deployment.ipynb @@ -80,8 +80,8 @@ "* **Model** — user defined name for the model\n", "* **Labels** — user configurable tags that are searchable\n", "* **Uptime** — first request for production data\n", - "* **Last Prediction **— most recent request for production data\n", - "* **Error Count **— includes prediction process errors such as operational issues (For example, a function in a failed state), as well as data processing errors\n", + "* **Last Prediction** — most recent request for production data\n", + "* **Error Count** — includes prediction process errors such as operational issues (For example, a function in a failed state), as well as data processing errors\n", "(For example, invalid timestamps, request ids, type mismatches etc.)\n", "* **Drift** — indication of drift status (no drift (green), possible drift (yellow), drift detected (red))\n", "* **Accuracy** — a numeric value representing the accuracy of model predictions (N/A)\n", diff --git a/docs/runtimes/images.md b/docs/runtimes/images.md index f45bb5c0b2..74f862c923 100644 --- a/docs/runtimes/images.md +++ b/docs/runtimes/images.md @@ -26,5 +26,5 @@ These characteristics are great when you’re working in a POC or development en ### Working with images in production For production you should create your own images to ensure that the image is fixed. -- Pin the image tag, e.g. `image="mlrun/mlrun:1.2.0"`. This maintains the image tag at 1.1.0 even when the client is upgraded. Otherwise, an upgrade of the client would also upgrade the image. (If you specify an external (not MLRun images) docker image, like python, the result is the docker/k8s default behavior, which defaults to `latest` when the tag is not provided.) +- Pin the image tag, e.g. `image="mlrun/mlrun:1.2.0"`. This maintains the image tag at the version you specified, even when the client is upgraded. Otherwise, an upgrade of the client would also upgrade the image. (If you specify an external (not MLRun images) docker image, like python, the result is the docker/k8s default behavior, which defaults to `latest` when the tag is not provided.) - Pin the versions of requirements, again to avoid breakages, e.g. `pandas==1.4.0`. (If you only specify the package name, e.g. pandas, then pip/conda (python's package managers) just pick up the latest version.)
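As a concrete illustration of the pinning advice in `docs/runtimes/images.md`, here is a minimal, hedged sketch; the function name, source file, and package pins are placeholders:

```python
import mlrun

# Pin both the MLRun image tag and the Python requirements so that a client upgrade
# or a new package release cannot silently change what the function runs on.
fn = mlrun.code_to_function(
    name="trainer",                  # placeholder name
    filename="trainer.py",           # placeholder source file
    kind="job",
    image="mlrun/mlrun:1.2.0",       # pinned image tag
    requirements=["pandas==1.4.0"],  # pinned package versions
)
```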