and move some info to the UG index page rel. #238 (comment)
Explanation of changes so far 👇🏼
## Codification: the MLEM way

Please choose from the navigation sidebar to the left, or click the `Next`
button below ↘
Saving machine learning models to files or loading them back into Python objects
may seem like a simple task at first. For example, the `pickle` and `torch`
Moved all this to the User Guide
... here (edited a little 🙂)
p.s. we can expand on this page, probably summarizing the usage basics and removing some details from the sub-pages (separate task)
what exactly do you suggest moving?
@aguschin here I was just explaining what I did (moving the codification background info to the UG).
If you referred to "removing some details from the sub-pages (separate task)", to start with, the info from Basic concepts and Working with models could be put in the UG index (ideally rewritten in a way that explains why you should learn this paradigm, similar to what we're doing in DVC).
But again, that would be a separate task (not for this PR). And I can help.
I suggest moving this to basic-concepts (see here)
Hmm. I don't love that page in general (Basic Concepts). What's its purpose? It only covers MLEM Object, which is a cornerstone, so it can probably also be absorbed into the guide index page (or somewhere else).
Also, I rewrote most of the text here @omesser , so it's not the same version as what you have in https://github.com/iterative/mlem.ai/pull/281/files#diff-9e5ea1c7881a8d2248846bcfbd5ed700fb26749b25a3fc0fcb335e7c82489c17
Restyle start: clarify existing content
and related updates to the UG
## Making requests

You also can create MLEM Client for your deployment to make some requests:
You also can create a MLEM client for your deployment from Python code
I linked to this section from the Docker deployment section of GS (which is the main/happy path). However, the GS mentions `curl` and `mlem apply-remote`, which are not covered here.
Should this section of the guide be more comprehensive?
However the GS mentions curl and mlem apply-remote, which are not covered here
Mentioning and giving a link to doc/user-guide/serving/index.md should be enough, I think.
OK, I see. The structure of the UG is a bit confusing then, because we would link from a section about deployment to a guide about serving when there's a specific guide about deployment... ⏳
I updated the link, but the guide and/or terminology is def. still quite confusing.
- We should clarify somehow (separate task)
p.s. it may be a UX issue (product question) but hopefully we can at least help with good docs until that's addressed.
➕ Agree we should work on terminology 🙏
I think @jorgeorpinel nailed an important UX gap here.
I'm gonna guess this is completely clear for whoever works on mlem because of how it is built and behaves, but in the "wild", serving and deploying of models are often interchangeable, and it can cause a lot of confusion.
Definitely a separate discussion - a change in the fundamental abstraction/terminology will cause major changes in docs
Servers automatically create endpoints from model methods with payload schemas
corresponding to serialized dataset types.
The server is now running and listening for requests on the URL shown
above.Endpoints are created automatically from model methods (using the
`sample_data` [provided earlier](#saving-your-model)). You can open the
## Get Started with MLEM
-->

<admon type="info">
This looks strange right at the top of the tutorial https://mlem-ai-start-edits-ojgyvycqih.herokuapp.com/doc/get-started?tab=Heroku-app. I'd argue we don't need this admon
Well, it's 3rd-party tool documentation, so it should be separated somehow IMO. Would you prefer a hidden `<details>` section? I wouldn't hide it, since the steps are required to follow the tutorial. In my current proposal, it looks like this:
p.s. if the admonition seems too loud to you, maybe we can use an old-school block quote (like this paragraph). Lmk
1. Model methods: Like `predict` and `predict_proba`
2. Input data schema: Describes the dataframe (Iris dataset)
3. Python Requirements: `sklearn` and `pandas` in this case, with the specific
   versions used to train the model

<admon type='tip'>
<admon type="info">
💡 looked more appealing to me than ℹ️. Just complaining; if that's our standard for all docs websites, please ignore.
Well, is this a tip or is it related info? The way each one looks could be changed, but that's a design question IMO. No strong opinion though, feel free to revert this if you prefer.
#### Why do it the MLEM way ?

Saving models to files or loading them back into python objects may seem like a
deceptively simple task at first. For example, `pickle` and `torch` libraries
can serialize/deserialize model objects to/from files. However, MLEM adds some
"special sauce" by inspecting the objects and serializing their metadata into
`.mlem` files and intelligently using this later on. This metadata is necessary
to reliably enable actions like packaging and serving of different models types
down in various ways. MLEM allows us to automate a lot of the pain points we
would hit later on in our ML workflow by codifying and managing this metadata
about our models (or other objects) for us.

</details>
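For illustration, the "codification" idea in the quoted text can be sketched with a stdlib-only toy. This is NOT MLEM's real implementation; `codify`, `ToyModel`, and the metadata keys are hypothetical names invented for this sketch. It only shows the gist: inspect the object and persist its interface as metadata.

```python
import inspect
import json

def codify(model, sample_columns=None):
    """Toy illustration of codification (not MLEM's actual code):
    inspect a model object and collect the kind of interface metadata
    that a .mlem file records."""
    methods = [
        name
        for name, _ in inspect.getmembers(model, callable)
        if name in ("predict", "predict_proba")  # the interface we care about
    ]
    meta = {
        "object_type": type(model).__name__,
        "methods": methods,
        # a real tool would pin exact package versions here
        "requirements": [type(model).__module__.split(".")[0]],
        "input_schema": {"columns": sample_columns} if sample_columns else None,
    }
    return json.dumps(meta, indent=2)

class ToyModel:
    """Stand-in for a trained estimator (hypothetical)."""
    def predict(self, rows):
        return [0 for _ in rows]
    def predict_proba(self, rows):
        return [[0.5, 0.5] for _ in rows]

print(codify(ToyModel(), ["sepal_length", "sepal_width"]))
```

With this metadata on disk, later steps (packaging, serving) can be automated without re-importing or re-inspecting the original training code.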
My take: I'd keep this section. It's a bit too abstract for my taste, so I can suggest putting an example here. Saying something like:
- we need to know there is predict and predict_proba to build a REST API interface. Same for input data schema.
- we need to know it's pandas and numpy to pip install requirements in containers on Heroku/AWS
I'd keep it - with examples it can be easier to understand why MLEM is powerful.
WDYT?
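The two examples above can be sketched in a few lines. This is a hypothetical illustration, not MLEM code: `plan_deployment` and the metadata dict shape are invented for this sketch, and the package versions are made up.

```python
def plan_deployment(meta):
    """Toy sketch of the two examples: codified metadata is enough to
    (1) lay out REST endpoints from model methods and input schema, and
    (2) derive the packages to pip-install in a Heroku/AWS container."""
    endpoints = {
        f"/{name}": {"http_method": "POST", "request_body": meta["input_schema"]}
        for name in meta["methods"]
    }
    pip_install = "pip install " + " ".join(meta["requirements"])
    return endpoints, pip_install

meta = {
    "methods": ["predict", "predict_proba"],
    "input_schema": {"columns": ["sepal_length", "sepal_width"]},
    "requirements": ["pandas==1.4.2", "scikit-learn==1.0.2"],  # made-up pins
}
endpoints, cmd = plan_deployment(meta)
print(sorted(endpoints))  # ['/predict', '/predict_proba']
print(cmd)                # pip install pandas==1.4.2 scikit-learn==1.0.2
```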
I'd keep this section
If we want to present the GS along these lines (give more motivation) then we should incorporate the ideas into the main text. E.g. instead of "Before we explore everything that MLEM can do, you need to save a machine learning model with MLEM." (here) we can use an explanation similar to the text above.
But if we put it in a hidden section that comes after another hidden section, to me that signals that we already decided this is not really important for the GS -- and I would agree it's best to keep the GS actionable (not theoretical). Thus I moved it to a longer-format doc (that's linked several times from this page).
putting an example
If I understand correctly, this example would be disconnected from the main story in the GS, which would complicate things for the reader. We could def. add that example in the UG though, as another way to expand on it.
I suggest moving it away from the GS in my recent GS PR here: #281
`user-guide/basic-concepts` seems to make sense to me
Force-pushed from 135588c to 37ac38e
Automated style fixes for #265, created by [Restyled][]. The following restylers [made fixes](https://restyled.io/gh/iterative/repos/mlem.ai/jobs/2680383):

- [prettier](https://prettier.io/docs/en/)

To incorporate these changes, merge this Pull Request into the original. We recommend using the Squash or Rebase strategies.

**NOTE**: As work continues on the original Pull Request, this process will re-run and update (force-push) this Pull Request with updated style fixes as necessary. If the style is fixed manually at any point (i.e. this process finds no fixes to make), this Pull Request will be closed automatically. Sorry if this was unexpected. To disable it, see our [documentation][].

[restyled]: https://restyled.io
[documentation]: https://github.com/restyled-io/restyled.io/wiki/Disabling-Restyled
Force-pushed from 6a96a83 to d9c08fd
General comment - I was working on GS too for the past few days (see #281) and the get-started changes are conflicting :)
Nice! Mostly LGTM; some comments here and there. Still some uncertainty about serving/deploying that we need to live with until it's clarified.
Servers automatically create endpoints from model methods with payload schemas
corresponding to serialized dataset types.
The server is now running and listening for requests on the URL shown
above.Endpoints are created automatically from model methods (using the
above.Endpoints are created automatically from model methods (using the
above. Endpoints are created automatically from model methods (using the
Note, that serving the model requires you to have the correct packages to be
installed. You can check out how to create a `venv` with right packages with
MLEM, or how to serve the model in a
This requires the correct packages to be installed. You can check out how to
This requires the correct packages to be installed. You can check out how to
This requires the correct packages to be installed for the server to serve the model.
(this is more for @aguschin, since I'm questioning the original contents) - The connection to `venv` and the docker guide is not obvious to me here. Maybe instead it's better to mention that the needed requirements are inferred from the model metadata extracted when saving it, and link to model codification. wdyt @aguschin?
yes, that's a good idea.
corresponding to serialized dataset types.
The server is now running and listening for requests on the URL shown
above.Endpoints are created automatically from model methods (using the
`sample_data` [provided earlier](#saving-your-model)). You can open the
`sample_data` [provided earlier](#saving-your-model)). You can open the
`sample_data` provided when [saving the model](#saving-your-model)) to infer the payload schema. You can open the
Each server implementation also has its client counterpart (e.g. `HTTPClient`
for FastAPI). Clients can be used to make requests. Since a server also exposes
the model interface description, the client will know what methods are available
and handle serialization for you. You can use them via `mlem apply-remote`:
Clients can be used to make requests
Mentioning the servers is important here (maybe even "their corresponding servers").
Same for serialization and deserialization - not sure why this was cut out. There are two directions here: serializing requests made from the client, and deserializing the result. Mentioning both is more accurate and also exemplifies the value of using MLEM to instantiate a client instead of implementing one yourself.
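The two directions can be sketched with a stdlib-only toy. This is an illustration of the idea, not MLEM's `HTTPClient`: `serialize_request`, `deserialize_response`, and the payload shape are hypothetical names invented here.

```python
import json

def serialize_request(columns, rows):
    """Toy client-side serialization (not MLEM's HTTPClient):
    pack tabular input into a JSON payload for a model server."""
    return json.dumps({"data": {"columns": columns, "values": rows}})

def deserialize_response(body):
    """Toy client-side deserialization: unpack the server's JSON
    reply back into Python objects."""
    return json.loads(body)["predictions"]

payload = serialize_request(["sepal_length", "sepal_width"], [[5.1, 3.5]])
# a real client would POST `payload` to the server's endpoint;
# here we fake the reply to show the second direction
fake_reply = json.dumps({"predictions": [0]})
print(deserialize_response(fake_reply))  # [0]
```

A generated client handles both directions for you, which is the value omesser points at: you don't hand-write this glue per model.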
```cli
$ mlem apply-remote http test_x.csv --host="0.0.0.0" --port=8080 --json
$ mlem apply-remote http test_x.csv --json \
  --host="0.0.0.0" --port=8080
```
aesthetics/ocd - if not one line, break lines fully for CLI options:

```cli
$ mlem apply-remote http test_x.csv \
  --json \
  --host="0.0.0.0" \
  --port=8080
```
Not sure whether I should continue to work on this per #281 (comment).
@jorgeorpinel, may I kindly ask you to resolve conflicts and apply your changes on top of what was merged? I'm afraid I'll lose your intentions if I do that myself.
@jorgeorpinel - might be easier for you to split 🙏 :
No activity in a long time, |